This is the documentation for SuperSTAR 9.8



The SparsityCheck module prevents the release of tables that contain a high proportion of cells with very low values (0, 1, or 2). It applies to interior cells only (totals are not included). If SparsityCheck is enabled, each cross-tabulation result is checked to verify that the table is not too sparse for release.

If you need to use SparsityCheck in your deployment, please contact Space-Time Research support (support@spacetimeresearch.com) for advice on the appropriate threshold settings for your processing needs.

To configure the module, you need to define the sparsity check thresholds, which must be named ThresholdA and ThresholdB.

The names are case-sensitive. The default values are:

  • ThresholdA - 0.25
  • ThresholdB - 0.50

The module works as follows:

Assuming that:

  • c is the number of interior cells in the table.
  • c0 is the number of zero interior cells.
  • c1 is the number of interior cells of value 1.
  • c2 is the number of interior cells of value 2.

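Then the table will not be released if any of the following conditions is true:

  • c - c0 = 0 (the table has no non-zero interior cells; this condition is checked first to avoid a divide-by-zero error in the ratios below).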
  • c1/(c-c0) > ThresholdA (the ratio of cells with value 1, to the total number of cells with non-zero value).
  • (c1+c2)/(c-c0) > ThresholdB (the ratio of cells with value 1 or 2, to the total number of cells with non-zero value).
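
The rules above can be sketched in a few lines of Python. This is an illustration only, not the module's actual implementation: the function name is_too_sparse, its parameters, and the sample data are hypothetical, and it assumes the interior cells are available as a flat list of integer counts with totals already excluded.

```python
from typing import Sequence


def is_too_sparse(
    cells: Sequence[int],
    threshold_a: float = 0.25,  # documented default for ThresholdA
    threshold_b: float = 0.50,  # documented default for ThresholdB
) -> bool:
    """Return True if the table would be withheld under the rules above.

    `cells` is assumed to hold interior cell values only (no totals).
    """
    c = len(cells)                          # number of interior cells
    c0 = sum(1 for v in cells if v == 0)    # cells with value 0
    c1 = sum(1 for v in cells if v == 1)    # cells with value 1
    c2 = sum(1 for v in cells if v == 2)    # cells with value 2

    non_zero = c - c0
    if non_zero == 0:
        # Empty table: checked first so the divisions below are safe.
        return True
    if c1 / non_zero > threshold_a:
        return True
    if (c1 + c2) / non_zero > threshold_b:
        return True
    return False


# Hypothetical interior cells: c = 6, c0 = 1, c1 = 2, c2 = 1
sample = [1, 1, 2, 0, 6, 7]
withheld_default = is_too_sparse(sample)             # 2/5 = 0.4 > 0.25
withheld_relaxed = is_too_sparse(sample, 0.5, 0.75)  # 0.4 <= 0.5 and 3/5 = 0.6 <= 0.75
```

With the default thresholds the sample table is withheld because the share of cells with value 1 (0.4) exceeds ThresholdA. With the looser thresholds 0.5 and 0.75, neither ratio exceeds its limit, so the same table would pass.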

Apply the Plugin to a Dataset

  1. Log in to SuperADMIN and create a new method:

    > method addmethod sparsity-method
  2. Set the FREQ common property to true (recommended; this will configure SuperSERVER to base the calculation on the contribution count rather than the cross tabulation results).

    > method sparsity-method common addproperty FREQ "true"
  3. Add the Data Control plugin to the method (the name of the plugin, SparsityCheck, is case sensitive):

    > method sparsity-method adddcplugin sparsitycheck SparsityCheck
  4. Set the plugin properties:

    > method sparsity-method sparsitycheck addproperty ThresholdA "0.5"
    > method sparsity-method sparsitycheck addproperty ThresholdB "0.75"
    > method sparsity-method sparsitycheck addproperty Message "Table is too sparse"
    > method sparsity-method sparsitycheck addproperty ConfidentialityModule "true"
  5. Assign the method to a dataset (in this example, we are assigning the method to a dataset with the ID bank):

    > cat bank addmethod sparsity-method

    You can review the method details using the command cat <dataset_id> methods details <method_id>:

    > cat bank methods details sparsity-method
    [ Method : sparsity-method (id:sparsity-method) (type:mandatory) ]
        [ Common ]
            [ FREQ : true ]
        [ DCPlugin : SparsityCheck (id:sparsitycheck) (priority:1) ]
            [ ThresholdA : 0.5 ]
            [ ThresholdB : 0.75 ]
            [ Message : Table is too sparse ]
            [ ConfidentialityModule : true ]