This is the documentation for SuperSTAR 9.8

SuperSTAR 9.9 is now available.


Prerequisites

To use perturbation, you must have:

  • R Keys in the unit records.
  • A perturbation table (PTable) file for each dataset. This file is in CSV format with the extension .pert

Module Properties

The Perturbation module has the following properties you can configure.

Property
Description
RKEY
Set this to true to use the R Keys in the unit records.
FREQ
Set this to true to perturb cell values based on the contribution count rather than the cross-tabulation (cell) value.
SmallN
An integer used to derive the lookup values for the perturbation table. The value must be 10 or less. The default value is 10.
RULESET

Use this option to perturb other results:

  • Set this to the value "PERT()" (or omit the RULESET property altogether) for the default behaviour: perturbation will apply to the cross tabulation results.
  • Use the following optional parameters of PERT() to perturb another result set:
    • The first parameter determines whether to perturb measures. Set this to true (perturb measures) or false (do not perturb measures). This would usually be set to false as perturbation is intended to perturb counts.
    • The second parameter identifies the destination for the perturbation calculation. If not specified this will be the cross tabulation results.
    • The third parameter identifies the source of the perturbation calculation. If not specified this will use the same value as the destination.
  • You can specify multiple RULESET options: separate each one with a | character.

For example:

method "Rule" perturbation addproperty RULESET "PERT()"
The default behaviour. Perturb the cross tabulation results.
method "Rule" perturbation addproperty RULESET "PERT(true)"
Perturb the measures in the cross tabulation results.
method "Rule" perturbation addproperty RULESET "PERT(true,RECORD_COUNT)"
Perturb the record count (the output from the record count plugin). See the example below for more details.
method "Rule" perturbation addproperty RULESET "PERT()|PERT(true,RECORD_COUNT)"
Perturb the cross tabulation results and the record count (see the example below).
PTableSize
The size of the perturbation table. The default value is 30.
BigN
An integer to be used as the modulo base when adding R Keys. Must not exceed 2^32 (4294967296). The default value is 4294967296.
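
To illustrate what a modulo base means here, the following Python sketch adds a set of R Keys and wraps the total at BigN. The function name combined_rkey and the sample key values are hypothetical, and the sketch assumes the keys being added are those of the records contributing to a cell. It shows only the modular arithmetic, not how SuperSERVER uses the result.

```python
# Hypothetical illustration of "modulo base when adding R Keys".
BIG_N = 4294967296  # default BigN (2^32), taken from the property description


def combined_rkey(rkeys, big_n=BIG_N):
    """Add the R Keys together, wrapping the total at big_n."""
    return sum(rkeys) % big_n


# Sample keys are made up for this example.
print(combined_rkey([4294967295, 2]))  # total passes 2^32 and wraps to 1
```

Because the sum wraps at BigN, the combined value always stays in the range 0 to BigN - 1, however many keys are added.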
ConfidentialityModule

Set this to true to indicate to SuperSERVER that this module applies confidentiality rules. When this is set to true, SuperSERVER blocks access to the Record View feature.

Message
A message to be displayed to users in the client.
PTable

The location of the perturbation file for this dataset.

If you do not set a value for the PTable property, then by default SuperSERVER expects this file to be saved in the same location as the SXV4 file, but with the extension .pert instead of .sxv4. For example, if the SXV4 file is C:\ProgramData\STR\SuperSERVER SA\databases\RetailBanking.sxv4 then the perturbation file is expected to be located at C:\ProgramData\STR\SuperSERVER SA\databases\RetailBanking.pert
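
The default naming rule can be sketched in Python as follows. The helper name default_pert_path is hypothetical; the sketch only swaps the file extension, as described above, and does not check that either file exists.

```python
from pathlib import PureWindowsPath


def default_pert_path(sxv4_path):
    """Swap the .sxv4 extension for .pert, keeping the folder and file name."""
    return str(PureWindowsPath(sxv4_path).with_suffix(".pert"))


sxv4 = r"C:\ProgramData\STR\SuperSERVER SA\databases\RetailBanking.sxv4"
print(default_pert_path(sxv4))
```

PureWindowsPath is used so that the sketch handles Windows-style paths the same way on any operating system.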

If you want to use a different location for the file, then you can set the value of PTable to the location of the .pert file. You can either use an absolute path or a relative path (relative to the SuperSERVER program data directory, which is C:\ProgramData\STR\SuperSERVER SA if you installed to the default location).

Any backslashes in the path will need to be escaped with an additional backslash (forward slashes can also be used but do not need to be escaped). For example:

method "Rule" perturbation addproperty PTable "C:\\ProgramData\\STR\\SuperSERVER SA\\databases\\my-ptable.pert"
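
If you generate this command from a script, the escaping can be done programmatically. The following Python sketch doubles each backslash and builds the same command as the example above; the helper name escape_path is hypothetical.

```python
def escape_path(path):
    """Double every backslash so the path survives command parsing."""
    return path.replace("\\", "\\\\")


raw = r"C:\ProgramData\STR\SuperSERVER SA\databases\my-ptable.pert"
command = f'method "Rule" perturbation addproperty PTable "{escape_path(raw)}"'
print(command)
```

Paths written with forward slashes pass through unchanged, since they contain nothing to escape.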

If the contents of the perturbation file are modified in any way, you must restart SuperSERVER in order for the change to take effect. This is for performance reasons (SuperSERVER caches the perturbation file so that it does not have to reload and parse it on every tabulation).

PropagateZeroes

Whether to propagate zeros across all levels (fact tables) in a given table.

Without this setting, perturbation is not coordinated across levels, so an attacker could exploit this to determine that a zero estimate was non-zero before perturbation was applied. When this setting is enabled, the perturbation method first perturbs all levels as usual. Then, if the estimate for any level in the table is found to be zero, it sets the corresponding estimates for the other levels in the table to zero.

This setting also coordinates perturbation with measures: if a fact table count is perturbed to 0, then the measures for that fact table will also be perturbed to 0.

To apply zero propagation, use the following command:

method <method_id> perturbation addproperty PropagateZeroes {"All"|"Ancestor"|"Same"|"None"}

The available settings are:

  • Ancestor: Propagation only happens one way: from ancestor to descendant fact tables.
  • All: Propagation happens both ways to all fact tables.
  • Same: Propagation happens only from the count to the measures within the same fact table.
  • None: No propagation happens (default).

For example:

method "Rule" perturbation addproperty PropagateZeroes "Ancestor"
PropagateZeroesThreshold

The propagation threshold. Use the threshold to control whether a cell can be set to zero by zero propagation from a related level/record count:

method <method_id> perturbation addproperty PropagateZeroesThreshold "<number>"

If the record count of a cell is less than or equal to this threshold, then it can be set to zero by zero propagation.

For example, the following command ensures that cells with record counts of 5 or less can be set to zero:

method "Rule" perturbation addproperty PropagateZeroesThreshold "5"
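
The threshold test itself is a simple comparison, sketched below in Python. The function name can_be_zeroed and the sample record counts are hypothetical; the sketch covers only the "less than or equal to" rule described above, not the propagation itself.

```python
def can_be_zeroed(record_count, threshold):
    """Return True if zero propagation is allowed to set this cell to zero."""
    return record_count <= threshold


threshold = 5  # matches the PropagateZeroesThreshold "5" example
for count in (3, 5, 6):
    print(count, can_be_zeroed(count, threshold))
```

With a threshold of 5, cells with record counts of 3 and 5 are eligible, while a cell with a record count of 6 is not.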
QUANTILEVALIDATION
QUANTILEPTABLE
QUANTILECONFIG
The location of the configuration files for quantile perturbation. By default, these files should be in the same location as the SXV4 file, but you can use these properties to set an alternative location. See Quantiles and Ranges - Perturbation for more details.

Apply the Plugin

  1. Log in to SuperADMIN and create a new method:

    > method addmethod perturbation_method

    This example sets the ID of the new method to perturbation_method. This ID will be used in all the following examples, although you can replace this with your preferred ID if you wish.

  2. Add the Perturbation Data Control plugin to the method:

    > method perturbation_method adddcplugin perturbation Perturbation

    This example sets the ID of the plugin within this method to perturbation. You can replace this with your preferred ID.

    The Perturbation at the end of this command is the library name for the perturbation module. This is case sensitive and must be specified exactly as shown here.

  3. Set the plugin properties:

    > method perturbation_method perturbation addproperty RKEY "true"
    > method perturbation_method perturbation addproperty FREQ "true"
    > method perturbation_method perturbation addproperty "SmallN" "10"
    > method perturbation_method perturbation addproperty "PTableSize" "30"
    > method perturbation_method perturbation addproperty "BigN" "4294967296"
    > method perturbation_method perturbation addproperty ConfidentialityModule "true"
    > method perturbation_method perturbation addproperty Message "Data has been perturbed"
  4. Assign the method to a dataset (in this example we are assigning the method to a dataset with the ID bank):

    > cat bank addmethod perturbation_method

    You can review the method details using the command cat <dataset_id> methods details <method_id>:

    > cat bank methods details perturbation_method
    [ Method : perturbation_method (id:perturbation_method) (type:mandatory) ]
        [ Common ]
        [ DCPlugin : Perturbation (id:perturbation) (priority:1) ]
            [ RKEY : true ]
            [ FREQ : true ]
            [ SmallN : 10 ]
            [ PTableSize : 30 ]
            [ BigN : 4294967296 ]
            [ ConfidentialityModule : true ]
            [ Message : Data has been perturbed ]
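
If you need to apply the same properties to several methods, you might generate the addproperty commands from a script rather than typing them. The following Python sketch builds the commands from step 3; the helper name property_commands is hypothetical. It writes property names unquoted (as in the RKEY and FREQ lines above) and does not escape backslashes or quotes in values.

```python
def property_commands(method_id, plugin_id, properties):
    """Build one addproperty command per property, in insertion order."""
    return [
        f'method {method_id} {plugin_id} addproperty {name} "{value}"'
        for name, value in properties.items()
    ]


# Property values taken from step 3 above.
properties = {
    "RKEY": "true",
    "FREQ": "true",
    "SmallN": "10",
    "PTableSize": "30",
    "BigN": "4294967296",
    "ConfidentialityModule": "true",
    "Message": "Data has been perturbed",
}

commands = property_commands("perturbation_method", "perturbation", properties)
for line in commands:
    print(line)
```

The printed lines can then be pasted into SuperADMIN at the prompt.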

Perturbation with Weighted Datasets

If you have weighted datasets, then you must apply an additional data control module, Average_cellwgt, to your perturbation methods. This module effectively scales up the perturbed amount to account for the weighting.

How does this module work?

The average cell weighting module calculates the unweighted cell value, applies perturbation to this, and then multiplies the result by the average weight of the cell (calculated as the weighted value divided by the unweighted value):

(unweighted count + perturbation factor) * (weighted count / unweighted count)

This ensures that the effect of perturbation is scaled up appropriately to account for the weighting.
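
As a worked example of the formula above, the following Python sketch computes the result for a single cell. The function name and the sample numbers are hypothetical. The source does not describe how a cell with an unweighted count of zero is handled, so the sketch simply rejects that case.

```python
def weighted_perturbed_count(unweighted, weighted, perturbation):
    """Apply the perturbation to the unweighted count, then scale by the average weight."""
    if unweighted == 0:
        # Assumption: the average weight is undefined for an empty cell,
        # so this sketch raises an error rather than guessing a result.
        raise ValueError("average weight is undefined when the unweighted count is 0")
    average_weight = weighted / unweighted
    return (unweighted + perturbation) * average_weight


# 20 records representing a weighted count of 500, perturbed by +2:
# (20 + 2) * (500 / 20) = 22 * 25 = 550
print(weighted_perturbed_count(20, 500, 2))
```

In this example the perturbation of +2 on the unweighted count becomes a change of +50 on the weighted count, because each record carries an average weight of 25.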

When using weighted datasets:

  • The average cell weight module must be added to the method after the perturbation module, as it uses the result of the perturbation as part of its calculation.
  • The FREQ property must be set to true.

The following is a complete example of perturbation with weighted datasets:

method addmethod weighted_perturbation_example

method weighted_perturbation_example adddcplugin weighted_perturbation Perturbation
method weighted_perturbation_example weighted_perturbation addproperty RKEY "true"
method weighted_perturbation_example weighted_perturbation addproperty "SmallN" "10"
method weighted_perturbation_example weighted_perturbation addproperty "PTableSize" "30"
method weighted_perturbation_example weighted_perturbation addproperty "BigN" "4294967296"
method weighted_perturbation_example weighted_perturbation addproperty ConfidentialityModule "true"
method weighted_perturbation_example weighted_perturbation addproperty Message "Data has been perturbed"

method weighted_perturbation_example adddcplugin Average_cellwgt Average_cellwgt

method weighted_perturbation_example common addproperty FREQ "true"