To use perturbation, you must have:
- R Keys in the unit records.
- A perturbation table (PTable) file for each dataset. This file is in CSV format and has the extension .pert.
The Perturbation module has the following properties you can configure.
|Set this to |
|Set this to |
|An integer to be used in deriving the lookup values for the perturbation table. Its value can be 10 or below. The default value is 10.|
Use this option to perturb other results:
|The size of the perturbation table. The default value is 30.|
|An integer to be used as the modulo base when adding R Keys. Must not exceed 2^32 (4294967296). The default value is 4294967296.|
Set this to
|A message to be displayed to users in the client.|
The location of the perturbation file for this dataset.
If you do not set a value for the PTable property, then by default SuperSERVER expects this file to be saved in the same location as the SXV4 file, but with the extension .pert instead of .sxv4. For example, if the SXV4 file is C:\ProgramData\STR\SuperSERVER SA\databases\RetailBanking.sxv4 then the perturbation file is expected to be located at C:\ProgramData\STR\SuperSERVER SA\databases\RetailBanking.pert
If you want to use a different location for the file, then you can set the value of PTable to the location of the .pert file. You can either use an absolute path or a relative path (relative to the SuperSERVER program data directory, which is C:\ProgramData\STR\SuperSERVER SA if you installed to the default location).
Any backslashes in the path will need to be escaped with an additional backslash (forward slashes can also be used but do not need to be escaped). For example:
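As a rough illustration of the default location and escaping rules described above, the following Python sketch derives the expected .pert path from an SXV4 path and shows the escaped and forward-slash forms. The helper names are hypothetical and this is not SuperADMIN syntax; it only shows what the resulting path strings look like.

```python
from pathlib import PureWindowsPath


def default_pert_path(sxv4_path: str) -> PureWindowsPath:
    """Same folder and base name as the SXV4 file, with a .pert extension."""
    return PureWindowsPath(sxv4_path).with_suffix(".pert")


def escape_backslashes(path: PureWindowsPath) -> str:
    """Double every backslash so the path can be typed as a property value."""
    return str(path).replace("\\", "\\\\")


sxv4 = r"C:\ProgramData\STR\SuperSERVER SA\databases\RetailBanking.sxv4"
pert = default_pert_path(sxv4)

# Backslash form, with each backslash escaped by an additional backslash.
print(escape_backslashes(pert))

# Forward-slash form, which needs no escaping.
print(pert.as_posix())
```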
Whether to propagate zeros across all levels (fact tables) in a given table.
Without this setting, perturbation is not coordinated across levels, so an attacker could potentially determine that a zero estimate was non-zero before perturbation was applied. When this setting is enabled, the perturbation method first perturbs all levels as usual. Then, if the estimate for any level in the table is found to be zero, it sets the corresponding estimates for all other levels in the table to zero.
This setting also coordinates perturbation with measures: if a fact table count is perturbed to 0, then the measures for that fact table will also be perturbed to 0.
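The coordination described above can be pictured with a small Python sketch. It is only an illustration under simplifying assumptions: a single cell is represented as a dictionary of already-perturbed counts per level, measures are a nested dictionary keyed by level, and the function name and level names are hypothetical. It does not reproduce SuperSERVER's actual implementation.

```python
def propagate_zeros(counts, measures):
    """Coordinate zeros for one cell across levels (fact tables).

    counts:   {level: perturbed count} for the same cell at each level.
    measures: {level: {measure name: perturbed value}} for the same cell.
    Returns new dictionaries; the inputs are not modified.
    """
    # If any level was perturbed to zero, every level is set to zero.
    if any(value == 0 for value in counts.values()):
        new_counts = {level: 0 for level in counts}
    else:
        new_counts = dict(counts)

    # A measure is set to zero when the count of its fact table is zero.
    new_measures = {}
    for level, values in measures.items():
        if new_counts.get(level) == 0:
            new_measures[level] = {name: 0 for name in values}
        else:
            new_measures[level] = dict(values)
    return new_counts, new_measures


# Hypothetical cell: the Customer level was perturbed to zero, Account was not.
counts = {"Customer": 0, "Account": 3}
measures = {"Account": {"Balance": 1200.0}}
coordinated_counts, coordinated_measures = propagate_zeros(counts, measures)
print(coordinated_counts)
print(coordinated_measures)
```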
To apply zero propagation, use the following command:
The available settings are:
The propagation threshold. Use the threshold to control whether a cell can be set to zero by zero propagation from a related level/record count:
If the record count of a cell is less than or equal to this threshold, then it can be set to zero by zero propagation.
For example, the following command ensures that cells with record counts of 5 or less can be set to zero:
|The location of the configuration files for quantile perturbation. By default, these files should be in the same location as the SXV4 file, but you can use these properties to set an alternative location. See Quantiles and Ranges - Perturbation for more details.|
Apply the Plugin
Log in to SuperADMIN and create a new method:
This example sets the ID of the new method to perturbation_method. This ID will be used in all the following examples, although you can replace this with your preferred ID if you wish.
Add the Perturbation Data Control plugin to the method:
This example sets the ID of the plugin within this method to perturbation. You can replace this with your preferred ID.
Perturbation at the end of this command is the library name for the perturbation module. This is case sensitive and must be specified exactly as shown here.
Set the plugin properties:
Assign the method to a dataset (in this example we are assigning the method to a dataset with the ID
You can review the method details using the command cat <dataset_id> methods details <method_id>:
Perturbation with Weighted Datasets
If you have weighted datasets, then you must apply an additional data control module, Average_cellwgt, to your perturbation methods. This module effectively scales up the perturbed amount to account for the weighting.
The average cell weighting module calculates the unweighted cell value, applies perturbation to this, and then multiplies the result by the average weight of the cell (calculated as the weighted value divided by the unweighted value):
This ensures that the effect of perturbation is scaled up appropriately to account for the weighting.
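The calculation described above can be sketched in a few lines of Python. The function and parameter names are hypothetical, the perturbation step is passed in as a placeholder function, and returning zero for an empty cell is an assumption made here to avoid division by zero; the actual module may handle that case differently.

```python
from typing import Callable


def perturb_weighted_cell(
    weighted: float,
    unweighted: int,
    perturb: Callable[[int], int],
) -> float:
    """Perturb the unweighted count, then scale by the average cell weight."""
    if unweighted == 0:
        # Assumption: an empty cell has no average weight, so report zero.
        return 0.0
    average_weight = weighted / unweighted
    return perturb(unweighted) * average_weight


def add_two(count: int) -> int:
    """Placeholder perturbation used only for this illustration."""
    return count + 2


# Unweighted count 10, weighted value 250, so the average weight is 25.
# The placeholder perturbs 10 to 12, giving 12 * 25 = 300.
result = perturb_weighted_cell(250.0, 10, add_two)
print(result)
```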
When using weighted datasets:
- The average cell weight module must be added to the method after the perturbation module, as it uses the result of the perturbation as part of its calculation.
- The FREQ property must be set to
The following is a complete example of perturbation with weighted datasets: