Pseudonym

Description

Adds a new column with pseudonymised values for the selected column.

 

Example

Pseudonymising a column of dates of birth:

 


 

Inputs

One.

 

Options

Uncheck automatic new column name to set a name for the newly created column in New column name.

Set Column to the column you want to pseudonymise.

Set Prefix to the text to be prepended to each pseudonym. It can be left blank.

Set Start at to the number used for the first pseudonym.

Set Random seed to the seed value used by the pseudo-random number generator. This determines the order in which pseudonyms are assigned to values.
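
The way these options combine can be sketched in Python. This is only an illustration under stated assumptions, not the algorithm the transform actually uses: the function name `pseudonymise`, its parameters, and the use of Python's `random` module are all hypothetical, and the real transform's generator will produce a different ordering for the same seed. The padding with 0s follows the behaviour described under Notes.

```python
import random

def pseudonymise(values, prefix="", start_at=1, seed=0):
    """Map each distinct value to a pseudonym (illustrative sketch only)."""
    # Collect distinct values in first-seen order.
    distinct = list(dict.fromkeys(values))
    # A seeded generator gives the same assignment order for the same
    # seed and the same input (assumption: the transform behaves similarly).
    rng = random.Random(seed)
    rng.shuffle(distinct)
    # Pad numbers with 0s so that all pseudonyms have the same length.
    # Assumes start_at is not negative.
    last = start_at + len(distinct) - 1
    width = len(str(last))
    return {
        value: f"{prefix}{number:0{width}d}"
        for number, value in enumerate(distinct, start=start_at)
    }

dobs = ["1980-02-01", "1975-11-23", "1980-02-01", "1999-07-04"]
mapping = pseudonymise(dobs, prefix="DOB", start_at=1, seed=42)
# The new column holds one pseudonym per row; equal values share a pseudonym.
new_column = [mapping[value] for value in dobs]
```

Equal input values always receive the same pseudonym, and changing `seed` changes which value receives which number.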

 

Notes

The order in which pseudonyms are assigned to values is randomised, to make the pseudonymisation harder to reverse.

The default Random seed value is based on the system clock when the transform was first created.

0s are added to pseudonyms, where necessary, to make them all the same length.

You should generally use a different Prefix for each column pseudonymised.

Any change to the input dataset may change the mapping between values and pseudonyms.

To create a lookup table to reverse the pseudonymisation:

- Use Remove Cols to remove all columns apart from the original value column and the corresponding pseudonym column.

- Use Dedupe to remove the duplicate rows.

- You can then output this to a file and use it with Lookup to restore the original values.
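
The three steps above can be mirrored in a short Python sketch. It is a rough analogue on made-up data, not how the Remove Cols, Dedupe, and Lookup transforms are implemented; the row layout, names, and pseudonyms below are all hypothetical.

```python
# Hypothetical pseudonymised dataset: (name, date of birth, pseudonym).
rows = [
    ("Ann", "1980-02-01", "DOB2"),
    ("Bob", "1975-11-23", "DOB1"),
    ("Cy", "1980-02-01", "DOB2"),
]

# Step 1 (like Remove Cols): keep only the original value and its pseudonym.
pairs = [(dob, pseudonym) for _, dob, pseudonym in rows]

# Step 2 (like Dedupe): drop duplicate rows, keeping first-seen order.
lookup_table = list(dict.fromkeys(pairs))

# Step 3 (like Lookup): restore original values in a dataset that
# contains only the pseudonyms.
shared = [("Ann", "DOB2"), ("Bob", "DOB1"), ("Cy", "DOB2")]
by_pseudonym = {pseudonym: dob for dob, pseudonym in lookup_table}
restored = [(name, by_pseudonym[pseudonym]) for name, pseudonym in shared]
```

Anyone holding the lookup table can reverse the pseudonymisation, so it should be stored separately from the pseudonymised data.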

 

Pseudonymizing data


 

Recovering pseudonymised data


 

See also

Hash

Row Num

UUID