Description

Adds a new column with pseudonymised values for the selected column.

Pseudonymising a column of dates of birth:

Inputs

One.

•Uncheck automatic new column name to set a name for the newly created column in New column name.

•Set Column to the column you want to pseudonymise.

•Set Prefix to the text to be prepended to each pseudonym. It can be left blank.

•Set Start at to the number at which pseudonym numbering starts.

•Set Random seed to the seed value used by the pseudo-random number generator. This determines the order in which pseudonyms are assigned to values.

•The order in which pseudonyms are assigned to values is randomized, to make them harder to reverse.

•The default Random seed value is based on the system clock when the transform was first created.

•0s are added to pseudonyms, where necessary, to make them all the same length.

•You should generally use a different Prefix for each column pseudonymised.

•You can leave Prefix empty if you just want to assign a number to each value ('Label encoding').

•Any change to the input dataset may change the mapping between values and pseudonyms.

•To create a lookup table to reverse the pseudonymisation:

oUse Remove Cols to remove all columns apart from the original value column and the corresponding pseudonym column.

oUse Dedupe to remove the duplicate rows.

oYou can then output this to a file and use it with Lookup to restore the original values.
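
The behaviour described above can be illustrated outside the application with a minimal Python sketch. This is not the transform's actual implementation: the function name pseudonymise, its parameters and the sample rows are hypothetical, and Python's random module is assumed as a stand-in for whatever pseudo-random generator the transform uses, so the pseudonyms it produces will not match those produced by the transform. It only shows the general idea: distinct values are shuffled using a seed, numbered from a starting value, padded with 0s to a common length, and then reversed through a deduplicated two-column lookup table.

```python
import random


def pseudonymise(values, prefix="", start_at=1, seed=0):
    """Hypothetical helper: map each distinct value to prefix + padded number."""
    # Sort first so that, for the same input, the seed alone decides the order.
    distinct = sorted(set(values))
    random.Random(seed).shuffle(distinct)
    # Pad with 0s so every pseudonym has the same length.
    width = len(str(start_at + len(distinct) - 1))
    return {
        value: f"{prefix}{number:0{width}d}"
        for number, value in enumerate(distinct, start=start_at)
    }


# Made-up sample data: two people share a date of birth.
rows = [
    {"name": "Ann", "dob": "1980-02-14"},
    {"name": "Bob", "dob": "1975-11-30"},
    {"name": "Cat", "dob": "1980-02-14"},
    {"name": "Dan", "dob": "1992-07-01"},
]

mapping = pseudonymise([r["dob"] for r in rows], prefix="DOB", start_at=8, seed=42)
pseudonymised = [{**r, "dob_pseudonym": mapping[r["dob"]]} for r in rows]

# Lookup table: keep only the value and pseudonym columns, then drop duplicates.
lookup = sorted({(r["dob"], r["dob_pseudonym"]) for r in pseudonymised})

# Reversing: look each pseudonym up to restore the original value.
reverse = {pseudonym: value for value, pseudonym in lookup}
restored = [reverse[r["dob_pseudonym"]] for r in pseudonymised]
```

With start_at=8 and three distinct values, the numbers 8, 9 and 10 are padded to two digits, so every pseudonym has the same length. Adding or removing input values changes the shuffled order, which is why the mapping cannot be assumed to stay stable when the input dataset changes.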