Dedupe a dataset


If you want to remove duplicate entries from a dataset, use the Unique transform. For example, to remove rows with a duplicated email from this dataset:


dedupe example


To get this dataset:


deduped example


Drag the dataset file onto the Center pane of Easy Data Transform.


dedupe excel sheet


Select the dataset then click the Unique transform in the Left pane.


dedupe example


Set the Email column to Keep unique in the Right pane. Set the First and Last columns to Keep first.


dedupe example


Only one row is kept for each email. The first and last names are taken from the first row with that email in the current row order. Use Sort if you want to change the order before removing duplicates.
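For readers who find it easier to reason about this in code, the following is a minimal Python sketch of the same idea: keep one row per Email and take the other columns from the first row seen. It is an illustration only, not how Easy Data Transform is implemented. The sample rows and the function name keep_first_per_email are hypothetical; only the column names come from the example above.

```python
# Illustrative sketch only: this is NOT Easy Data Transform's implementation.
# The sample rows are hypothetical; column names follow the example above.
rows = [
    {"First": "Alice", "Last": "Smith", "Email": "alice@example.com"},
    {"First": "Bob", "Last": "Jones", "Email": "bob@example.com"},
    {"First": "Ali", "Last": "Smyth", "Email": "alice@example.com"},
]


def keep_first_per_email(rows):
    """Keep one row per Email value, using the first row seen for each email."""
    seen = set()
    result = []
    for row in rows:
        if row["Email"] not in seen:
            seen.add(row["Email"])
            result.append(row)
    return result


# Deduplicate in the original row order.
deduped = keep_first_per_email(rows)

# Sorting first changes which row counts as the "first" for each email.
deduped_sorted = keep_first_per_email(sorted(rows, key=lambda r: r["First"]))

for row in deduped:
    print(row["First"], row["Last"], row["Email"])
```

In this sketch the unsorted run keeps the "Alice Smith" row, while sorting by First beforehand keeps the "Ali Smyth" row instead, which is why the row order matters.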


If you only want to remove rows that have the same first name, the same last name and the same email, set the First, Last and Email columns to Keep unique.
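The difference between matching on one column and matching on several can be sketched in Python by building the comparison key from a list of columns. This is a hedged illustration, not the tool's own logic. The helper dedupe_on and the sample rows are hypothetical.

```python
# Illustrative sketch only: dedupe_on is a hypothetical helper, not a tool API.
rows = [
    {"First": "Alice", "Last": "Smith", "Email": "alice@example.com"},
    {"First": "Alice", "Last": "Smith", "Email": "alice@example.com"},  # exact duplicate
    {"First": "Ali", "Last": "Smyth", "Email": "alice@example.com"},    # same email only
]


def dedupe_on(rows, key_columns):
    """Keep the first row for each distinct combination of values in key_columns."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


# Matching on Email alone treats all three rows as duplicates of each other.
by_email = dedupe_on(rows, ["Email"])

# Matching on all three columns removes only the exact duplicate.
by_all = dedupe_on(rows, ["First", "Last", "Email"])

print(len(by_email), len(by_all))
```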


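Note that de-duplication is sensitive to whitespace and case: values that differ only in capitalization or in leading or trailing spaces are treated as different. You might need to apply the Whitespace and Case transforms before the Unique transform.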
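To see why whitespace and case matter, here is a small Python sketch that compares email values before and after normalization. It assumes that trimming leading and trailing whitespace and lower-casing are the clean-up steps you want; the normalize function and the sample values are hypothetical and are not the Whitespace or Case transforms themselves.

```python
# Illustrative sketch only: normalize is a hypothetical stand-in for cleaning
# values before comparison, not the tool's Whitespace or Case transform.
emails = ["bob@example.com", "Bob@Example.com", " bob@example.com "]


def normalize(value):
    """Trim leading/trailing whitespace and lower-case the value."""
    return value.strip().lower()


# Compared as-is, all three values count as different.
distinct_raw = set(emails)

# After normalization they collapse to a single value.
distinct_clean = {normalize(email) for email in emails}

print(len(distinct_raw), len(distinct_clean))
```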


See the Unique documentation for a more detailed example.