Dedupe a dataset
If you want to remove duplicate entries from a dataset, use the Unique transform. For example, to remove the 2 rows that have the same email from this dataset:
To get this dataset:
Drag the dataset file onto the Center pane of Easy Data Transform.
Select the dataset then click the Unique transform in the Left pane.
Set the Email column to Keep unique in the Right pane. Set the First and Last columns to Keep first.
Only one row is kept for each email. The first and last names are taken from the first row with that email in the current row order. Use the Sort transform before Unique if you want a different row to be kept.
If you only want to remove rows with the same first name, same last name and same email, set First, Last and Email columns to Keep unique.
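For readers who think in code, the keep-first behavior described above can be sketched in a few lines of Python. This is only an illustration of the logic, not how Easy Data Transform implements the Unique transform. The function name `keep_first_unique` and the sample rows are made up for this example.

```python
def keep_first_unique(rows, key_columns):
    """Keep the first row seen for each distinct combination of key_columns."""
    seen = set()
    kept = []
    for row in rows:
        # The key is the tuple of values in the columns being kept unique.
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Hypothetical sample data (not the dataset from this page).
rows = [
    {"First": "Alice", "Last": "Smith", "Email": "alice@example.com"},
    {"First": "Bob", "Last": "Jones", "Email": "bob@example.com"},
    {"First": "Ally", "Last": "Smith", "Email": "alice@example.com"},
    {"First": "Bob", "Last": "Jones", "Email": "bob@example.com"},
]

# Unique on Email only: one row per email, names from the first occurrence.
by_email = keep_first_unique(rows, ["Email"])

# Unique on First, Last and Email: only fully identical rows are removed.
by_all = keep_first_unique(rows, ["First", "Last", "Email"])

print(len(by_email), len(by_all))
```

With this sample data, deduplicating on Email alone drops both later rows, while deduplicating on all three columns keeps the "Ally" row because its first name differs.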
Note that deduplication is sensitive to whitespace and case, so values that differ only in spacing or capitalization are treated as different. You might need to apply the Whitespace and Case transforms before the dedupe.
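The effect of whitespace and case can be illustrated with a small Python sketch. It assumes that trimming surrounding whitespace and lower-casing is the cleanup you want, which roughly stands in for applying Whitespace and Case transforms first. The helper `dedupe_by` and the sample rows are hypothetical.

```python
def dedupe_by(rows, column):
    """Keep the first row for each exact value found in the given column."""
    seen = set()
    kept = []
    for row in rows:
        if row[column] not in seen:
            seen.add(row[column])
            kept.append(row)
    return kept


# Hypothetical sample data: the second email differs only in spacing and case.
rows = [
    {"First": "Alice", "Email": "alice@example.com"},
    {"First": "Alice", "Email": " Alice@Example.com "},
    {"First": "Bob", "Email": "bob@example.com"},
]

# Exact comparison treats the first two emails as different values.
raw_result = dedupe_by(rows, "Email")

# Clean the column first (trim whitespace, lower-case), then dedupe.
cleaned = [{**row, "Email": row["Email"].strip().lower()} for row in rows]
clean_result = dedupe_by(cleaned, "Email")

print(len(raw_result), len(clean_result))
```

Without the cleanup step all three rows survive. After it, the two Alice rows collapse into one.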
See the Unique documentation for a more detailed example.