Datasets will often contain statistical outliers, particularly when the dataset is collected from self-reported sources, such as online surveys. These outliers can badly skew the results, so they need to be dealt with before the dataset can be used. Easy Data Transfrom can help you easily find and deal with statistical outliers.
The simplest way to find outliers is to profile the data by column. This will allow you to find suspicious looking minimum and maximum values. It will also show any non-numeric values.
You can then edit the original data or use a Replace transform (to replace problem values) or a Filter transform (to remove problem rows).
The Outliers transform allows a more sophisticated approach finding statistical outliers in data, based on either the IQR (inter quartile range) or the standard deviation of the data in a column. Check show advanced in the Left pane to show the Outliers transform.
Select the column of data you want to analyze for outliers. Set Score by depending if you want to score values based on either the IQR or standard deviation. Set Threshold depending according to how wide a range you expect the data to have (where the upper and lower fences are).
Check add score column to add a column showing the score for each value.
The Warnings tab warns you about any non-numeric values, and the Info tab gives you information on the IQR/standard deviation.
Set Action according to the action to take for values identified as outliers:
See the video above for more details.
Easy Data Transform can also help with merging, cleaning, filtering, enriching and reshaping your data. All with a few clicks and no coding required.
v1.39.1 for Windows 11 / 10 / 8 / 7 (46 MB)
v1.39.1 for Mac 13.x to 10.13 (77 MB)
Questions or problems?