Real-world data is often messy and inexact. The name of a company might be spelled in several ways, with different uses of spaces and punctuation. For example, McDonalds might be spelled “McDonald’s”, “McDonalds”, “McDonald”, “Macdonalds”, “Mc Donalds” or “McDonald’s Corp.”. You can still cross-reference two inexactly matching datasets using the Fuzzy matching option in Easy Data Transform’s Lookup transform.
Easy Data Transform scores fuzzy matches against “McDonald’s” (case insensitive) as:
Text | Fuzzy match score |
---|---|
McDonald’s | 100% |
McDonalds | 90% |
McDonald | 80% |
Macdonalds | 80% |
Mc Donalds | 80% |
McDonald’s Corp. | 62% |
McDonald’s Corporation | 45% |
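This page does not say which algorithm Easy Data Transform uses to compute these scores. As a rough illustration only, the sketch below assumes a similarity based on Levenshtein edit distance divided by the length of the longer string, rounded down to a whole percentage. That assumption happens to reproduce the scores in the table above, but the function names `levenshtein` and `fuzzy_score` are hypothetical and are not part of Easy Data Transform.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character inserts,
    deletes and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def fuzzy_score(a: str, b: str) -> int:
    """Assumed scoring: case-insensitive similarity as a whole percentage,
    rounded down. Not confirmed to be Easy Data Transform's method."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 100  # two empty strings are treated as identical
    return (longest - levenshtein(a, b)) * 100 // longest


if __name__ == "__main__":
    target = "McDonald's"
    for text in ["McDonald's", "McDonalds", "McDonald", "Macdonalds",
                 "Mc Donalds", "McDonald's Corp.", "McDonald's Corporation"]:
        print(f"{text}: {fuzzy_score(target, text)}%")
```

Under this assumption, long suffixes such as “Corporation” lower the score sharply because every extra character counts as one edit.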
You can try fuzzy matching with Easy Data Transform yourself:
Note that you can set Bottom value column to the same as Bottom key column to see the matches.
If you want to do a fuzzy join, first do a fuzzy lookup to add a column to the top dataset. Then use this column as the key column to join the two datasets with a Join transform.
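For readers who want to see the same two-step idea outside the application, here is a minimal sketch in Python. It is not how Easy Data Transform works internally: it uses the standard library's `difflib.get_close_matches` as a stand-in matcher, and the function name `fuzzy_join`, the `matched_key` column, the sample rows and the `cutoff` value are all assumptions made for illustration.

```python
from difflib import get_close_matches


def fuzzy_join(top_rows, bottom_rows, top_key, bottom_key, cutoff=0.75):
    """Two-step fuzzy join (illustrative only).

    Step 1: fuzzy lookup adds a 'matched_key' column to each top row.
    Step 2: that column is used as an exact key to join in the bottom row.
    """
    # Index the bottom rows by lowercased key for case-insensitive matching.
    by_key = {row[bottom_key].lower(): row for row in bottom_rows}
    candidates = list(by_key)

    joined = []
    for row in top_rows:
        # Step 1: find the single closest bottom key, if any is close enough.
        matches = get_close_matches(row[top_key].lower(), candidates,
                                    n=1, cutoff=cutoff)
        matched = by_key[matches[0]][bottom_key] if matches else None
        enriched = {**row, "matched_key": matched}

        # Step 2: exact join on the matched key.
        if matched is not None:
            enriched.update(by_key[matched.lower()])
        joined.append(enriched)
    return joined


if __name__ == "__main__":
    # Hypothetical sample data.
    top = [{"company": "Mc Donalds", "sales": 120},
           {"company": "Burger King", "sales": 80}]
    bottom = [{"name": "McDonald's", "country": "US"},
              {"name": "Burger King", "country": "US"}]
    for row in fuzzy_join(top, bottom, "company", "name"):
        print(row)
```

Keeping the matched key as its own column makes it easy to inspect the matches before trusting the join.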
Note that fuzzy matching is significantly slower than exact matching, especially if you are matching long items of text.
See the video above for more details.
Lookup is just one of the 66 transforms that Easy Data Transform supports. Easy Data Transform can also help with converting, cleaning, blending, filtering and enriching your data. All without coding.