Fuzzy matching


Easy Data Transform supports fuzzy matching in various transforms, including the Cluster, Dedupe, If, Filter and Lookup transforms.

 

Fuzzy matching allows you to match two values that are similar, but not identical. For example, if you set the fuzzy match closeness to 80%, then two values that are at least 80% the same are considered a match.

 

Fuzzy matching various values against this value:

 

100 avenue street, townsville, ohio

 

Gives:

 

Value                                         Fuzzy closeness
100 avenue street, townsville, ohio           100%
100 avnue street, townsville, ohio            98%
100 avenue street townsville ohio             95%
100 avenue st., townsville, ohio              89%
100 avenue st, townsville                     72%
100 av. st., citysville, texas                52%
townsville, ohio                              46%
742 evergreen terrace, springfield, oregon    36%

 

Fuzzy matching treats everything as a string and takes account of whitespace. It can optionally take account of case.

 

Fuzzy matching is significantly slower than exact matching. The closeness score is based on Levenshtein distance.
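
As an illustration only, the following Python sketch shows one way a Levenshtein-based closeness score and an 80% threshold could work. The function names (levenshtein, closeness, is_fuzzy_match) and the normalization (1 minus the distance divided by the length of the longer value) are assumptions made for this example. Easy Data Transform's exact formula and rounding are not documented here, so the scores may not reproduce the table above exactly.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions and substitutions needed to turn one string into the other."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the row short
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # drop a character from a
                current[j - 1] + 1,      # add a character from b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def closeness(a: str, b: str, case_sensitive: bool = False) -> float:
    """Score from 0.0 to 1.0. The normalization is an assumption:
    1 - distance / length of the longer value."""
    if not case_sensitive:
        a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty values are identical
    return 1.0 - levenshtein(a, b) / longest


def is_fuzzy_match(a, b, threshold: float = 0.8,
                   case_sensitive: bool = False) -> bool:
    """Treat both values as strings and compare the score to the threshold."""
    return closeness(str(a), str(b), case_sensitive) >= threshold


if __name__ == "__main__":
    target = "100 avenue street, townsville, ohio"
    for value in ("100 avnue street, townsville, ohio", "townsville, ohio"):
        print(value, round(closeness(target, value) * 100), is_fuzzy_match(target, value))
```

Note that whitespace counts as an ordinary character in this sketch, so a missing space lowers the score. Computing the distance takes time proportional to the product of the two value lengths, which is one reason fuzzy matching is slower than exact matching.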