Fuzzy matching


Easy Data Transform supports fuzzy matching in various transforms, including the Cluster, Dedupe, If, Filter and Lookup transforms.

 

Fuzzy matching allows you to match two values that are similar, but not identical. For example, if you set fuzzy match closeness to 80%, then two values that are at least 80% the same are considered a match.

 

For example, doing a fuzzy match to this value:

 

100 avenue street, townsville, ohio

 

Gives:

 

Value                                         Fuzzy closeness
100 avenue street, townsville, ohio           100%
100 avnue street, townsville, ohio            98%
100 avenue street townsville ohio             95%
100 avenue st., townsville, ohio              89%
100 avenue st, townsville                     72%
100 av. st., citysville, texas                52%
townsville, ohio                              46%
742 evergreen terrace, springfield, oregon    36%

 

Fuzzy matching treats everything as text and takes whitespace into account, so values that differ only in spacing are not a 100% match. It can optionally take case into account.

 

Fuzzy matching is significantly slower than exact matching. The closeness score is based on Levenshtein distance.
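
To give a feel for how a Levenshtein-based closeness score and a closeness threshold can work, here is a minimal Python sketch. It is not Easy Data Transform's implementation. The function names (`levenshtein`, `closeness`, `is_fuzzy_match`) are hypothetical, and the normalization (edit distance divided by the length of the longer value) is an assumption, so the percentages it produces will not necessarily agree with the table above.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def closeness(a: str, b: str, case_sensitive: bool = False) -> float:
    """Closeness as a percentage from 0 to 100.

    Assumption: the distance is normalized by the longer length.
    Whitespace is always significant; case is optional.
    """
    if not case_sensitive:
        a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty values are identical
    return 100.0 * (1 - levenshtein(a, b) / longest)


def is_fuzzy_match(a: str, b: str, threshold: float = 80.0,
                   case_sensitive: bool = False) -> bool:
    """True if the two values are at least `threshold` percent the same."""
    return closeness(a, b, case_sensitive) >= threshold


if __name__ == "__main__":
    target = "100 avenue street, townsville, ohio"
    for value in ("100 avnue street, townsville, ohio", "townsville, ohio"):
        print(value, round(closeness(target, value)), is_fuzzy_match(target, value))
```

Each comparison fills a grid whose size is the product of the two value lengths, which is one reason fuzzy matching costs much more than an exact comparison.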