Join

<< Click to Display Table of Contents >>

Navigation:  Reference > Transforms >

Join

Description

Merge two inputs based on common (key) columns, e.g. on an email address or ID column present in both inputs. The bottom dataset (by vertical position) is joined to the right of the top dataset.

 

Examples

Join two datasets side-by-side matching the 'ID' columns, keeping only IDs in both datasets:

 

how to join 2 datasets uisng a unique id example

 

Join 2 datasets side-by-side matching the row numbers, keeping all rows:

 

Join by row number

 

Inputs

Two.

 

Options

The output depends on the vertical (Y-axis) position of the inputs.

Click Explore Keys... to compare key values in the 2 datasets.

Set By to:

oKey to  join by common key columns.

oRow number to join by the row number of the dataset.

Select Top key column for the column you want to match in the top input dataset.

Select include top non-matching rows if you want to include in the output any rows in the top input with no matching value in the bottom input.

Select Bottom key column for the column you want to match in the bottom input dataset.

Select include bottom non-matching rows if you want to include in the output any rows in the bottom input with no matching value in the top input.

Check detailed report to show detailed information in the Warnings and Info tabs on duplicate keys in each dataset, overlap of keys in the 2 datasets and leading and trailing whitespace in keys. This slows down the join significantly.

Check case sensitive to use case sensitive matching for keys.

 

include top non-matching rows checked

include bottom non-matching rows checked

Also known as:

No

No

Inner join

No

Yes

Right outer join

Yes

No

Left outer join

Yes

Yes

Full outer join

 

Notes

Join merges two datasets side-by-side (horizontally). To merge datasets one on top of the other (vertically) use Stack.

If there are 10k rows or less in both datasets, Easy Data Transform will try to guess sensible default values for Top key column and Bottom key column based on column header names and contents.

All values are treated as text. You can use Whitespace to remove whitespace before the join.

If a key value occurs M times in the first dataset and N times in the second dataset, you will get M x N rows with this key value. You can use Unique to remove rows with duplicate key values.

If you need more than one column for the key, use Concat Cols to create that column.

Use Row Num to create a unique key column.

Use Sort to sort the results.

Use the Cross transform for cross joins.

Messages are shown in the Warnings tab for keys that occur more than once in either dataset (duplicates).

Messages are shown in the Info tab for keys that only occur in 1 of the 2 datasets (misses).

Use Subtract to see rows not joined from the top or bottom dataset (an 'anti-join'). The Clone is necessary to get the correct Y position of the bottom dataset for the Subtract.

Join subtract

Cascade multiple joins to join more than 2 datasets.

 

cascading join example

 

See also

Video: How to join Excel files

Cross

Lookup

Interpolate

Merge datasets