Verify

<< Click to Display Table of Contents >>

Navigation:  Reference > Transforms >

Verify

Description

Verify that values in the dataset are as expected.

 

Examples

Show an error and stop processing if the dataset doesn't have 9 columns:

 

Verify number of columns

 

Remove any rows where there is an empty (0 length) value in the 'name' column:

 

Remove rows with an empty value

 

Show a warning message and color any duplicate values in the 'id' column:

 

Show a warning message for any duplicate values in the 'id' column

 

Check for invalid telephone numbers or email addresses and keep only rows with validation fails:

 

Check for invalid telephone numbers or email addresses

 

Inputs

One.

 

Options

Choose the rules you wish to verify, for the dataset and for each column.

oCheck each verification rule you wish to apply.

oSet Value for rules that require a value.

oSet Severity according to the severity of a rule fail.

oHover over the rules to see tooltips with more information on each rule.

 

The rules available for each column depends on the Column type selected.

 

Select verify column type

 

 

Description (in alphabetic order)

Notes

Available for

Case sensitive option applies

Applies

At least 1 non-empty

Verify that at least 1 value in the column is not empty. A value containing whitespace is not considered empty.

Column type=Text

Column type=Numeric (real)

Column type=Numeric (integer)

Column type=Date

No

Per column

Contains

Verify that every value in the column contains the text supplied. Matching is sensitive to whitespace.

Column type=Text

Column type=Numeric (real)

Column type=Numeric (integer)

Yes

Per value in column

Don't allow listed values

Verify that no values in the column belong to this list. Separate multiple values by Commas. Matching is sensitive to whitespace.

Column type=Text

Yes

Per value in column

Ends with

Verify that every value in the column ends with the text supplied. Matching is sensitive to whitespace.

Column type=Text

Column type=Numeric (real)

Column type=Numeric (integer)

Yes

Per value in column

Integer except listed

Verify that every value in the column is an integer (whole) number or belongs to the list of special values supplied. Separate multiple values by Commas. List matching is sensitive to whitespace.

Column type=Numeric (integer)

Yes (for listed special values)

Per value in column

Is local file

Verify that every value in the column is the location (path) of an existing file on this computer.

Column type=Text

No

Per value in column

Is local folder

Verify that every value in the column is the location (path) of an existing folder on this computer.

Column type=Text

No

Per value in column

Is lower case

Verify that any letters are lower case.

Column type=Text

No

Per value in column

Is sentence case

Verify that any values are sentence case. E.g.: Sentence case.

Column type=Text

No

Per value in column

Is title case

Verify that any values are title case. E.g.: Title Case.

Column type=Text

No

Per value in column

Is upper case

Verify that any letters are upper case.

Column type=Text

No

Per value in column

Is valid EAN13

Verify that every value in the column is a valid EAN13 barcode number (13 digits, including a valid checksum digit). Whitespace is ignored.

Column type=Numeric (integer)

No

Per value in column

Is valid email

Verify that every value in the column is a valid email address. E.g.:

email@email.com

EMAIL@EMAIL.COM

Does not check if the email actually exists.

Column type=Text

No

Per value in column

Is valid telephone num.

Verify that every value in the column is a valid telephone number. E.g.:

(123)-123-1234

1-123-123-1234

123.123.1234

+12 (123) 123-1234

11231231234

+1 123 123-1234

Does not check if the phone number  actually exists.

Column type=Text

Column type=Numeric (integer)

No

Per value in column

Is valid UPC-A

Verify that every value in the column is a valid UPC-A barcode number (12 digits, including a valid checksum digit). Whitespace is ignored.

Column type=Numeric (integer)

No

Per value in column

Match column name

Verify that column has the supplied name. Matching is sensitive to whitespace.

Column type=Text

Column type=Selection

Column type=Numeric (real)

Column type=Numeric (integer)

Column type=Date

Yes

Per column

Matches regex

Verify that every value in the column matches the regular expression supplied. E.g. $x.*$ to match values that start with x. Matching is sensitive to whitespace.

Column type=Text

Yes

Per value in column

Maximum characters

Verify the maximum number of characters in each value in the column. Whitespace characters are counted.

Column type=Text

Column type=Numeric (real)

Column type=Numeric (integer)

Column type=Date

No

Per value in column

Maximum number of cols

Verify the maximum number of columns in the dataset.

Dataset

No

Per dataset

Maximum number of rows

Verify the minimum number of rows in the dataset.

Dataset

No

Per dataset

Maximum value

Verify the maximum value in the column. Numbers equal to this value are ok.

Column type=Numeric (real)

No

Per value in column

Verify the maximum value in the column. Numbers equal to this value are ok.

Column type=Numeric (integer)

No

Per value in column

Verify the last allowed date in the column. Dates equal to this value are ok.

Column type=Date

No

Per value in column

Minimum characters

Verify the minimum number of characters in each value in the column. Whitespace characters are counted.

Column type=Text

Column type=Numeric (real)

Column type=Numeric (integer)

Column type=Date

No

Per value in column

Minimum number of cols

Verify the minimum number of columns in the dataset.

Dataset

No

Per dataset

Minimum number of rows

Verify the minimum number of rows in the dataset.

Dataset

No

Per dataset

Minimum value

Verify the minimum value in the column. Numbers equal to this value are ok.

Column type=Numeric (real)

No

Per value in column

Verify the minimum value in the column. Numbers equal to this value are ok.

Column type=Numeric (integer)

No

Per value in column

Verify the first allowed date in the column. Dates equal to this value are ok.

Column type=Date

No

Per value in column

No blank values

Verify that no value is all whitespace characters.

Column type=Text

No

Per value in column

No Carriage Returns

Verify that values do not contain Carriage Return characters.

Column type=Text

No

Per value in column

No currency

Verify that values do not contain currency characters. E.g. $, £ or .

Column type=Text

No

Per value in column

No digits

Verify that values do not contain digit characters E.g. 0 or 9.

Column type=Text

No

Per value in column

No double spaces

Verify that values do not contain consecutive Space characters.

Column type=Text

No

Per value in column

No duplicate col names

Verify that no 2 column names in the dataset are the same. Matching is sensitive to whitespace.

Dataset

Yes

Per column

No duplicate values

Verify that no value in the column appears more than once. Matches as text. Matching is sensitive to whitespace.

Column type=Text

Yes

Per value in column

Verify that no value in the column appears more than once. Matches as integers. E.g. 1, +1, and 01 are duplicates.

Column type=Numeric (integer)

No

Per value in column

Verify that no value in the column appears more than once. Matches as dates. E.g. 01/01/2000 and 1/1/2000 are duplicates.

Column type=Date

No

Per value in column

No empty rows

Verify that no rows in the dataset have an empty value for every column. A value containing whitespace is not considered empty.

dataset

No

Per row

No empty values

Verify that no values in the column are empty. A value containing whitespace is not considered empty.

Column type=Text

Column type=Numeric (real)

Column type=Numeric (integer)

Column type=Date

No

Per value in column

No gaps in values

Verify every possible value between the minimum and maximum values in the column occurs at least once. E.g. 4, 1, 2, 4 has a gap at 3.

Column type=Numeric (integer)

No

Per column

Verify every possible date between the minimum and maximum values in the column occurs at least once. E.g. for dd/MM/yyyy format 01/01/2000, 02/01/2000 and 04/01/2000 has a gap at 03/01/2000.

Column type=Date

No

Per column

No leading/trailing whitespace

Verify that values do not contain leading or trailing whitespace characters.

Column type=Text

No

Per value in column

No Line Feeds

Verify that values do not contain Line Feed characters.

Column type=Text

No

Per value in column

No non-ASCII

Verify that values do not contain non-ASCII characters.

Column type=Text

No

Per value in column

No non-printable

Verify that values do not contain non-printable characters, such as Null.

Column type=Text

No

Per value in column

No punctuation

Verify that values do not contain punctuation characters. E.g. !, ; or ..

Column type=Text

No

Per value in column

No symbols

Verify that values do not contain symbol characters. E.g. $, +, = or >.

Column type=Text

No

Per value in column

No Tab characters

Verify that values do not contain Tab characters.

Column type=Text

No

Per value in column

No whitespace

Verify that values do not contain whitespace characters. E.g. Space, Tab, Carriage Return or Line Feed.

Column type=Text

Column type=Numeric (real)

Column type=Numeric (integer)

Column type=Date

No

Per value in column

Numeric except listed

Verify that every value in the column is a real number or belongs to the list of special values supplied. Separate multiple values by Commas. List matching is sensitive to whitespace.

Column type=Numeric (real)

Yes (for special listed values)

Per value in column

Only allow listed values

Verify that all values in the column belong to this list. Separate multiple values by Commas. Matching is sensitive to whitespace.

Column type=Selection

Yes

Per value in column

Require listed values

Verify that every listed value appears at least once in the column, matching as text. Separate multiple values by Commas. Matching is sensitive to whitespace.

Column type=Text

Column type=Numeric (integer)

Yes

Per column

Starts with

Verify that every value in the column starts with the text supplied. Matching is sensitive to whitespace.

Column type=Text

Column type=Numeric (real)

Column type=Numeric (integer)

Yes

Per value in column

Valid date in format

Verify that every value in the column is a valid date in the supplied date format. The rule will fail if either:

The date is in the wrong format. E.g. 31/01/2000 does not match format yyyy/MM/dd.

The date is in the right format, but does not exist. E.g. 2000/02/30 is a non existent date in format yyyy/MM/dd.

Column type=Date

No

Per value in column

 

Check case sensitive to use case sensitive matching. Case sensitivity is only relevant to some rules (see the table above).

Set Filter according to which rows you wish to keep. E.g. you might want to remove rows with warnings.

 

Notes

Any verification fails are color-coded in the Right pane data table. Hover over a header or value for information on the rule(s) failed. Click on the color buttons to change the colors used.

 

Verify color coding

 

A summary of all the verification rules and whether they passed or failed is output to the Info tab.

A summary of failed verification rules with Severity set to Warning or Error is output to the Warning tab.

Checking a rule may hide other, mutually exclusive, rules. E.g. checking No empty values for a column will hide the Minimum characters rule.

All date verification rules use the date format in Valid date in format rule, if set. Otherwise Easy Data Transform will attempt to infer the date format from Preferences>Dates and the values in each column.

The locale  is used to decide the validity of numbers and dates. E.g. '1,234.5' is a valid real number in some locales.

Some rules have associated lists of values. The list is Comma separated, consequently, values cannot contain Commas.

If you want to verify the relationship between 2 columns (e.g. the value is column A is always greater than the value in column B for the same row) use a transform, such as Compare Cols or If and then use Verify on the new column.

 

See also

Schemas

Filter