Ngram

Description

Counts the number of times each sequence of consecutive words appears in the selected column.

 

Examples

Find ngrams of 2 to 3 word length (bigrams and trigrams) in the 'Keywords' column:

[Image: how to create ngrams]

 

Sum the impressions associated with each 2 to 3 word ngram:

[Image: how to sum ngrams]

 

Inputs

One.

 

Options

Select the Column you wish to analyze for ngrams.

Set Minimum N to the minimum number of words in an ngram.

Set Maximum N to the maximum number of words in an ngram.

Set Sum to:

    o Rows to count the number of rows containing the ngram.

    o A column to sum numeric values in that column for all rows containing the ngram.

Uncheck case sensitive to convert everything to lower case before counting ngrams.
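
The options above can be illustrated with a minimal Python sketch. It is not the transform's actual implementation: the function name ngram_totals, its parameters, and the sample column names are hypothetical. It assumes each row contributes at most once per distinct ngram (following the wording of the Rows option) and that a non-numeric value in the Sum column adds nothing to the total.

```python
import re
from collections import defaultdict

# Words are runs of letters, digits and apostrophes; everything else separates words.
WORD = re.compile(r"(?:[^\W_]|')+")


def ngram_totals(rows, column, min_n, max_n, sum_column=None, case_sensitive=False):
    """Return {ngram: total} for the ngrams of min_n..max_n words found in `column`.

    sum_column=None mimics Sum = Rows (count rows containing the ngram);
    otherwise the numeric values of `sum_column` are summed per ngram.
    """
    totals = defaultdict(int)
    for row in rows:
        text = row[column]
        if not case_sensitive:
            text = text.lower()
        words = WORD.findall(text)

        # Collect each distinct ngram in this row once (assumption, see lead-in).
        grams = set()
        for n in range(min_n, max_n + 1):
            for i in range(len(words) - n + 1):
                grams.add(" ".join(words[i:i + n]))

        if sum_column is None:
            weight = 1
        else:
            try:
                weight = float(row[sum_column])
            except (TypeError, ValueError):
                weight = 0.0  # assumption: non-numeric values contribute nothing

        for gram in grams:
            totals[gram] += weight
    return dict(totals)


rows = [
    {"Keywords": "buy red shoes", "Impressions": "10"},
    {"Keywords": "Red shoes sale", "Impressions": "5"},
    {"Keywords": "red shoes", "Impressions": "n/a"},
]
row_counts = ngram_totals(rows, "Keywords", 2, 3)
impressions = ngram_totals(rows, "Keywords", 2, 3, sum_column="Impressions")
```

In this sketch, row_counts corresponds to setting Sum to Rows, and impressions corresponds to setting Sum to a column.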

 

Notes

Words are made up of letters, digits and apostrophes ('). All other characters are treated as word separators.

All letters are converted to lower case unless case sensitive is checked.

If you have set Sum to a column then only numeric values in that column will be summed.

The output is sorted by the number of words in the ngram, then by the count and then by the ngram. Use a Sort transform to sort it in a different order.

Use a Replace transform before the Ngram transform to remove/replace any words or letters you don't wish to analyze.
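
The word-splitting and ordering rules in these notes can be sketched in Python as follows. The helper names split_words and sort_output are hypothetical, and the sort directions (fewest words first, highest count first, ngrams alphabetical) are an assumption, since the notes give the sort keys but not their direction.

```python
import re

# Letters, digits and apostrophes form words; all other characters separate them.
WORD = re.compile(r"(?:[^\W_]|')+")


def split_words(text):
    """Split text into words using the rule described in the notes."""
    return WORD.findall(text)


def sort_output(counts):
    """Order (ngram, count) pairs by word count, then count, then ngram.

    Assumed directions: ascending word count, descending count, ascending ngram.
    """
    return sorted(
        counts.items(),
        key=lambda item: (len(item[0].split(" ")), -item[1], item[0]),
    )


words = split_words("Don't stop-believing, 24/7!")
ordered = sort_output(
    {"red shoes": 3, "buy red": 1, "buy red shoes": 1, "shoes sale": 1}
)
```

Note that a hyphenated term such as "stop-believing" is split into two words under this rule, which is one reason to run a Replace transform first if such terms should be kept together.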

 

See also

Count

Unique

Total

Pivot

Stats

Summary