Ngram
Counts the number of times each sequence of consecutive words appears in the selected column.
Find ngrams of 2 to 3 word length (bigrams and trigrams) in the 'Keywords' column:
Sum the impressions associated with each 2 to 3 word ngram:
•Select the Column you wish to analyze for ngrams.
•Set Minimum N to the minimum number of words in an ngram.
•Set Maximum N to the maximum number of words in an ngram.
•Set Sum to:
oRows to count the number of rows containing the ngram.
oA column to sum numeric values in that column for all rows containing the ngram.
•Uncheck case sensitive to convert everything to lower case before counting ngrams.
•Check drilldown to allow double-clicking a row in the data table to drill down to the rows that contributed to this row in the upstream data.
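The effect of these options can be approximated outside the application with a short Python sketch. This is an illustration only, not the transform's actual implementation: the function name ngram_totals, its parameters and the sample rows are hypothetical. It assumes that an ngram repeated within a single row is counted once for that row, and that a non-numeric value in the Sum column contributes 0.

```python
import re
from collections import defaultdict

# Words are runs of letters, digits and apostrophes; anything else separates words.
WORD_RE = re.compile(r"(?:[^\W_]|')+")

def to_number(value):
    """Return value as a float, or 0.0 if it is not numeric (assumed behaviour)."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return 0.0

def ngram_totals(rows, column, min_n, max_n, sum_column=None, case_sensitive=False):
    """Count rows per ngram (sum_column=None) or sum a numeric column per ngram."""
    totals = defaultdict(float if sum_column else int)
    for row in rows:
        text = str(row.get(column, ""))
        if not case_sensitive:
            text = text.lower()
        words = WORD_RE.findall(text)
        # Collect the distinct ngrams of min_n to max_n words in this row.
        ngrams = set()
        for n in range(min_n, max_n + 1):
            for i in range(len(words) - n + 1):
                ngrams.add(" ".join(words[i:i + n]))
        weight = 1 if sum_column is None else to_number(row.get(sum_column))
        for ngram in ngrams:
            totals[ngram] += weight
    return dict(totals)

# Hypothetical sample data.
rows = [
    {"Keywords": "cheap red shoes", "Impressions": 100},
    {"Keywords": "red shoes sale", "Impressions": "50"},
    {"Keywords": "Red Shoes", "Impressions": "n/a"},
]

counts = ngram_totals(rows, "Keywords", 2, 3)                  # Sum set to Rows
impressions = ngram_totals(rows, "Keywords", 2, 3, "Impressions")  # Sum set to a column
print(counts["red shoes"])       # 3
print(impressions["red shoes"])  # 150.0
```

With case_sensitive=True, "Red Shoes" and "red shoes" would be counted as different ngrams.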
•Words are made up of letters, digits and apostrophes ('). All other characters are treated as word separators.
•All letters are converted to lower case, unless case sensitive is checked.
•If you have set Sum to a column then only numeric values in that column will be summed.
•The output is sorted by number of words in the ngram, then the count and then the ngram. Use a Sort transform to sort it in a different order.
•Use a Replace transform before the Ngram transform to remove/replace any words or letters you don't wish to analyze.
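The word-splitting rule and the output order described in these notes can be sketched in Python as follows. This is a rough illustration, not the application's code: split_words and sort_output are hypothetical names, and the sort directions (fewest words first, highest count first, then ngram alphabetically) are an assumption, since the notes do not state them.

```python
import re

# Letters, digits and apostrophes form words; every other character is a separator.
WORD_RE = re.compile(r"(?:[^\W_]|')+")

def split_words(text, case_sensitive=False):
    """Split text into words, lower-casing first unless case_sensitive is set."""
    if not case_sensitive:
        text = text.lower()
    return WORD_RE.findall(text)

def sort_output(counts):
    """Order (ngram, count) pairs by word count, then count, then ngram.

    Assumed directions: ascending word count, descending count, ascending ngram.
    """
    return sorted(
        counts.items(),
        key=lambda item: (len(item[0].split(" ")), -item[1], item[0]),
    )

print(split_words("Men's T-shirts, size_10!"))
# ["men's", 't', 'shirts', 'size', '10']

# Hypothetical counts for illustration.
example = {"red shoes": 3, "cheap red": 1, "cheap red shoes": 1, "shoes sale": 1}
for ngram, count in sort_output(example):
    print(ngram, count)
```

Note that the hyphen and underscore both split words here, which is why replacing or removing such characters beforehand changes which ngrams are found.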