Glossary term
Glossary term
Foundations
An ordered sequence of N words. For example, truly madly is a 2-gram. Because order is relevant, madly truly is a different 2-gram than truly madly.
Many natural language understanding models rely on N-grams to predict the next word that the user will type or say. For example, suppose a user typed happily ever. An NLU model based on trigrams would likely predict that the user will next type the word after.
Contrast N-grams with bag of words, which are unordered sets of words.
See Large language models in Machine Learning Crash Course for more information.
Created for this library
A customer feedback team uses n-gram features in a logistic regression as a quick interpretable baseline before evaluating embeddings.
A search-quality team uses n-gram statistics from query logs to power query autocomplete suggestions.
A spam filtering team uses n-gram counts of suspicious word sequences as features in its production email scorer.
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License