Glossary term
Foundations
Storing only the position(s) of nonzero elements in a sparse feature.
For example, suppose a categorical feature named species identifies the 36 tree species in a particular forest. Further assume that each example identifies only a single species.
You could use a one-hot vector to represent the tree species in each example. A one-hot vector would contain a single 1 (to represent the particular tree species in that example) and 35 0s (to represent the 35 tree species not in that example). So, the one-hot representation of maple might look something like the following:
[Figure: a 36-element vector holding a single 1 at maple's position and 0 in the other 35 positions.]
Alternatively, a sparse representation would identify only the position of the particular species. If maple is at position 24, then the sparse representation of maple would simply be:
24
Notice that the sparse representation is much more compact than the one-hot representation.
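The size difference can be illustrated with a minimal Python sketch. It assumes that position 24 is a 0-based index, which the definition does not state, and the names NUM_SPECIES, MAPLE_POSITION, and to_sparse are hypothetical, chosen only for this illustration.

```python
NUM_SPECIES = 36     # from the example: 36 tree species in the forest
MAPLE_POSITION = 24  # from the example; assumed here to be a 0-based index


def to_sparse(dense):
    """Return the positions of the nonzero elements of a dense vector."""
    return [i for i, value in enumerate(dense) if value != 0]


# Build the one-hot vector for maple: a single 1 and 35 zeros.
one_hot_maple = [0] * NUM_SPECIES
one_hot_maple[MAPLE_POSITION] = 1

# The sparse representation keeps only the position of the nonzero element.
sparse_maple = to_sparse(one_hot_maple)

print(len(one_hot_maple))  # 36 stored values
print(sparse_maple)        # [24], a single stored value
```

The one-hot vector stores 36 values to convey what the sparse representation conveys with one.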
Note: You shouldn't pass a sparse representation as a direct feature input to a model. Instead, you should convert the sparse representation into a one-hot representation before training on it.
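The conversion described in the note might look like the following sketch in plain Python. The function name sparse_to_one_hot is hypothetical, and positions are again assumed to be 0-based. ML frameworks typically provide their own utilities for this step, so treat this as an illustration rather than a recommended implementation.

```python
def sparse_to_one_hot(position, num_categories):
    """Expand a single category position into a one-hot list.

    Assumes positions are 0-based, so valid values are
    0 through num_categories - 1.
    """
    if not 0 <= position < num_categories:
        raise ValueError(
            f"position {position} is outside 0..{num_categories - 1}"
        )
    vector = [0] * num_categories
    vector[position] = 1
    return vector


# Maple at position 24 among the 36 tree species from the example.
maple = sparse_to_one_hot(24, 36)
print(maple[24])   # 1
print(sum(maple))  # 1, because every other element is 0
```

Rejecting out-of-range positions up front is a design choice in this sketch: it surfaces a mismatch between the data and the assumed number of categories instead of silently producing a vector of the wrong shape.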
See Working with categorical data in Machine Learning Crash Course for more information.
Created for this library
An ML team uses sparse representations like TF-IDF as a baseline before evaluating dense embedding alternatives.
A search team uses sparse representations from BM25 alongside dense embeddings in a hybrid retrieval system.
A research team uses sparse representations to keep memory bounded when working with very large vocabularies.
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License