Glossary term
Foundations
Data drawn from a distribution that doesn't change, and where each value drawn doesn't depend on values that have been drawn previously. I.i.d. data is the ideal gas of machine learning: a useful mathematical construct that is almost never found exactly in the real world. For example, the distribution of visitors to a web page may be i.i.d. over a brief window of time; that is, the distribution doesn't change during that brief window and one person's visit is generally independent of another's visit. However, if you expand that window of time, seasonal differences in the web page's visitors may appear.
See also nonstationarity.
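The contrast between a fixed distribution and one that shifts over time can be illustrated with a small simulation. This is only a sketch under stated assumptions: the normal distribution, the parameter values, and the helper names `iid_sample`, `drifting_sample`, and `half_mean_gap` are all hypothetical choices made for illustration, and a linear drift stands in for the seasonal change described above.

```python
import random
import statistics


def iid_sample(n, rng, mean=100.0, sd=10.0):
    """Draw n values independently from one fixed normal distribution."""
    return [rng.gauss(mean, sd) for _ in range(n)]


def drifting_sample(n, rng, mean=100.0, sd=10.0, drift=50.0):
    """Draw n values whose mean rises linearly by `drift` across the sample.

    Each draw is still independent, but the distribution changes over
    time, so the values are not identically distributed.
    """
    return [rng.gauss(mean + drift * i / (n - 1), sd) for i in range(n)]


def half_mean_gap(values):
    """Absolute difference between the means of the two halves."""
    mid = len(values) // 2
    return abs(statistics.mean(values[mid:]) - statistics.mean(values[:mid]))


rng = random.Random(0)  # fixed seed so the run is repeatable
iid_gap = half_mean_gap(iid_sample(2000, rng))
drift_gap = half_mean_gap(drifting_sample(2000, rng))

print(f"i.i.d. sample, gap between half means:   {iid_gap:.2f}")
print(f"drifting sample, gap between half means: {drift_gap:.2f}")
```

Comparing the two halves is a crude check, not a formal test: a small gap is consistent with a stable distribution but does not prove independence, while a large gap suggests the distribution moved during the window.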
Created for this library
A risk modeling team flags that its data is not independently and identically distributed and uses temporal cross-validation accordingly.
A research team checks whether its training and test sets are independently and identically distributed before reporting confidence intervals.
An A/B testing platform teaches analysts that the simplest variance estimators assume independently and identically distributed observations.
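The temporal cross-validation mentioned in the first example can be sketched as an expanding-window split, in which every test fold comes strictly after the data used for training. The function name `expanding_window_splits` and the equal-sized fold scheme are assumptions for illustration, not a description of any particular team's procedure; it also assumes the samples are already sorted by time.

```python
def expanding_window_splits(n_samples, n_folds):
    """Yield (train_indices, test_indices) pairs for time-ordered data.

    The training window grows with each fold, and every test index is
    later than every training index, so no future data leaks backward.
    """
    if n_folds < 1 or n_samples < n_folds + 1:
        raise ValueError("need at least n_folds + 1 samples")
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        # The last fold absorbs any leftover samples at the end.
        test_end = n_samples if k == n_folds else train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))


splits = list(expanding_window_splits(10, 3))
for train, test in splits:
    print("train:", train, "test:", test)
```

A shuffled k-fold split would mix later observations into the training set, which is only harmless when the data really is i.i.d.; ordering the folds by time avoids relying on that assumption.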
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License