Glossary term
Foundations
A collection of raw data, commonly (but not exclusively) organized in one of the following formats:
a spreadsheet
a file in CSV (comma-separated values) format
Examples created for this library:
A retail analytics team curates a 90-day dataset of transactions to retrain its demand model every quarter.
A medical research team versions its datasets with content hashes so any model can be retrained from the exact same input bytes.
A risk modeling team locks the training dataset at the start of a release so reviewers can audit which records influenced the production scorecard.
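The content-hash versioning mentioned in the examples above can be sketched as follows. This is a minimal illustration, not any team's actual tooling: the function name `dataset_fingerprint` and the chunk size are assumptions made for this sketch, and it hashes a single file's raw bytes with SHA-256 from Python's standard library.

```python
import hashlib
from pathlib import Path


def dataset_fingerprint(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a dataset file's raw bytes.

    Hypothetical helper for illustration: two files get the same
    fingerprint only if their bytes are identical.
    """
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        # Read in fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Recording the fingerprint alongside a trained model lets a reviewer confirm later that a retraining run used the same input bytes. Because the hash covers raw bytes, even a cosmetic change such as reordered rows or different line endings produces a different fingerprint.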
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License