Glossary term
Training and Fine-Tuning
A variant of self-supervised learning that is particularly useful when all of the following conditions are true:
The ratio of unlabeled examples to labeled examples in the dataset is high.
This is a classification problem.
Self-training works by iterating over the following two steps until the model stops improving:
Step 1: Use supervised machine learning to train a model on the labeled examples.
Step 2: Use the model created in Step 1 to generate predictions (labels) on the unlabeled examples, then move the high-confidence predictions into the labeled examples, each with its predicted label.
Notice that each iteration of Step 2 adds more labeled examples for Step 1 to train on.
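The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. It makes several assumptions: the "model" is a toy one-dimensional nearest-centroid classifier with exactly two classes, the confidence score is a simple distance ratio, and the names train_centroids, predict_with_confidence, self_train, threshold, and max_rounds are all hypothetical. As a stand-in for "until the model stops improving", the loop stops when no unlabeled example reaches the confidence threshold (or after max_rounds).

```python
from statistics import mean


def train_centroids(labeled):
    """Step 1: fit a toy supervised model (one centroid per class)."""
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_class.items()}


def predict_with_confidence(centroids, x):
    """Return (label, confidence); assumes exactly two classes.

    Confidence is an illustrative distance ratio in [0.5, 1.0]:
    the farther x is from the losing centroid, the higher the score.
    """
    dists = sorted((abs(x - c), y) for y, c in centroids.items())
    (d_best, y_best), (d_next, _) = dists[0], dists[1]
    total = d_best + d_next
    confidence = 0.5 if total == 0 else d_next / total
    return y_best, confidence


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=10):
    """Alternate Step 1 and Step 2 until no confident predictions remain."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        model = train_centroids(labeled)  # Step 1
        confident, remaining = [], []
        for x in unlabeled:  # Step 2
            y, conf = predict_with_confidence(model, x)
            if conf >= threshold:
                confident.append((x, y))  # keep the predicted label
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new to learn from; stop iterating
        labeled.extend(confident)
        unlabeled = remaining
    return train_centroids(labeled), labeled, unlabeled


# Toy data: two labeled examples, four unlabeled ones.
model, labeled, unlabeled = self_train(
    [(0.0, "a"), (10.0, "b")], [2.0, 8.0, 3.2, 5.1], threshold=0.7
)
```

In this toy run, the example 3.2 is not confident enough in the first round but crosses the threshold in the second, after 2.0 and 8.0 have been added to the labeled set. That is the effect described above: each pass through Step 2 gives Step 1 more labeled examples to train on. The ambiguous example 5.1 never clears the threshold and stays unlabeled.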
Created for this library
A document classification team uses self-training to label unlabeled tickets with the current model and add the high-confidence predictions back into the training set.
A medical NLP team uses self-training to grow a labeled set by adding the model's confident predictions on unlabeled clinical notes.
A research team uses self-training to expand training data when human labels are scarce but unlabeled data is abundant.
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License