Glossary term
Training and Fine-Tuning
Reusing the examples of a minority class in a class-imbalanced dataset in order to create a more balanced training set.
For example, consider a binary classification problem in which the ratio of the majority class to the minority class is 5,000:1. If the dataset contains a million examples, then the dataset contains only about 200 examples of the minority class, which might be too few examples for effective training. To overcome this deficiency, you might oversample (reuse) those 200 examples multiple times, possibly yielding sufficient examples for useful training.
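Be careful about overfitting when oversampling: because the model sees the same minority examples repeatedly, it can memorize them instead of learning patterns that generalize.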
Contrast with undersampling.
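As a rough illustration of the idea, the following is a minimal sketch of random oversampling with replacement, using only the Python standard library. The function name `oversample_minority`, its parameters, and the toy dataset are hypothetical and chosen for illustration; they are not part of the glossary definition or of any particular library.

```python
import random


def oversample_minority(examples, minority_label, target_count, seed=0):
    """Return a new list in which minority examples are reused until
    the minority class has at least `target_count` examples.

    `examples` is a list of (features, label) pairs (an assumed layout).
    """
    rng = random.Random(seed)
    minority = [ex for ex in examples if ex[1] == minority_label]
    if not minority:
        raise ValueError("no minority examples to oversample")

    # Number of extra copies needed to reach the target size.
    extra_needed = max(target_count - len(minority), 0)

    # Sample with replacement: the same original example may be reused many times.
    extras = rng.choices(minority, k=extra_needed)

    resampled = list(examples) + extras
    rng.shuffle(resampled)
    return resampled


# Toy dataset: 1,000 majority examples (label 0) and 10 minority examples (label 1).
examples = [([float(i)], 0) for i in range(1000)]
examples += [([float(-i)], 1) for i in range(1, 11)]

balanced = oversample_minority(examples, minority_label=1, target_count=500)
minority_count = sum(1 for _, label in balanced if label == 1)
majority_count = len(balanced) - minority_count
print(majority_count, minority_count)  # prints "1000 500"
```

In practice it is usually safer to split off validation and test data before oversampling, so that copies of the same minority example do not appear on both sides of the split and hide overfitting.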
Created for this library
A fraud team uses oversampling of positive fraud cases during training to handle the strong class imbalance.
A retention team oversamples churned customers in its training data so that the minority class contributes more to the gradient updates.
A medical screening team uses oversampling of rare-disease cases so the model sees enough positive examples during training.
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License