Data Augmentation

Data augmentation expands a training set by generating new examples or modifying existing ones, for instance by rephrasing sentences or adding noise. This can improve model robustness and performance without collecting large amounts of new data.

More precisely, data augmentation artificially boosts the range and number of training examples by transforming existing examples into additional ones. For example, suppose images are one of your features, but your dataset doesn't contain enough image examples for the model to learn useful associations. Ideally, you'd add enough labeled images to your dataset for the model to train properly. If that's not possible, data augmentation can rotate, stretch, and reflect each image to produce many variants of the original picture, each keeping the original label, which may yield enough labeled data for effective training.
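The rotate, stretch, and reflect transformations described above can be sketched in a few lines. This is a minimal, dependency-free illustration, not how any particular library implements them: the image is assumed to be a small grayscale grid stored as nested lists, and the function names (`reflect`, `rotate90`, `stretch`, `augment`) are hypothetical.

```python
from typing import List, Tuple

# Assumption: a grayscale image is a list of rows of integer pixel values.
Image = List[List[int]]


def reflect(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def stretch(img: Image, factor: int = 2) -> Image:
    """Stretch the image horizontally by repeating each pixel `factor` times."""
    return [[px for px in row for _ in range(factor)] for row in img]


def augment(img: Image, label: str) -> List[Tuple[Image, str]]:
    """Return the original image plus three variants, all sharing one label."""
    variants = [img, reflect(img), rotate90(img), stretch(img)]
    return [(variant, label) for variant in variants]


image = [[1, 2],
         [3, 4]]
augmented = augment(image, "cat")  # four labeled examples from one original
```

Each variant reuses the original label, which is why augmentation adds labeled data without extra annotation work. In practice the transformations are usually applied randomly at training time rather than precomputed.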

Examples

1. Albumentations is a Python library widely used for image data augmentation in PyTorch training pipelines.
2. Gretel.ai and Mostly AI provide synthetic data augmentation for tabular enterprise datasets.
3. NLP teams use libraries like nlpaug to augment training data with paraphrasing and back-translation.

Real-world uses

Created for this library

1. A computer vision team uses data augmentation with random crops, flips, and color jitter to make its detector robust without collecting more images.
2. A speech recognition vendor uses data augmentation with simulated noise and reverberation to make its model robust on smartphone microphones.
3. An NLP team uses back-translation as a data augmentation strategy to expand its training set for low-resource languages.
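As an illustration of the simulated-noise use case above, the following sketch adds Gaussian noise to an audio signal at a chosen signal-to-noise ratio (SNR). It is a simplified assumption of how such augmentation might look, not a description of any vendor's pipeline: the function name `add_noise`, the `snr_db` parameter, and the toy sine-wave input are all hypothetical, and real systems typically mix in recorded noise and room impulse responses rather than pure Gaussian noise.

```python
import math
import random
from typing import List, Optional


def add_noise(signal: List[float], snr_db: float,
              rng: Optional[random.Random] = None) -> List[float]:
    """Return a copy of `signal` with Gaussian noise added at roughly `snr_db` dB.

    Assumes `signal` is non-empty and not all zeros.
    """
    rng = rng or random.Random()
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR in dB is 10 * log10(signal_power / noise_power); solve for noise power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


# Toy input: 0.1 s of a 440 Hz tone sampled at 16 kHz.
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
noisy = add_noise(clean, snr_db=20.0, rng=random.Random(0))
```

Drawing a different `snr_db` for each training example is one common way to expose the model to a range of recording conditions.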

