Glossary term
Training and Fine-Tuning
Synthetic data generation is the creation of artificial data, such as text, images, or tabular records, to train or test AI models. It is especially useful when real data is scarce, too sensitive to share, or needs to be rebalanced for fairness.
Gretel.ai and Mostly AI are commercial synthetic data platforms for tabular and time-series data.
NVIDIA Omniverse Replicator generates synthetic image data for training computer-vision models.
Microsoft Phi-3 and Anthropic Claude were trained partly on high-quality synthetic data.
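To make the definition concrete, here is a minimal sketch of one very simple way to generate synthetic tabular records: sampling each field independently from the values seen in a small set of real records. It is an illustration only, not how any of the platforms above work. The function name `synthesize_records` and the toy fraud data are hypothetical, and this naive approach ignores correlations between fields and offers no privacy guarantee.

```python
import random


def synthesize_records(real_records, n, seed=0):
    """Return n artificial records built by sampling every field
    independently from the values observed in real_records.

    Naive on purpose: cross-field correlations are not preserved.
    """
    rng = random.Random(seed)  # seeded for reproducible output
    fields = list(real_records[0].keys())
    # Collect the observed values for each field (column).
    columns = {f: [r[f] for r in real_records] for f in fields}
    return [{f: rng.choice(columns[f]) for f in fields} for _ in range(n)]


# Hypothetical toy data: the "fraud" class is under-represented,
# so we generate extra artificial examples to rebalance it.
real_fraud = [
    {"amount": 950, "country": "DE", "label": "fraud"},
    {"amount": 1200, "country": "FR", "label": "fraud"},
    {"amount": 700, "country": "DE", "label": "fraud"},
]
synthetic_fraud = synthesize_records(real_fraud, n=5)
```

Real generators typically replace the independent sampling step with a learned model (for example a generative network or a language model) so that relationships between fields are retained.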