Glossary term
Training and Fine-Tuning
The set of techniques applied after base-model training to shape behavior, improve usefulness, reduce harmful outputs, or align the model with instructions and policies. Post-training changes can materially affect behavior, so they should be documented, evaluated, and included in release and monitoring decisions.
Modern LLM post-training pipelines typically combine supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), and preference-optimization methods such as Direct Preference Optimization (DPO; Rafailov et al., 2023).
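Anthropic's Constitutional AI approach post-trains Claude against a written constitution, using model-generated critiques and revisions for supervised fine-tuning, followed by reinforcement learning from AI feedback (RLAIF).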
Meta's Llama 3 and Llama 3.1 documentation describes post-training stages that include SFT, rejection sampling, and DPO.
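
As an illustration of one technique named above, the per-example DPO loss from Rafailov et al. (2023) can be sketched in a few lines. This is a minimal sketch, not any vendor's implementation: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for illustration, and the inputs are assumed to be summed log-probabilities of a full response under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss (hypothetical helper, for illustration only).

    Each argument is the summed log-probability of a complete response:
    "chosen" is the preferred response, "rejected" the dispreferred one.
    """
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin; beta controls how strongly the policy is
    # allowed to drift from the reference.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss equals log 2 when the policy and the reference agree, and it falls as the policy raises the likelihood of the preferred response relative to the rejected one. In practice the loss would be averaged over a batch and computed with an automatic-differentiation framework rather than plain floats.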