Post-Training

The set of techniques applied after base-model training to shape behavior, improve usefulness, reduce harmful outputs, or align the model with instructions and policies. Post-training changes can materially affect behavior, so they should be documented, evaluated, and included in release and monitoring decisions.

Examples

1.
Modern LLM post-training pipelines combine supervised fine-tuning, RLHF, and methods like Direct Preference Optimization (Rafailov et al., 2023).
2.
Anthropic's Constitutional AI approach uses self-supervised post-training with a documented constitution to align Claude.
3.
Meta's Llama 3 and Llama 3.1 documentation detail post-training stages including SFT, rejection sampling, and DPO.

Related terms

Back to glossary

Examples

Related terms

Loading…

Examples

Related terms