Glossary term
Foundations
Parameters are the internal numeric values (weights and biases) that a language model learns during training. They control how the model interprets language, forms associations, and generates responses. In simple terms, a model with more parameters can generally capture more complexity, but it also requires more memory and computation to train and run.
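To make "number of parameters" concrete, here is a minimal sketch that counts the learned values in a small fully connected network. The function name and the toy layer sizes are illustrative assumptions, not taken from any real model; production language models use transformer layers with a more involved count, but the principle (sum up every learned weight and bias) is the same.

```python
def count_dense_parameters(layer_sizes):
    """Count the weights and biases in a fully connected network.

    layer_sizes lists the width of each layer, input first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per input/output pair
        total += n_out         # one bias per output unit
    return total


# Hypothetical toy network: 4 inputs, 8 hidden units, 2 outputs.
print(count_dense_parameters([4, 8, 2]))  # (4*8 + 8) + (8*2 + 2) = 58
```

Even this toy network has 58 learned values; widening or deepening the layers grows the count quickly, which is how full-scale models reach billions.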
GPT-3 has 175 billion parameters; Meta's Llama 4 Behemoth is reported at nearly 2 trillion total parameters.
Mistral 7B (about 7 billion parameters) and Phi-3 Mini (3.8 billion parameters) showed that smaller models can compete with much larger ones on benchmarks.
DeepSeek V3 uses a mixture-of-experts architecture with 671 billion total and 37 billion activated parameters per token.
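The figures above can be turned into rough back-of-the-envelope numbers. The sketch below estimates the memory needed just to store a model's weights and the share of parameters a mixture-of-experts model activates per token. It assumes 2 bytes per parameter (16-bit precision) and ignores activations, caches, and any training state, so treat the results as lower bounds rather than real hardware requirements. The function names are illustrative.

```python
def weight_memory_gb(num_parameters, bytes_per_parameter=2):
    """Approximate storage for the weights alone, in gigabytes (1 GB = 1e9 bytes).

    The default of 2 bytes per parameter is an assumption (16-bit weights).
    """
    return num_parameters * bytes_per_parameter / 1e9


def active_fraction(active_parameters, total_parameters):
    """Share of a mixture-of-experts model's parameters used for each token."""
    return active_parameters / total_parameters


# GPT-3: 175 billion parameters.
print(weight_memory_gb(175_000_000_000))  # 350.0

# DeepSeek V3: 37 billion activated out of 671 billion total.
print(round(active_fraction(37_000_000_000, 671_000_000_000), 3))  # 0.055
```

Under these assumptions, a mixture-of-experts model still has to store all of its parameters, but each token only pays the compute cost of the activated subset, about 5.5% of the total for DeepSeek V3.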