Capability Evaluation

A structured test of what a model or system can do, including both intended benefits and dangerous or unexpected capabilities. It differs from ordinary performance testing because it looks for risk-relevant capability thresholds. Capability evaluations should be performed before broad release and repeated when models, tools, scaffolding, or deployment context changes.
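As a rough illustration of the two ideas above (checking results against risk-relevant thresholds, and repeating the evaluation when the deployment changes), here is a minimal Python sketch. All names, capability areas, scores, and threshold values are hypothetical and chosen only for illustration; they are not taken from any real evaluation framework.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentConfig:
    """Hypothetical snapshot of what was actually evaluated."""
    model: str
    tools: tuple[str, ...]
    scaffolding: str
    context: str


def crossed_thresholds(scores: dict[str, float],
                       thresholds: dict[str, float]) -> list[str]:
    """Return the capability areas whose score meets or exceeds its threshold.

    An area with a threshold but no score raises KeyError, so an
    unevaluated area fails loudly instead of passing silently.
    """
    return sorted(area for area, limit in thresholds.items()
                  if scores[area] >= limit)


def needs_reevaluation(evaluated: DeploymentConfig,
                       current: DeploymentConfig) -> bool:
    """Any change to model, tools, scaffolding, or context triggers a rerun."""
    return evaluated != current


# Illustrative values only (assumed, not real results).
thresholds = {"cyber": 0.5, "self_proliferation": 0.2}
scores = {"cyber": 0.62, "self_proliferation": 0.05}
print(crossed_thresholds(scores, thresholds))  # ['cyber']

evaluated = DeploymentConfig("model-a", ("browser",), "agent-v1", "internal")
current = DeploymentConfig("model-a", ("browser", "shell"), "agent-v1", "internal")
print(needs_reevaluation(evaluated, current))  # True
```

Comparing whole configuration snapshots, not just model versions, reflects the point that adding a tool or changing scaffolding can shift capabilities even when the underlying model is unchanged.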

Examples

1. METR (formerly ARC Evals) specializes in evaluating the autonomous task capabilities of frontier models for major AI developers.
2. The UK AISI and US AISI publish capability evaluation results for frontier models, including pre-deployment testing of Anthropic's and OpenAI's models.
3. DeepMind's Dangerous Capability Evaluations paper (2024) catalogs evaluations across persuasion, deception, cyber capability, and self-proliferation.

Related terms
