Glossary term
Evaluation and Benchmarks
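COPA is an abbreviation for Choice of Plausible Alternatives, a benchmark in which a model reads a premise and picks the more plausible cause or effect from two alternatives.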
Created for this library
An LLM evaluation team uses COPA in its standard benchmark suite to test commonsense cause-and-effect reasoning before promoting a model.
A research lab reports COPA scores to compare commonsense reasoning across fine-tuned versions of a foundation model.
A vendor publishes COPA results for its open-weights LLM in its release notes so enterprise buyers can gauge causal-reasoning performance.
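The uses above all come down to the same measurement: accuracy on two-way choices. The following is a minimal sketch of that scoring loop, not a reference implementation. The CopaItem class, its field names, and the copa_accuracy function are hypothetical names chosen for illustration, and the score callable stands in for whatever model or log-likelihood scorer a team actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class CopaItem:
    # Hypothetical record layout for one COPA-style question (an assumption,
    # not an official schema).
    premise: str
    choice1: str
    choice2: str
    question: str  # "cause" or "effect": which relation the alternatives express
    label: int     # 0 if choice1 is the more plausible alternative, 1 if choice2


# A scorer takes (premise, alternative, question) and returns a plausibility
# score; higher means more plausible. In practice this would wrap a model.
Scorer = Callable[[str, str, str], float]


def copa_accuracy(items: Sequence[CopaItem], score: Scorer) -> float:
    """Return the fraction of items where the higher-scored alternative is correct."""
    if not items:
        raise ValueError("items must not be empty")
    correct = 0
    for item in items:
        s1 = score(item.premise, item.choice1, item.question)
        s2 = score(item.premise, item.choice2, item.question)
        # Ties go to choice1; a real harness may want to handle ties differently.
        predicted = 0 if s1 >= s2 else 1
        correct += int(predicted == item.label)
    return correct / len(items)
```

Because every item has exactly two alternatives, a scorer that guesses at random is expected to land near 0.5, so reported accuracy is usually read against that baseline. To compare fine-tuned versions of a model, one would call copa_accuracy once per version with the same items and a scorer bound to each version.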
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License