Choice of Plausible Alternatives (COPA)

A dataset for evaluating how well an LLM can identify the better of two alternative answers to a premise. Each of the challenges in the dataset consists of three components:

A premise, which is typically a statement followed by a question

Two possible answers to the question posed in the premise, one of which is correct and the other incorrect

The correct answer

For example:

Premise: The man broke his toe. What was the CAUSE of this?

Possible answers:

1. He got a hole in his sock.

2. He dropped a hammer on his foot.

Correct answer: 2

COPA is a component of the SuperGLUE ensemble.
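
The three components described above map naturally onto a small record type, and scoring reduces to plain accuracy over the items. The following is a minimal sketch rather than an official loader or scorer: the names `CopaItem`, `copa_accuracy`, and `always_second` are hypothetical, and the `predict` callable stands in for whatever model is being evaluated.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class CopaItem:
    """One COPA challenge (hypothetical container, not an official schema)."""
    premise: str              # statement followed by a question
    choices: Tuple[str, str]  # the two possible answers
    label: int                # 1-based index of the correct answer


def copa_accuracy(items: Sequence[CopaItem],
                  predict: Callable[[CopaItem], int]) -> float:
    """Return the fraction of items where predict() picks the correct answer."""
    if not items:
        raise ValueError("need at least one item")
    correct = sum(1 for item in items if predict(item) == item.label)
    return correct / len(items)


# The example from the definition above.
example = CopaItem(
    premise="The man broke his toe. What was the CAUSE of this?",
    choices=("He got a hole in his sock.", "He dropped a hammer on his foot."),
    label=2,
)


def always_second(item: CopaItem) -> int:
    """Stand-in for a model (assumption): always picks the second answer."""
    return 2


print(copa_accuracy([example], always_second))
```

Because each item has exactly two answers, a predictor that guesses at random is expected to score about 0.5, which is the natural baseline when reading COPA accuracy.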

Real-world uses

Created for this library

1. An evaluation team includes COPA in its reasoning benchmark suite to test commonsense cause-and-effect reasoning in candidate LLMs.
2. A model release group tracks COPA performance over time to catch regressions in commonsense reasoning across fine-tuning passes.
3. A research lab reports COPA scores in its preprint to compare its model against published commonsense reasoning baselines.
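
The regression-tracking use above could be sketched as a comparison of COPA accuracy between consecutive fine-tuning passes. This is only an illustration under stated assumptions: `find_regressions`, the `tolerance` threshold, the pass names, and the accuracy values are all hypothetical placeholders, not measured results.

```python
from typing import List, Sequence, Tuple


def find_regressions(
    history: Sequence[Tuple[str, float]],
    tolerance: float = 0.01,
) -> List[Tuple[str, str, float]]:
    """Flag consecutive passes where accuracy drops by more than `tolerance`.

    `history` is an ordered sequence of (pass_name, accuracy) pairs.
    Returns (previous_pass, current_pass, drop) for each flagged pair.
    """
    flagged = []
    for (prev_name, prev_acc), (cur_name, cur_acc) in zip(history, history[1:]):
        drop = prev_acc - cur_acc
        if drop > tolerance:
            flagged.append((prev_name, cur_name, drop))
    return flagged


# Made-up accuracies for illustration only.
history = [("pass-1", 0.90), ("pass-2", 0.91), ("pass-3", 0.85)]

for prev_name, cur_name, drop in find_regressions(history):
    print(f"COPA accuracy fell by {drop:.2f} from {prev_name} to {cur_name}")
```

A tolerance is used because small accuracy differences between passes can be noise, especially on a small evaluation set; the right threshold depends on the set size and is left as a judgment call.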

Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License
