CommitmentBank (CB)

A dataset for evaluating an LLM's proficiency in determining whether the author of a passage believes a target clause within that passage. Each entry in the dataset contains:

A passage

A target clause within that passage

A Boolean value indicating whether the passage's author believes the target clause

For example:

Passage: What fun to hear Artemis laugh. She's such a serious child. I didn't know she had a sense of humor.

Target clause: she had a sense of humor

Boolean: True, which means the author believes the target clause
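
The three-field entry described above can be sketched as a small Python record, together with a simple accuracy score over a model's Boolean predictions. This is only an illustrative sketch: the names CBEntry, passage, target_clause, author_believes, and accuracy are hypothetical and are not taken from any official CommitmentBank or SuperGLUE loader.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CBEntry:
    """One entry, following the three fields listed in this glossary entry.

    Field names are hypothetical, chosen for illustration only.
    """
    passage: str
    target_clause: str
    author_believes: bool


def accuracy(entries: Sequence[CBEntry], predictions: Sequence[bool]) -> float:
    """Return the fraction of predictions that match the gold Boolean."""
    if len(entries) != len(predictions):
        raise ValueError("entries and predictions must have the same length")
    if not entries:
        raise ValueError("cannot score an empty set of entries")
    correct = sum(e.author_believes == p for e, p in zip(entries, predictions))
    return correct / len(entries)


# The example from this glossary entry, expressed as a record.
example = CBEntry(
    passage=(
        "What fun to hear Artemis laugh. She's such a serious child. "
        "I didn't know she had a sense of humor."
    ),
    target_clause="she had a sense of humor",
    author_believes=True,
)
```

Scoring a model then amounts to collecting one Boolean prediction per entry and passing both sequences to accuracy.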

CommitmentBank is a component of the SuperGLUE ensemble.

Real-world uses

Created for this library

1. An LLM evaluation team includes CommitmentBank in its standard benchmark suite to test how well models identify embedded commitments.
2. A research lab reports CommitmentBank scores in its model card so downstream users can compare entailment-style reasoning across model versions.
3. A model release team uses CommitmentBank as one of several reading-comprehension benchmarks to gate model promotions.

Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License
