Glossary term
Evaluation and Benchmarks
Abbreviation for CommitmentBank.
Created for this library
An LLM evaluation team includes CB in its reasoning benchmark suite to test how well models judge whether the author of a short passage is committed to the truth of an embedded clause.
A research lab reports CB results in its model card so users can compare entailment-style reasoning across model versions.
A model release group treats any regression on CB as a release blocker because judging speaker commitment is central to several downstream agent tasks.
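The scoring and release-gate usage described in the examples above can be sketched as follows. This is a minimal illustration, not any team's actual pipeline. It assumes CB is scored as three-way classification (entailment, contradiction, neutral) with accuracy and macro-averaged F1, as in SuperGLUE. The function names and the 0.01 tolerance are hypothetical choices made for this sketch.

```python
# Hypothetical sketch: score CB predictions and flag a regression.
# Label names follow the three-way entailment setup; the tolerance is an assumption.
LABELS = ("entailment", "contradiction", "neutral")


def accuracy(gold, pred):
    """Fraction of examples where the predicted label equals the gold label."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and the same length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and the same length")
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # A class with no gold or predicted examples contributes 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def blocks_release(baseline_score, candidate_score, tolerance=0.01):
    """Return True when the candidate falls below the baseline by more than tolerance."""
    return candidate_score < baseline_score - tolerance


if __name__ == "__main__":
    gold = ["entailment", "contradiction", "neutral", "entailment"]
    pred = ["entailment", "contradiction", "entailment", "entailment"]
    print("accuracy:", accuracy(gold, pred))
    print("macro F1:", macro_f1(gold, pred))
    print("blocks release:", blocks_release(0.90, 0.85))
```

Macro-averaged F1 is reported alongside accuracy because the CB label distribution is imbalanced, so accuracy alone can hide poor performance on the rarest class.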
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License