Glossary term
Evaluation and Benchmarks
ReCoRD: abbreviation for Reading Comprehension with Commonsense Reasoning Dataset.
Created for this library
An LLM evaluation team includes ReCoRD in its standard reasoning benchmark suite to test commonsense reading comprehension.
A research lab reports ReCoRD scores in model cards so downstream users can compare commonsense reasoning across model versions.
A model release team uses ReCoRD as a reading-comprehension benchmark that gates promotion to production.
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License