Glossary term
Evaluation and Benchmarks
BoolQ: abbreviation for Boolean Questions, a dataset for evaluating how well an LLM answers yes-or-no questions.
Created for this library
An evaluation team tracks BoolQ accuracy alongside MMLU when assessing a new open-source LLM before recommending it for a customer use case.
A model release group includes BoolQ in its standard benchmark suite so reading-comprehension regressions are caught before launch.
A research lab reports BoolQ in its preprint to position its model against other foundation models on factual yes-no questions.
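The accuracy tracked in the examples above is the fraction of yes-or-no questions the model answers correctly. A minimal sketch of that calculation follows. The helper names `to_bool` and `yes_no_accuracy` and the toy answers are hypothetical, chosen for illustration only; they are not part of any official BoolQ tooling or data.

```python
from typing import Sequence, Union

Answer = Union[bool, str]


def to_bool(answer: Answer) -> bool:
    """Map a yes/no style answer (bool or string) to a boolean."""
    if isinstance(answer, bool):
        return answer
    text = answer.strip().lower()
    if text in {"yes", "true"}:
        return True
    if text in {"no", "false"}:
        return False
    raise ValueError(f"not a yes/no answer: {answer!r}")


def yes_no_accuracy(predictions: Sequence[Answer], labels: Sequence[Answer]) -> float:
    """Return the fraction of predictions that match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(to_bool(p) == to_bool(g) for p, g in zip(predictions, labels))
    return correct / len(labels)


# Toy data (hypothetical, not taken from the real dataset).
model_answers = ["yes", "no", "Yes", "no"]
gold_answers = [True, False, False, False]

accuracy = yes_no_accuracy(model_answers, gold_answers)
print(f"accuracy = {accuracy:.2f}")  # prints "accuracy = 0.75"
```

Normalizing both sides to booleans before comparing keeps the score from depending on whether the model wrote "Yes", "yes", or "true".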
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License