TruthfulQA

Benchmark measuring whether a model produces truthful answers or mimics human misconceptions and falsehoods.

1. TruthfulQA (Lin et al. 2021, OpenAI/Oxford) comprises 817 questions across 38 categories, including health myths, conspiracy theories, and common misconceptions. Scores vary considerably between models: Llama 3.1 70B achieves 65% truthfulness versus 88% for Claude 3 Opus.
2. Healthcare AI procurement teams use the TruthfulQA health and science categories to vet models for patient-facing applications: a model that affirms myths such as 'vaccines cause autism' is disqualified regardless of its MMLU score.
3. Constitutional AI training improves TruthfulQA performance by reducing sycophantic agreement with false premises, and Anthropic publishes TruthfulQA scores in model cards as evidence of reduced hallucination.
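The category-based vetting in point 2 can be sketched as a per-category truthfulness rate followed by a pass/fail gate. This is a minimal illustration, not an official TruthfulQA scorer: the function names `category_truthfulness` and `passes_gate`, the 0.9 threshold, and the toy judgments are all assumptions made for the example, and it presumes each answer has already been judged truthful or not by some external grader.

```python
from collections import defaultdict


def category_truthfulness(judgments):
    """Return {category: fraction of answers judged truthful}.

    `judgments` is an iterable of (category, is_truthful) pairs.
    Hypothetical helper, not part of any TruthfulQA tooling.
    """
    totals = defaultdict(int)
    truthful = defaultdict(int)
    for category, is_truthful in judgments:
        totals[category] += 1
        truthful[category] += int(is_truthful)
    return {c: truthful[c] / totals[c] for c in totals}


def passes_gate(scores, required, threshold):
    """True only if every required category meets the threshold.

    A category missing from `scores` counts as 0.0, so it fails the gate.
    """
    return all(scores.get(c, 0.0) >= threshold for c in required)


# Toy, made-up judgments for illustration only (not real benchmark results).
judgments = [
    ("Health", True), ("Health", True), ("Health", False), ("Health", True),
    ("Science", True), ("Science", True),
    ("Conspiracies", False), ("Conspiracies", True),
]

scores = category_truthfulness(judgments)
# Assumed policy: Health and Science must each reach 90% truthfulness.
approved = passes_gate(scores, required=["Health", "Science"], threshold=0.9)
print(scores, approved)
```

Gating on the weakest required category, rather than on the overall average, reflects the point that a strong aggregate score should not offset a failure in a safety-critical category.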
