Glossary term
Evaluation and Benchmarks
A forgiving form of ROUGE-N that enables skip-gram matching. That is, ROUGE-N counts only N-grams that match exactly, whereas ROUGE-S also counts N-grams whose words are separated by one or more other words. For example, consider the following:
reference text: White clouds
generated text: White billowing clouds
When calculating ROUGE-N, the 2-gram "White clouds" doesn't match any 2-gram in "White billowing clouds". However, when calculating ROUGE-S, "White clouds" does match "White billowing clouds", because skipping "billowing" leaves the same in-order word pair.
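The example above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes lowercase whitespace tokenization, treats a skip-bigram as any in-order pair of words with no limit on the gap between them, and uses hypothetical helper names (bigrams, skip_bigrams, overlap_scores) that do not come from any particular library.

```python
from collections import Counter
from itertools import combinations


def bigrams(tokens):
    """Contiguous 2-grams, the units compared by ROUGE-N with N = 2."""
    return Counter(zip(tokens, tokens[1:]))


def skip_bigrams(tokens):
    """All in-order word pairs, with any number of words skipped between them.

    Assumption: no maximum skip distance is applied.
    """
    return Counter(combinations(tokens, 2))


def overlap_scores(ref_grams, gen_grams):
    """Return (recall, precision) based on clipped overlap counts."""
    matches = sum((ref_grams & gen_grams).values())
    ref_total = sum(ref_grams.values())
    gen_total = sum(gen_grams.values())
    recall = matches / ref_total if ref_total else 0.0
    precision = matches / gen_total if gen_total else 0.0
    return recall, precision


# Assumption: simple lowercase whitespace tokenization.
reference = "White clouds".lower().split()
generated = "White billowing clouds".lower().split()

rouge_2 = overlap_scores(bigrams(reference), bigrams(generated))
rouge_s = overlap_scores(skip_bigrams(reference), skip_bigrams(generated))

print("ROUGE-2 (recall, precision):", rouge_2)
print("ROUGE-S (recall, precision):", rouge_s)
```

Under these assumptions, the strict bigram comparison finds no overlap, while the skip-bigram comparison recovers the one reference pair (white, clouds), giving a recall of 1.0 and a precision of 1/3, since the generated text produces three skip-bigrams in total.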
Created for this library
A summarization team reports ROUGE-S on long-form generation to capture skip-bigram similarity with reference summaries.
A research team uses ROUGE-S in its evaluation suite to capture similarities that strict bigram metrics miss.
A news platform uses ROUGE-S alongside ROUGE-1 and ROUGE-2 to track different aspects of summary quality.
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License