Glossary term
Infrastructure and Serving
Decoding strategy used at inference time that keeps the k highest-scoring partial sequences (beams) at each step, extends each by one token, and returns the highest-probability complete sequence found.
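The definition above can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's implementation: TOY_MODEL, its probabilities, and the fixed num_steps cutoff are all invented for the example, and a real decoder would query a neural model and stop on an end-of-sequence token instead.

```python
import math

# Hypothetical toy "model": next-token probabilities given a prefix.
# The numbers are invented for illustration only.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.5, "y": 0.5},
    ("b",): {"x": 0.9, "y": 0.1},
}


def beam_search(model, beam_width, num_steps):
    """Return (tokens, log_prob) of the best sequence found with the given beam width."""
    beams = [((), 0.0)]  # each beam is (prefix, cumulative log-probability)
    for _ in range(num_steps):
        candidates = []
        for prefix, score in beams:
            # Extend every surviving prefix by every possible next token.
            for token, prob in model[prefix].items():
                candidates.append((prefix + (token,), score + math.log(prob)))
        # Keep only the beam_width highest-scoring partial sequences.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


# beam_width=1 is equivalent to greedy decoding.
greedy_tokens, greedy_logp = beam_search(TOY_MODEL, beam_width=1, num_steps=2)
beam_tokens, beam_logp = beam_search(TOY_MODEL, beam_width=2, num_steps=2)
```

Under these made-up probabilities, greedy decoding commits to "a" and ends with a sequence probability of 0.30, while a beam of 2 keeps "b" alive and finds ("b", "x") at 0.36. Beam search is still a heuristic: a wider beam explores more candidates but does not guarantee the most probable sequence overall.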
Google Translate uses beam search (beam width = 4) for machine translation: it maintains 4 candidate translations and selects the highest-probability one among them, which improves BLEU scores over greedy decoding.
BART summarisation uses beam search with a length penalty to produce abstractive summaries. Because every additional token lowers a sequence's cumulative probability, the penalty keeps the search from always favouring the shortest candidate summary.
MarianMT (Hugging Face) uses beam search for low-resource translation tasks where sampling-based methods produce incoherent outputs; beam search maintains grammatical consistency at the cost of output diversity.
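The length penalty mentioned in the BART example can be illustrated with one common form of length normalization: dividing the cumulative log-probability by length raised to a power alpha. The sketch below assumes that form; the function name, the alpha parameter, and the two candidate scores are hypothetical, and specific systems may use a different formula.

```python
def length_normalized_score(log_prob, length, alpha):
    """Divide a cumulative log-probability by length**alpha (alpha=0 means no normalization)."""
    return log_prob / (length ** alpha)


# Hypothetical finished hypotheses: (cumulative log-probability, length in tokens).
candidates = {"short": (-4.0, 4), "long": (-6.0, 10)}


def best_candidate(alpha):
    """Name of the candidate with the highest length-normalized score."""
    return max(
        candidates,
        key=lambda name: length_normalized_score(*candidates[name], alpha),
    )


best_raw = best_candidate(alpha=0.0)         # raw log-probability ranking
best_normalized = best_candidate(alpha=1.0)  # per-token average log-probability
```

With alpha = 0 the short candidate wins on raw log-probability (-4.0 against -6.0). With alpha = 1 the scores become per-token averages (-1.0 against -0.6), so the longer candidate ranks first.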