Glossary term
Glossary term
Agentic Systems
System that dynamically routes queries to different models based on task complexity, cost, latency requirements, or topic.
LLM Router (2024, Imbue/Berkeley) trains a small classifier to route simple queries to cheap small models (Haiku) and complex queries to frontier models (GPT-4o), reducing API costs by 40% with less than 1% quality degradation.
Martian (Martin AI) builds an LLM router that dynamically selects from 10+ models per query based on task type, budget, and latency constraints - used by enterprise API consumers spending $100k+/month on LLM inference.
AWS Bedrock's intelligent prompt routing (2024) uses a routing model to direct prompts between Claude 3 Haiku and Claude 3 Sonnet, cutting costs by 30% for customer service deployments while maintaining response quality thresholds.