Glossary term
Infrastructure and Serving
A middleware layer that routes, monitors, governs, and abstracts calls to model providers.
Kong AI Gateway provides unified routing across OpenAI, Anthropic, and Azure OpenAI endpoints. A fintech uses it to route reasoning tasks to GPT-4o and high-volume classification to Claude 3 Haiku, with automatic fallback.
Portkey.ai is used by Indian AI startups to add retry logic, semantic caching, cost tracking, and guardrail integration across all LLM provider calls, with caching reducing total AI infrastructure spend by 40%.
LiteLLM (open source) is deployed by enterprises as an internal LLM gateway. It provides a unified OpenAI-compatible API that routes to any of 100+ providers, which decouples application code from model-provider changes.
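The pattern shared by these examples (task-based routing, ordered fallback, and response caching behind one call interface) can be sketched in a few lines. This is a minimal illustration and does not use the API of Kong, Portkey, or LiteLLM. The `Gateway` class, `ProviderError`, and the three stub providers are hypothetical names introduced here, the providers are plain functions standing in for real model endpoints, and the cache is an exact-match dictionary, not the semantic caching that products such as Portkey offer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A provider is modelled as a plain callable: prompt in, completion text out.
Provider = Callable[[str], str]


class ProviderError(Exception):
    """Raised by a provider stub to simulate an outage or rate limit."""


@dataclass
class Gateway:
    # Maps a task type to an ordered list of provider names (primary first).
    routes: Dict[str, List[str]]
    # Maps a provider name to the callable that serves it.
    providers: Dict[str, Provider]
    cache: Dict[Tuple[str, str], str] = field(default_factory=dict)
    call_counts: Dict[str, int] = field(default_factory=dict)

    def complete(self, task: str, prompt: str) -> str:
        key = (task, prompt)
        if key in self.cache:  # exact-match cache hit: no provider is called
            return self.cache[key]
        errors = []
        for name in self.routes[task]:
            # Count every attempt so usage per provider can be tracked.
            self.call_counts[name] = self.call_counts.get(name, 0) + 1
            try:
                result = self.providers[name](prompt)
            except ProviderError as exc:
                errors.append(f"{name}: {exc}")
                continue  # fall back to the next provider in the route
            self.cache[key] = result
            return result
        raise RuntimeError("all providers failed: " + "; ".join(errors))


# Hypothetical stub providers standing in for real model endpoints.
def flaky_reasoner(prompt: str) -> str:
    raise ProviderError("rate limited")


def backup_reasoner(prompt: str) -> str:
    return f"[backup] {prompt}"


def fast_classifier(prompt: str) -> str:
    return "positive" if "good" in prompt else "negative"


gateway = Gateway(
    routes={"reasoning": ["primary", "backup"], "classification": ["fast"]},
    providers={
        "primary": flaky_reasoner,
        "backup": backup_reasoner,
        "fast": fast_classifier,
    },
)
```

Application code calls only `gateway.complete(task, prompt)`, so changing which model serves a task means editing the `routes` table and leaving the callers untouched. A production gateway would add timeouts, retries with backoff, authentication, and cost accounting on top of this skeleton.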