Routing and failover for peak traffic
One OpenAI-compatible endpoint routes requests across the whole catalog and retries on a backup model when a provider returns errors or times out, so a provider incident during a flash sale does not become a customer-facing outage.
- Fallback chains retry automatically on error or timeout
- Routing strategies: usage, latency, cost, least-busy, shuffle
- Weighted load balancing across deployments per model
- Every routing decision is captured for observability
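
To make the behavior above concrete, here is a minimal sketch of a fallback chain combined with weighted deployment selection. It is an illustration under stated assumptions, not the product's implementation: `ProviderError`, `pick_weighted`, `complete_with_fallback`, and the model and deployment names are all hypothetical, and the `call` argument stands in for whatever client actually sends the request.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical types for this sketch: a deployment is (name, weight),
# and a chain entry is (model name, deployments serving that model).
Deployment = Tuple[str, float]
ChainEntry = Tuple[str, Sequence[Deployment]]
Decision = Tuple[str, str, str]


class ProviderError(Exception):
    """Hypothetical error raised when a deployment returns a failure."""


def pick_weighted(deployments: Sequence[Deployment], rng: random.Random) -> str:
    """Pick one deployment name with probability proportional to its weight."""
    names = [name for name, _ in deployments]
    weights = [weight for _, weight in deployments]
    return rng.choices(names, weights=weights, k=1)[0]


def complete_with_fallback(
    chain: Sequence[ChainEntry],
    call: Callable[[str, str], str],
    rng: Optional[random.Random] = None,
    log: Optional[List[Decision]] = None,
) -> str:
    """Try each model in order; move to the next one on error or timeout.

    Every attempt is appended to `log` as (model, deployment, outcome),
    so the routing decisions can be inspected afterwards.
    """
    if rng is None:
        rng = random.Random()
    if log is None:
        log = []
    last_error: Optional[Exception] = None
    for model, deployments in chain:
        deployment = pick_weighted(deployments, rng)
        try:
            result = call(model, deployment)
        except (ProviderError, TimeoutError) as exc:
            log.append((model, deployment, f"failed: {exc}"))
            last_error = exc
            continue
        log.append((model, deployment, "ok"))
        return result
    raise RuntimeError("all models in the fallback chain failed") from last_error


if __name__ == "__main__":
    # Simulated provider incident: the primary model always fails.
    def fake_call(model: str, deployment: str) -> str:
        if model == "primary-model":
            raise ProviderError("503 from provider")
        return f"served by {model} on {deployment}"

    decisions: List[Decision] = []
    chain = [
        ("primary-model", [("us-east", 3), ("eu-west", 1)]),
        ("backup-model", [("default", 1)]),
    ]
    print(complete_with_fallback(chain, fake_call, log=decisions))
    for entry in decisions:
        print(entry)
```

In this sketch the weights only influence which deployment of a given model is tried; the order of the chain decides which model is tried next. A real router would likely also apply per-request timeouts and one of the listed strategies (usage, latency, cost, least-busy, shuffle) in place of the simple weighted pick.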