Tag

routing

5 posts tagged "routing".


Posts


Engineering

Cost vs Usage: Finding the Quietly Expensive Model

Request count and dollar cost tell different stories. Here is how cost-vs-usage analytics surface the low-volume model that dominates your bill — and the cheap one you can route more traffic to.

Nemo Team

2026-06-15 · 8 min

Guides

What Is an LLM Gateway? A 2026 Primer

An LLM gateway is a single endpoint that routes to every model provider while handling keys, cost, rate limits, and safety. Here is what it does, when you need one, and how to evaluate it.

Nemo Team

2026-06-14 · 8 min

Guides

Routing LLM Traffic by Cost vs Quality

Not every request needs your most expensive model. Here is a decision framework for routing LLM traffic by cost and quality — which tasks to send cheap, which to send premium, and how to prove the split works.

Nemo Team

2026-06-10 · 8 min

Engineering

Deterministic A/B Testing Across Model Variants

Randomly splitting LLM traffic gives you flaky, unrepeatable experiments. Here is how hash-based deterministic A/B testing splits traffic consistently per user, so your model comparison is actually measurable.

Nemo Team

2026-06-10 · 9 min

Engineering

Provider Fallback Chains: Surviving an OpenAI Outage

When a provider 5xxs or rate-limits you, your app shouldn't go down with it. Here is how fallback chains on an LLM gateway reroute to a healthy provider mid-request — without changing your code.

Nemo Team

2026-06-10 · 9 min
