Every number a customer sees on NemoRouter eventually traces back to one value: what a request actually cost. It's the most important number in the product and the easiest one to quietly get wrong — in your favor. So we made a principle of it: we read the provider's authoritative cost, we never compute our own. This sounds like an implementation detail. It's actually the foundation everything else stands on, and it's worth saying out loud.

The temptation to invent the number

A gateway could compute cost itself: count tokens, multiply by a price table, done. It's easy, and it's quietly corruptible in three ways:

Stale prices drift from reality — and a stale-low price under-charges, a stale-high over-charges. Either way the number is fiction.
Rounding "conveniences" that always round one direction add up to a margin nobody agreed to.
Hidden markup baked into a computed rate is invisible by construction — you can't see a cut you can't decompose.

The thing is, a computed cost is plausible. It looks right. That's exactly what makes inventing the number dangerous: the customer can't tell a computed estimate from a real charge, so the incentive to fudge is enormous and the detection is near zero.

The principle: read, don't compute

request → provider serves it → provider reports the real cost
                                          │
                          we read that authoritative cost  ← the only number we trust
                                          │
                          settle credits to THAT, exactly

The provider reports the settled cost of the call it actually served. We read that value and bill it, full stop. We don't have a price table to go stale, because we don't price the call — the provider does, and we pass it through exactly, even on streaming responses.

The honest number is also the simpler one

Reading the provider's cost isn't just more honest than computing it — it's less code. No price table to maintain, no per-model rate map to update when a provider changes pricing, no rounding policy to argue about. The honest design and the simple design are the same design, which is the happiest kind of principle.

Why honesty here makes everything else trustworthy

Cost honesty isn't one feature — it's the precondition for the rest of them:

Budgets cap real spend, so a cap means what it says.
Cost attribution sums to the actual bill, so per-customer economics reconcile.
Markup-free credits are only credible if the per-call cost has no hidden cut — which it can't, because we don't set it.
Multimodal floors exist precisely so a missing price reads as a conservative charge, not a dishonest $0.

Pull the honest-cost foundation out and every one of those becomes a number you'd have to take on faith. Keep it, and they're all just arithmetic on a real value.

The same principle, everywhere

Cost honesty is one instance of a broader stance you'll find across the product: no mocks or demo mode (don't fake the system), real tenants and real isolation (don't fake separation), the ledger as source of truth (don't fake the balance). The through-line is a refusal to build a convenient fiction next to the truth. Cost is just where it matters most, because cost is where the incentive to fudge is strongest.

The takeaway

The most important number in an LLM gateway is the cost, and the most important decision is whether to read it or invent it. We read it — the provider's authoritative value, passed through exactly, never a computed estimate we could quietly tilt. That choice is more honest and simpler, and it's what makes budgets, attribution, and markup-free credits trustworthy instead of faith-based. When the number is real, everything built on it can be too.

Cost Honesty: We Read the Number, We Don't Invent It

The temptation to invent the number

The principle: read, don't compute

Why honesty here makes everything else trustworthy

The same principle, everywhere

The takeaway

More from Company

The Business Case for an LLM Gateway

Why We're Managed, Not a Self-Hosted Gateway

Platform Fee, Not Feature Gating