
Multimodal Cost Safety: Image, Video, and Audio Floors

Image, video, and audio models don't price like text — and a $0 cost reading is a silent revenue leak. Here is how reserve floors and zero-cost gating keep multimodal spend safe.

Nemo Team · 8 min read

Text models taught everyone to think about LLM cost in tokens. Image, video, and audio models break that intuition: they price per image, per second of video, per minute of audio — and crucially, a brand-new multimodal model is often missing from the pricing map entirely, which makes its cost read as $0. On a gateway, $0 isn't "free," it's a silent revenue leak: you serve an expensive generation and charge nothing. Here's how multimodal cost safety has to work.

Why multimodal pricing is different

| Modality | Priced by | Text-style token math |
| --- | --- | --- |
| Text | input/output tokens | works |
| Image | per image (size/quality) | breaks |
| Video | per second generated | breaks |
| Audio | per minute / per character | breaks |

A token-count cost model returns nonsense, or zero, for these. The highest-risk case is the newest, most expensive model: a just-shipped flagship video model that isn't in the pinned price map yet. Its cost header comes back 0, the reserve-and-settle flow settles at $0, and you've given away the single most expensive call your gateway can make.
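To make the failure concrete, here is a minimal sketch of a unit-based price lookup that falls through to zero for an unknown model. The price map, model names, and prices are all hypothetical, chosen only for illustration; this is not the gateway's actual pricing code.

```python
from decimal import Decimal

# Hypothetical pinned price map: model -> (billing unit, USD per unit).
# Names and prices are invented for this example.
PRICE_MAP = {
    "example-image-v1": ("image", Decimal("0.04")),
    "example-video-v1": ("second", Decimal("0.50")),
    "example-audio-v1": ("minute", Decimal("0.006")),
}


def naive_cost(model: str, units: Decimal) -> Decimal:
    """Unit-based cost lookup that silently returns 0 for unknown models."""
    entry = PRICE_MAP.get(model)
    if entry is None:
        # The leak: a missing price is reported as "free".
        return Decimal("0")
    _unit, price_per_unit = entry
    return price_per_unit * units


# Eight seconds of video on a priced model vs. a model missing from the map.
known = naive_cost("example-video-v1", Decimal("8"))
unknown = naive_cost("example-video-v2", Decimal("8"))
```

Here `known` is a real charge, while `unknown` is zero even though the generation was just as expensive to produce. Nothing in the lookup distinguishes "free" from "not priced yet".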

The hard rule: never serve a model at $0

The invariant is absolute — no enabled model is ever served for free. Two mechanisms enforce it:

  1. Reserve floors per modality. Before an image/video/audio call, the gateway reserves a floor cost appropriate to the modality, not a token estimate. So even if the settled cost would be zero, a real amount was held.
  2. Floor-on-zero-cost at settle. If the authoritative cost comes back 0 (model missing from the price map), the gateway settles the floor instead of zero, so the call is never free.

multimodal request
  ├─ reserve modality FLOOR (not token math)
  ├─ forward → settle real cost
  └─ if settled cost == 0  → charge the floor, never $0
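
The flow above can be sketched in a few lines. This is an illustration under stated assumptions rather than the gateway's implementation: the floor amounts and the `reserve` and `settle` function names are hypothetical.

```python
from decimal import Decimal

# Hypothetical per-modality reserve floors in USD (illustrative values only).
MODALITY_FLOORS = {
    "image": Decimal("0.05"),
    "video": Decimal("1.00"),
    "audio": Decimal("0.02"),
}


def reserve(modality: str) -> Decimal:
    """Amount to hold before forwarding: the modality floor, not a token estimate."""
    return MODALITY_FLOORS[modality]


def settle(modality: str, authoritative_cost: Decimal) -> Decimal:
    """Amount to charge after the call; a zero cost settles at the floor."""
    if authoritative_cost <= 0:
        # Zero almost always means "not priced yet", so charge the floor.
        return MODALITY_FLOORS[modality]
    return authoritative_cost


held = reserve("video")                        # held before the call
charged_unpriced = settle("video", Decimal("0"))      # model missing from the map
charged_priced = settle("video", Decimal("3.20"))     # exact price available
```

In this sketch the floor only replaces a zero reading. A real, non-zero cost is charged as reported, even when it is below the floor, which matches the rule as stated: the floor is a backstop against $0, not a minimum price.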

$0 is the leak, not the deal

A zero-cost reading on a multimodal call almost always means "this model isn't priced yet," not "this generation was free to produce." Treating $0 as real means under-charging on exactly the most expensive calls. The floor turns a silent leak into a conservative charge until an exact price is wired in.

Auto-gating models that can't be priced safely

Some models can't be served safely yet: the provider isn't provisioned, or there's no defensible cost. Rather than serve them at a guess, the gateway auto-gates them. A request for a model that is not yet gated but cannot be priced returns a clean 501-class response instead of silently running. Better an honest "not available yet" than an expensive call billed at zero. When the model is properly priced and provisioned, the gate lifts.
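
A gate check along these lines might look like the sketch below. The `ModelEntry` record and `gate` function are hypothetical names, and the plain 501 status is an assumption about what a "501-class" response looks like in practice.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional, Tuple


@dataclass
class ModelEntry:
    """Hypothetical catalog record for one model."""
    name: str
    provisioned: bool          # is the upstream provider set up?
    price: Optional[Decimal]   # None means there is no defensible cost yet


def gate(entry: ModelEntry) -> Tuple[int, str]:
    """Return (status, message); 501 when the model can't be served safely."""
    if not entry.provisioned:
        return 501, f"{entry.name} is not available yet: provider not provisioned"
    if entry.price is None or entry.price <= 0:
        return 501, f"{entry.name} is not available yet: no price configured"
    return 200, "ok"


# A just-shipped model with no price is gated instead of running for free.
status, message = gate(ModelEntry("example-video-v2", provisioned=True, price=None))
```

The check runs before the request is forwarded, so a gated model never reaches the provider and never incurs a cost that would have to be billed at a guess.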

This is the operational face of a platform rule: always offer the latest models, but never at a $0 price — a just-shipped flagship gets an explicit cost override the moment it's enabled, so there's no window where it's both live and free.
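
One way to close that window is to make enabling a model fail unless a positive price is known at that moment. The sketch below assumes a hypothetical `Catalog` class with an `enable` method; neither is a real Nemo API.

```python
from decimal import Decimal
from typing import Dict, Optional, Set


class Catalog:
    """Hypothetical model catalog that refuses to enable unpriced models."""

    def __init__(self, price_map: Dict[str, Decimal]) -> None:
        self._price_map = dict(price_map)
        self._enabled: Set[str] = set()

    def enable(self, model: str, cost_override: Optional[Decimal] = None) -> Decimal:
        """Enable a model, requiring a positive price from the map or an override."""
        if cost_override is not None:
            price = cost_override
        else:
            price = self._price_map.get(model)
        if price is None or price <= 0:
            raise ValueError(f"refusing to enable {model!r} without a positive price")
        # Record the price and the enabled flag together, so the model is
        # never live without a cost attached.
        self._price_map[model] = price
        self._enabled.add(model)
        return price

    def is_enabled(self, model: str) -> bool:
        return model in self._enabled
```

With this shape, `Catalog({}).enable("example-video-v2")` raises `ValueError`, while passing `cost_override=Decimal("0.50")` enables the model with that price already in place.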

Why this is a product feature, not just plumbing

For anyone reselling or budgeting on top of the gateway, multimodal cost safety is what makes the economics trustworthy. Your per-customer billing can't be right if image generations bill at zero; your budgets can't protect you if the priciest calls don't count against them. Floors and gating mean every modality lands in the same exact-cost accounting as text — so a budget, an attribution tag, and an invoice all stay honest when a customer starts generating video.

The takeaway

Multimodal models don't price like text, and the dangerous failure mode is a missing price reading as $0 on your most expensive calls. The fix is structural: reserve a modality floor before the call, settle the floor if the real cost comes back zero, and auto-gate models that can't be priced safely yet. The rule "no enabled model served at $0" is what keeps budgets, attribution, and invoices honest the moment your app goes beyond text. Browse the multimodal models to see what's live.

Written by Nemo Team. Engineering, product, and company posts from the Nemo Router team: code-first, cost-honest, no vendor-marketing fluff.