v1.0.1 — Capability discovery + honest 501s
Endpoints that have no backing model yet now return a clear HTTP 501 with a roadmap pointer instead of a confusing upstream error. A new /public/capabilities endpoint lets SDKs introspect what's live.
A small but important honesty pass on the OpenAI-compatible API surface.
What changed
/v1/audio/speech, /v1/audio/transcriptions, /v1/moderations, /v1/rerank, and /v1/ocr are part of our OpenAI-compatible API surface and already have routes in our router. Previously, they forwarded requests to the underlying LiteLLM proxy even when no model supporting that mode was registered. That produced confusing upstream errors, including, in one case, a leaked default-OpenAI fallback string from the moderations path.
These five endpoints now return a clean HTTP 501 endpoint_not_yet_available until at least one model supporting that capability is registered in our router. Response body:
{
"detail": {
"error": "endpoint_not_yet_available",
"endpoint": "/v1/audio/speech",
"capability": "audio_speech",
"message": "/v1/audio/speech is not yet available on Nemo Router — no model supporting 'audio_speech' is currently registered. Roadmap: https://nemorouter.ai/changelog",
"available_endpoints": [
"/v1/chat/completions",
"/v1/completions",
"/v1/embeddings",
"/v1/images/generations",
"/v1/images/edits",
"/v1/videos"
]
}
}
The gate is dynamic: the moment we register a model with the matching capability (e.g. a Whisper deployment for audio_transcription), the endpoint auto-unblocks at the next 30-second cache tick. No code change, no redeploy.
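On the client side, the response body above can be turned into a typed error so callers can branch on it. The following is a minimal sketch, not part of any official SDK: the names `EndpointNotYetAvailable` and `parse_unavailable` are hypothetical, and it assumes the 501 body always follows the shape shown above.

```python
import json
from typing import Optional


class EndpointNotYetAvailable(Exception):
    """Hypothetical client-side error for a 501 endpoint_not_yet_available."""

    def __init__(self, endpoint, capability, message, available_endpoints):
        super().__init__(message)
        self.endpoint = endpoint
        self.capability = capability
        self.available_endpoints = list(available_endpoints)


def parse_unavailable(status_code: int, body: str) -> Optional[EndpointNotYetAvailable]:
    """Return an error object if the response is the documented 501, else None."""
    if status_code != 501:
        return None
    try:
        detail = json.loads(body).get("detail", {})
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object: treat as some other failure.
        return None
    if not isinstance(detail, dict) or detail.get("error") != "endpoint_not_yet_available":
        return None
    return EndpointNotYetAvailable(
        endpoint=detail.get("endpoint", ""),
        capability=detail.get("capability", ""),
        message=detail.get("message", ""),
        available_endpoints=detail.get("available_endpoints", []),
    )


# Sample body trimmed from the example above (message shortened for brevity).
SAMPLE_BODY = json.dumps({
    "detail": {
        "error": "endpoint_not_yet_available",
        "endpoint": "/v1/audio/speech",
        "capability": "audio_speech",
        "message": "/v1/audio/speech is not yet available on Nemo Router",
        "available_endpoints": ["/v1/chat/completions", "/v1/embeddings"],
    }
})

err = parse_unavailable(501, SAMPLE_BODY)
if err is not None:
    print(f"{err.endpoint} needs capability {err.capability!r}")
```

Returning `None` for anything that does not match the documented shape keeps ordinary upstream failures on the caller's existing error path.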
Live today
The seven endpoints below are fully supported by the 18 models we ship on day one:
| Endpoint | Models |
|---|---|
| /v1/chat/completions | Gemini 2.5 Pro / Flash / Flash-Lite / Flash-Image |
| /v1/completions | (legacy alias of chat completions) |
| /v1/embeddings | gemini-embedding-001, text-embedding-004 / 005, text-multilingual-embedding-002, multimodalembedding@001 |
| /v1/images/generations | Imagen 3 / 4 (Fast, Standard, Ultra) |
| /v1/images/edits | Imagen 3 Capability |
| /v1/videos | Veo 3.1 (Fast, Standard, Lite) |
| /v1/models | catalogue lookup |
Roadmap
Each capability below ships as soon as we land a provider integration that supports it:
- audio_speech (TTS): landing with OpenAI Direct
- audio_transcription (STT / Whisper): landing with OpenAI Direct
- moderation: landing with OpenAI Direct
- rerank: landing with Cohere or Voyage
- ocr: landing with Mistral OCR or AWS Textract
Track at /changelog.
Discoverability — new endpoint
GET /public/capabilities (no auth) returns a machine-readable map of which endpoints are live right now, so SDKs and integrations can degrade gracefully:
curl https://nemorouter.ai/api/public/capabilities
{
"endpoints": {
"/v1/chat/completions": { "available": true },
"/v1/embeddings": { "available": true },
"/v1/audio/speech": { "available": false, "required_capability": "audio_speech" }
// …
},
"supported_modes": ["chat", "embedding", "image_generation", "video"]
}
Why this matters
The promise of an OpenAI-compatible gateway is "your SDK works, our infrastructure handles the rest". An endpoint that exists on paper but returns a confusing 4xx when called violates that promise. A 501 with a roadmap pointer makes the contract explicit and gives integrations a clean control flow to handle the gap.
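To make that control flow concrete, here is one way an integration might consult the capabilities map before calling an endpoint. This is a sketch under assumptions: the helper names `fetch_capabilities` and `unavailable_reason` are hypothetical, the URL is the one from the curl example, and the sample map is the example response with its elided entries left out.

```python
import json
import urllib.request
from typing import Optional

# URL taken from the curl example in this entry.
CAPABILITIES_URL = "https://nemorouter.ai/api/public/capabilities"


def fetch_capabilities(url: str = CAPABILITIES_URL, timeout: float = 5.0) -> dict:
    """Fetch the capabilities map (no auth required, per the notes above)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def unavailable_reason(capabilities: dict, endpoint: str) -> Optional[str]:
    """Return None if the endpoint is live, otherwise a short reason string."""
    entry = capabilities.get("endpoints", {}).get(endpoint)
    if entry is None:
        return "not listed in capabilities map"
    if entry.get("available"):
        return None
    required = entry.get("required_capability", "unknown")
    return f"waiting on capability '{required}'"


# Offline sample mirroring the example response, so the logic can be
# exercised without a network call.
SAMPLE = {
    "endpoints": {
        "/v1/chat/completions": {"available": True},
        "/v1/embeddings": {"available": True},
        "/v1/audio/speech": {"available": False, "required_capability": "audio_speech"},
    },
    "supported_modes": ["chat", "embedding", "image_generation", "video"],
}

for path in ("/v1/chat/completions", "/v1/audio/speech"):
    reason = unavailable_reason(SAMPLE, path)
    print(path, "->", "live" if reason is None else reason)
```

Because the gate can open at any cache tick, a long-running client would likely want to re-fetch the map periodically rather than cache it indefinitely.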