A small but important honesty pass on the OpenAI-compatible API surface.

What changed

/v1/audio/speech, /v1/audio/transcriptions, /v1/moderations, /v1/rerank, and /v1/ocr are part of the OpenAI-compatible spec and are wired into our router. Previously, each one forwarded requests to the underlying provider router even when no model supporting that capability was registered. The result was confusing upstream errors, including, in one case, a leaked default-OpenAI fallback string from the moderations path.

These five endpoints now return a clean HTTP 501 with the error code endpoint_not_yet_available until at least one model supporting that capability is registered in our router. Example response body, here for /v1/audio/speech:

{
  "detail": {
    "error": "endpoint_not_yet_available",
    "endpoint": "/v1/audio/speech",
    "capability": "audio_speech",
    "message": "/v1/audio/speech is not yet available on Nemo Router — no model supporting 'audio_speech' is currently registered. Roadmap: https://nemorouter.ai/changelog",
    "available_endpoints": [
      "/v1/chat/completions",
      "/v1/completions",
      "/v1/embeddings",
      "/v1/images/generations",
      "/v1/images/edits",
      "/v1/videos"
    ]
  }
}
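
A client can branch on this body rather than on the status code alone. The following is a minimal sketch, assuming the response has already been received as a status code plus a parsed JSON dict; the names `NotYetAvailable` and `parse_not_available` are hypothetical and not part of any Nemo Router SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NotYetAvailable:
    """Hypothetical client-side view of the 501 body shown above."""
    endpoint: str
    capability: str
    available_endpoints: tuple


def parse_not_available(status: int, body: dict) -> Optional[NotYetAvailable]:
    """Return a NotYetAvailable if the response is the 501 gate, else None."""
    if status != 501:
        return None
    detail = body.get("detail")
    # Other 501 responses may carry a plain string or no detail at all.
    if not isinstance(detail, dict):
        return None
    if detail.get("error") != "endpoint_not_yet_available":
        return None
    return NotYetAvailable(
        endpoint=detail.get("endpoint", ""),
        capability=detail.get("capability", ""),
        available_endpoints=tuple(detail.get("available_endpoints", [])),
    )
```

A caller could use the returned `capability` to pick a fallback, or surface `available_endpoints` in its own error message.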

The gate is dynamic — the moment we register a model with the matching capability (e.g. a Whisper deployment for audio_transcription), the endpoint auto-unblocks at the next 30-second cache tick. No code change, no redeploy.
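
One way such a gate could behave is sketched below. This is an illustration under stated assumptions, not the router's actual code: the `CapabilityGate` class, the `load_capabilities` callback, and the injectable `clock` are all hypothetical, and only the 30-second refresh interval comes from the description above.

```python
import time
from typing import Callable, Set


class CapabilityGate:
    """Hypothetical gate that caches registered capabilities for `ttl` seconds."""

    def __init__(
        self,
        load_capabilities: Callable[[], Set[str]],
        ttl: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._load = load_capabilities
        self._ttl = ttl
        self._clock = clock
        self._cached: Set[str] = set()
        self._expires_at = float("-inf")  # forces a load on first use

    def is_available(self, capability: str) -> bool:
        now = self._clock()
        if now >= self._expires_at:
            # Copy the set so later registry changes are only seen on the next refresh.
            self._cached = set(self._load())
            self._expires_at = now + self._ttl
        return capability in self._cached
```

In this sketch a newly registered capability becomes visible on the first check after the cached snapshot expires, which matches the "no code change, no redeploy" behaviour described above.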

Live today

The seven endpoints below are fully supported by the Google Vertex AI models we ship on day one:

Endpoint	Models
`/v1/chat/completions`	Gemini 2.5 Pro / Flash / Flash-Lite / Flash-Image
`/v1/completions`	(legacy alias of chat completions)
`/v1/embeddings`	gemini-embedding-001, text-embedding-004 / 005, text-multilingual-embedding-002, multimodalembedding@001
`/v1/images/generations`	Imagen 3 / 4 (Fast, Standard, Ultra)
`/v1/images/edits`	Imagen 3 Capability
`/v1/videos`	Veo 3.1 (Fast, Standard, Lite)
`/v1/models`	catalogue lookup

Roadmap

Each capability below ships as soon as we land a provider integration that supports it:

audio_speech (TTS) — landing with OpenAI Direct
audio_transcription (STT / Whisper) — landing with OpenAI Direct
moderation — landing with OpenAI Direct
rerank — landing with Cohere or Voyage
ocr — landing with Mistral OCR or AWS Textract

Track progress at https://nemorouter.ai/changelog.

Discoverability — new endpoint

GET /public/capabilities (no auth; reachable under the /api prefix, as in the curl call below) returns a machine-readable map of which endpoints are live right now, so SDKs and integrations can degrade gracefully:

curl https://nemorouter.ai/api/public/capabilities

{
  "endpoints": {
    "/v1/chat/completions": { "available": true },
    "/v1/embeddings":       { "available": true },
    "/v1/audio/speech":     { "available": false, "required_capability": "audio_speech" }
    // …
  },
  "supported_modes": ["chat", "embedding", "image_generation", "video"]
}
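
An integration might consume this map as in the sketch below. It assumes the response is strict JSON with the shape shown above; the URL is taken from the curl example, while the helper names `fetch_capabilities`, `endpoint_available`, and `missing_capability` are hypothetical.

```python
import json
import urllib.request
from typing import Optional

# URL taken from the curl example above; the response shape is assumed to match it.
CAPABILITIES_URL = "https://nemorouter.ai/api/public/capabilities"


def fetch_capabilities(url: str = CAPABILITIES_URL, timeout: float = 5.0) -> dict:
    """Fetch and decode the capabilities map (no auth header needed)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def endpoint_available(capabilities: dict, endpoint: str) -> bool:
    """True only if the endpoint is listed and marked available."""
    entry = capabilities.get("endpoints", {}).get(endpoint)
    return bool(entry and entry.get("available"))


def missing_capability(capabilities: dict, endpoint: str) -> Optional[str]:
    """Return the required capability for an unavailable endpoint, if reported."""
    entry = capabilities.get("endpoints", {}).get(endpoint) or {}
    if entry.get("available"):
        return None
    return entry.get("required_capability")
```

Treating an unlisted endpoint as unavailable is a design choice of this sketch: it keeps the client conservative if the map omits an entry.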

Why this matters

The promise of an OpenAI-compatible gateway is "your SDK works, our infrastructure handles the rest". An endpoint that exists on paper but returns a confusing 4xx when called violates that promise. A 501 with a roadmap pointer makes the contract explicit and gives integrations a clean control flow to handle the gap.