$5 free credits when you sign up
Playground

Test any model. Right from your browser.

An interactive playground for 20+ models. Chat, tweak parameters, compare responses, watch cost + latency in real time, and export integration code in 15 languages — without writing a single line of integration code.

playground · gemini-2.5-flash

Active session

Auth modesk-nemo-...x4kf
Key storagesessionStorage
Streamingon (SSE)
Tokens streamed142
Cost so far$0.0021
Latency (first token)420 ms
Code generators15 languages
streamingnon-streamingtoolsjson modecancel
Server-side key storage
0

Your key lives in sessionStorage only

Models you can test
20+

Google Vertex AI

Streaming + non-stream
Both

Real-time tokens or full-response mode

Code-gen languages
15

8 SDKs + 7 frameworks

Capabilities

From prompt to production, instantly

The playground is the same request path as your SDK — just with a UI on top. What you test is what you ship.

Virtual key, sessionStorage only

Paste your sk-nemo-... virtual key into the playground. It is stored in browser sessionStorage and cleared on tab close. The master key never enters this flow — every LLM call uses your virtual key, just like production.

  • No server-side persistence — refresh = re-paste
  • Cleared on tab close (sessionStorage scope)
  • Same auth path as your SDK calls
  • Per-key spend, RPM, TPM enforced exactly as in prod

Full chat interface

System prompts, multi-turn history, message edit + retry, model swap mid-conversation, copy individual messages. Every interaction shows tokens, cost, and latency in real time.

  • Multi-turn conversation with full history
  • System prompt editor + role-based message authoring
  • Per-message metrics (tokens, cost, latency)
  • Cancel mid-stream to save on token cost

Streaming + non-streaming

Toggle between server-sent-events streaming (default) and full-response mode. Streaming gives you token-by-token feedback; non-streaming returns the full completion in one block — same path either way.

  • SSE streaming with mid-stream cancel
  • Non-streaming for batch / eval workflows
  • Identical billing — both modes use the same pipeline
  • Visible token-rate counter while streaming

Parameter pass-through

Visual sliders for temperature, top_p, max_tokens, presence and frequency penalties. Every parameter is forwarded as-is to the provider, so playground behavior matches production exactly.

  • temperature · top_p · max_tokens · presence/frequency penalties
  • response_format (json_object) for structured output
  • tools / function-calling parameter pass-through
  • No silent defaults — what you set is what ships

Auto-generated integration code

Every session emits ready-to-paste code in 15 languages — Python, Node.js, Go, Ruby, Java, C#, PHP, cURL, plus 7 frameworks (LangChain, CrewAI, AutoGen, ADK, etc.). Drawn from the canonical SDK examples package.

  • 8 SDKs + 7 frameworks (LangChain, CrewAI, AutoGen, ADK, more)
  • Reads from `04-nemoroutersdk` — never hardcoded snippets
  • Embeds your exact prompt + parameter set
  • Copy-to-clipboard with one click

402 over-credit surfacing

When your account is out of credits, the playground surfaces the 402 cleanly with a one-click link to top up. No silent failures, no half-streamed responses — just the same error you would see in your SDK.

  • 402 surfaced with topup CTA inline
  • 429 (rate-limit) surfaced with retry-after timer
  • 4xx provider errors surfaced verbatim
  • Streaming aborts cleanly on mid-stream errors
Security Model

Your key never leaves the browser

The playground is a deliberate zero-trust surface. We do not want — and do not store — your virtual key on any NemoRouter server. The master key is reserved for management CRUD and is never exposed to this UI.

Virtual key only

sessionStorage scope, master key isolated

Per Rule #15 of our codebase, every LLM call — playground, SDK, or production API — must authenticate with a virtual key (sk-nemo-...). The playground enforces this by holding the key only in browser sessionStorage. Refresh the tab and you re-paste; close the tab and the key is gone.

  • Browser sessionStorage scope (cleared on tab close)
  • Never POSTed to NemoRouter for persistent storage
  • Master key never enters the playground flow
  • Per-key spend, RPM, TPM enforced exactly as in production
  • Per-key revocation: kill a leaked key in one click
security · session

Storage + auth posture

Key in sessionStorageyes
Key in localStoragenever
Key in cookiesnever
Server-side persistence0 paths
Master key in this flowisolated
Cleared on tab closealways
Rule #15sk-nemo-... onlymaster key isolated
Request Pipeline

The same path as production — no sandboxing

Every playground request uses your real virtual key and traverses the same auth, credit-reserve, guardrail, and provider chain as your SDK. There are no fake responses, no mock providers, no silent shortcuts.

Playground request flow

  1. Paste key

    sessionStorage only

    Cleared on tab close; never sent to NemoRouter servers.

  2. Frontend (Next.js)

    :3001 — Cloud Run

    Forwards request with the user-pasted virtual key.

  3. Nemo Backend

    :8090 — FastAPI

    Auth, credit reserve, guardrails — same as production.

  4. Nemo Intelligent Proxy Router

    In-process ASGI

    Cost tracking, rate limits, provider fallback.

  5. Provider

    Vertex / Anthropic / OpenAI

    Real LLM call — no sandboxing, no fake responses.

The Nemo Intelligent Proxy Router runs in-process inside Nemo Backend — no separate gateway service, no extra hop. The playground hits the same ASGI app your SDK does.

Use Cases

Built for every stage of development

Prototype before you build

Test models to find the best fit for your use case before writing any integration code. Compare quality, speed, and cost across providers in minutes instead of days. Validate edge cases before committing.

Compare model responses

Run the same prompt against multiple models for side-by-side testing. Compare Gemini 2.5 Pro, Flash, Flash-Lite, and Imagen — Anthropic, OpenAI, and Bedrock ship next, no config changes when they land.

Fine-tune parameters

Experiment with temperature, top_p, max_tokens, and other generation parameters using visual sliders. See instantly how creativity, determinism, and length affect output quality.

Generate integration code

Every session auto-generates code in 15 languages — Python, Node.js, Go, PHP, Java, Ruby, C#, cURL, plus LangChain, CrewAI, AutoGen, and ADK. Copy your exact prompt config as production-ready code.

Model Catalog

Every model in your account, available in one click

Switch mid-conversation

20+ models from the same dropdown

Switch models mid-conversation. Compare responses across providers without changing a thing. Today: Google (Gemini, Imagen, Veo) — Anthropic, OpenAI, and AWS Bedrock shipping next. As we onboard new providers, your playground gains them automatically — no config, no flag.

  • Filter by capability (chat, vision, embedding, image, video)
  • Filter by provider, latency budget, cost ceiling
  • Pin favorites to the top of the dropdown
  • Real-time price-per-1k tokens visible per model
catalog · live

Active model lineup

gemini-2.5-proVertex AI
gemini-2.5-flashVertex AI
gemini-2.5-flash-liteVertex AI
imagen-4.0Vertex AI
veo-3.0Vertex AI
Total live18 models
chatvisionembeddingimagevideo
FAQ

Common playground questions

20+ models · 15 code generators · 0 server-side keys

From idea to working prototype in under a minute

No SDK to install, no provider account to manage. Sign up, paste your virtual key, start chatting.