v1.2 — Prompt management with A/B testing
Server-side prompt templates with versioning and Jinja2 variables, plus deterministic traffic-split A/B tests. Same hash, same variant — every time.
Stop hardcoding prompts in client code. v1.2 ships server-side prompt management and deterministic A/B testing, both gated through the single endpoint your SDK already calls.
Prompt templates
Create a template once, reference it by ID from any request. Templates are versioned: editing a template creates a new version automatically, and you can pin a request to a specific version (prompt_version) or always take the latest (default).
Templates support Jinja2 variables. Pass values via extra_body:
{
  "model": "gemini-2.5-flash",
  "extra_body": {
    "nemo_prompt_template_id": "summarizer_v2",
    "nemo_prompt_variables": { "doc_type": "research paper", "max_words": 200 }
  }
}
Per-template cost and token usage roll up in the dashboard, so you can compare "summarizer_v2" vs "summarizer_v3" without instrumenting your own analytics.
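As a rough illustration of the request shape above, here is a minimal Python sketch that assembles the body in client code. The helper name build_template_request is hypothetical. Carrying the version pin inside extra_body under the key "prompt_version", and using an integer for it, are both assumptions; check the API reference for the actual field location and type.

```python
import json
from typing import Any, Optional


def build_template_request(
    model: str,
    template_id: str,
    variables: dict[str, Any],
    prompt_version: Optional[int] = None,
) -> dict[str, Any]:
    """Assemble a request body that references a server-side template."""
    extra_body: dict[str, Any] = {
        "nemo_prompt_template_id": template_id,
        "nemo_prompt_variables": variables,
    }
    # Assumption: the pin travels in extra_body as an integer "prompt_version".
    # Omitting it means "take the latest version", the documented default.
    if prompt_version is not None:
        extra_body["prompt_version"] = prompt_version
    return {"model": model, "extra_body": extra_body}


variables = {"doc_type": "research paper", "max_words": 200}

# Follows the latest template version.
latest = build_template_request("gemini-2.5-flash", "summarizer_v2", variables)

# Pinned to one version (the number 3 is purely illustrative).
pinned = build_template_request(
    "gemini-2.5-flash", "summarizer_v2", variables, prompt_version=3
)

print(json.dumps(pinned, indent=2))
```

Keeping the pin optional mirrors the default described above: a request that omits it keeps following template edits, while a pinned request stays on the version it was tested against.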
A/B testing — deterministic
The A/B engine (nemo_backend/prompts/ab_test_engine.py) hashes a stable key (request ID, user ID, or org-scoped seed) and splits traffic by the configured percentages. Same hash, same variant, every time: key on the user ID and the same user sees the same variant across requests, so your analytics aren't poisoned by re-bucketing.
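To make the idea of hash-based bucketing concrete, here is a minimal sketch of one common way to do it. It is not the code in ab_test_engine.py: the function name assign_variant, the choice of SHA-256, the experiment-ID salt, and the 100-bucket resolution are all assumptions made for illustration.

```python
import hashlib


def assign_variant(key: str, experiment_id: str, split: dict[str, int]) -> str:
    """Map a stable key to a variant. Split values are percentages summing to 100."""
    if sum(split.values()) != 100:
        raise ValueError("split percentages must sum to 100")
    # Assumption: salt with the experiment ID so the same user can land in
    # different buckets across unrelated experiments.
    digest = hashlib.sha256(f"{experiment_id}:{key}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # integer in 0..99
    threshold = 0
    # Walk the cumulative percentages; dicts preserve insertion order.
    for variant, percent in split.items():
        threshold += percent
        if bucket < threshold:
            return variant
    raise AssertionError("unreachable when percentages sum to 100")


split = {"control": 50, "treatment": 50}
first = assign_variant("user_123", "exp_summarizer", split)
again = assign_variant("user_123", "exp_summarizer", split)
print(first == again)  # same key, same variant
```

Because the bucket depends only on the key and the experiment ID, no assignment table has to be stored: any server can recompute the variant for a given key and get the same answer.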
Variants can swap the model (gpt-5 vs gemini-2.5-pro), the prompt template, or both. A test is a state machine: draft → running → paused → completed. While a test is running, per-variant cost, latency, and error rate are visible in real time on the prompts dashboard.
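The lifecycle above can be sketched as a small transition table. This is an illustration only: the names ExperimentState and transition are hypothetical, and two of the transitions (resuming a paused test, and completing a test directly from running) are assumptions not confirmed by these notes.

```python
from enum import Enum


class ExperimentState(Enum):
    DRAFT = "draft"
    RUNNING = "running"
    PAUSED = "paused"
    COMPLETED = "completed"


# Allowed moves. paused -> running and running -> completed are assumptions.
ALLOWED = {
    ExperimentState.DRAFT: {ExperimentState.RUNNING},
    ExperimentState.RUNNING: {ExperimentState.PAUSED, ExperimentState.COMPLETED},
    ExperimentState.PAUSED: {ExperimentState.RUNNING, ExperimentState.COMPLETED},
    ExperimentState.COMPLETED: set(),  # terminal state
}


def transition(current: ExperimentState, target: ExperimentState) -> ExperimentState:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


state = ExperimentState.DRAFT
state = transition(state, ExperimentState.RUNNING)
state = transition(state, ExperimentState.PAUSED)
state = transition(state, ExperimentState.COMPLETED)
print(state.value)
```

Modeling completed as a terminal state with no outgoing transitions keeps a finished experiment's results from being altered by a later restart.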
A/B tests are not overridable per-request — the whole point is determinism. If you need a control-group bypass, create a virtual key outside the experiment scope.
Manage prompts + experiments at /{org}/prompts.