$5 free credits when you sign up
Prompts

Your team’s prompts deserve their own database, not a code repo.

Versioned templates with Jinja2 variables, A/B testing, AI recommendations, and per-key selection. Ship prompt changes without redeploying — and roll back in one click when they regress.

prompt-library · acme-inc

Active templates

support-agent.jinja2v3 · active
code-reviewer.jinja2v7 · active
sql-explainer.jinja2v2 · A/B 60/40
onboarding-bot.jinja2v1 · draft
Total versions tracked47
Avg activation latency< 50 ms
Jinja2versionedaudit-logged
Versioning
Every save

Snapshot + diff + author + timestamp

Per-key selection
extra_body

nemo_prompt_template_id on any request

A/B variants
Deterministic

Hash-routed by request id

Prompt logs
Opt-in

Honors org-level data policy

Capabilities

Write, test, ship — without redeploying

Template management with Jinja2 variables, version history, A/B testing, AI-powered recommendations, and a per-key selection model that works with the unmodified OpenAI SDK.

Jinja2 templates with variable injection

Author prompts with double-brace variables, conditionals, loops, filters, and default values. Inject runtime values via `nemo_prompt_variables` — never hardcode prompt text in your repo again.

  • Conditional logic, loops, filters, defaults
  • Variables resolved per-request, never compiled into code
  • Whitespace-aware rendering preserves model-friendly formatting
  • Linter flags missing variables before activation

Version history with one-click rollback

Every save creates a versioned snapshot with author, timestamp, and diff. Activate any version instantly — no redeploy. If a change regresses output quality, roll back from the dashboard in one click.

  • Linear version chain with active-version pointer
  • Diff view between any two versions
  • Activate / deactivate without touching code
  • Soft-delete preserves audit history

A/B testing of prompt variants

Split traffic across two or more variants with a deterministic hash on request id. Each branch logs latency, cost, and response metadata so you can promote the winner with statistical confidence.

  • Configurable traffic split (e.g. 60/40, 33/33/34)
  • Deterministic per-request — same input, same variant
  • Per-variant rollups in the analytics dashboard
  • Lifecycle: draft → running → paused → completed

AI-powered recommendations

Get suggestions to clarify ambiguous instructions, cut token use, or fix common patterns that hurt model output. Recommendations are generated, never auto-applied — you stay in control of every change.

  • Clarity rewrites with predicted quality lift
  • Token-reduction passes with -% estimate
  • Detects passive voice, ambiguous roles, missing constraints
  • Diff preview before any change is saved

Per-request override via the SDK

Apply any template at request time by passing `nemo_prompt_template_id` and `nemo_prompt_variables` in `extra_body`. No config flags, no redeploy — switch the prompt mid-request if you need to.

  • Works with the unmodified OpenAI SDK
  • Per-key default template if none is supplied
  • Stripped from the upstream payload before forwarding to the provider
  • Selectable from any client SDK in the same way

Audit trail for every change

Who changed what, when, and from where. Every prompt save, activation, A/B start, and rollback is captured in the immutable audit log alongside the actor, IP, and full diff.

  • Append-only audit log with configurable retention
  • Filter by template, actor, action class, time window
  • CSV + JSON export for SIEM ingestion
  • Pinned for SOC 2 evidence collection
Variants & Recommendations

Test prompts the way you test features

Don’t guess which version is better — measure it. Split traffic, capture the metrics that matter, and promote the winner. AI recommendations close the loop with predicted lift before you even ship.

A/B testing

Deterministic splits, statistical confidence

Create two or more variants with a configurable traffic split. Routing is hash-deterministic per request id, so the same input always lands on the same variant. Per-branch metrics roll up in the analytics dashboard.

  • Lifecycle: draft → running → paused → completed
  • Per-variant cost, latency, p95, and quality metrics
  • Promote the winner from the dashboard without code change
  • Stop the test instantly if a variant regresses
ab_test · sql-explainer

Variant performance (last 24h)

Variant A · traffic60%
Variant A · avg cost$0.018
Variant A · p95 latency1 240 ms
Variant B · traffic40%
Variant B · avg cost$0.014
Variant B · p95 latency980 ms
runningdeterministic splitlogged
AI Recommendations

Suggestions before they ship — never auto-applied

Closed-loop recommendations

Predicted lift, token reduction, ambiguity flags

Recommendations are generated, scored, and surfaced — but never auto-applied. You see the diff, the predicted impact, and the reasoning. Accept, edit, or dismiss; the audit trail captures every choice.

  • Clarity rewrites with predicted quality lift (e.g. +23%)
  • Token-reduction passes with -% estimate
  • Passive-voice and ambiguous-role detection
  • One-click apply creates a new version (never overwrites)
recommendations · support-agent v3

Open suggestions

Rewrite for clarity+23% lift
Token reduction-15% in
Add explicit role boundariesopen
Remove passive voice (3 spots)open
Auto-appliednever
suggestion onlydiff-previewedaudit-logged
Lifecycle

Per-request selection via extra_body

Per-key + per-request

Pick a template at request time, with the unmodified SDK

Pass `nemo_prompt_template_id` and `nemo_prompt_variables` in `extra_body` on any chat completion. Nemo Backend reads the template, renders with your variables, and forwards a clean payload to the provider — the prompt fields never leave the gateway.

  • Works with OpenAI Python, Node.js, Go, Ruby, Java, C#, PHP, Rust
  • Per-key default template if no override is supplied
  • Stripped from upstream payload before forwarding to the provider
  • Honors the org’s data policy for logging granularity
POST /v1/chat/completions

Per-request override

extra_body.nemo_prompt_template_idtpl_abc123
extra_body.nemo_prompt_variables.roletutor
extra_body.nemo_prompt_variables.langEnglish
Renders with templatesupport-agent v3
Forwarded to providersanitized
Logged per data policymetadata-only
SDK-nativesanitizeddata-policy aware
Pulling prompts out of the codebase and into a versioned template store is the single biggest velocity unlock we have shipped this year. PMs change copy now; engineering ships features.

Engineering Manager

Mid-market SaaS, agent-platform team

Storage

Prompt data lives in your org — nowhere else

Templates, versions, and variable payloads are stored as org-scoped rows in nemo.prompt_templates and nemo.prompt_versions, both protected by Row-Level Security. Cross-org reads are impossible at the database layer.

Tenant isolation

Every prompt row is RLS-scoped to your org

The same UUID flows through every layer. Prompt rows reference organization_id; RLS policies on every table refuse cross-tenant reads. The application can have bugs; the database refuses to return rows from the wrong tenant.

  • nemo.prompt_templates + nemo.prompt_versions — RLS enabled
  • Owner / Admin / Member / Viewer policies enforced per row
  • Soft-delete preserves audit history without exposing data
  • Org delete cascades to every prompt row + version
psql · nemo schema

Prompt-related tables

nemo.prompt_templatesRLS ✓
nemo.prompt_versionsRLS ✓
nemo.prompt_recommendationsRLS ✓
nemo.prompt_logsRLS ✓ · opt-in
Cross-tenant denials (24h)0
RLSorganization_idsame-UUID
FAQ

Common prompt questions

Ship without redeploying

Move every prompt out of the codebase — today

Free to start. Versioning, variants, recommendations, and the audit trail are unlocked on every plan from day one.