Guardrails

Block, redact, or warn on unsafe content with NemoRouter guardrails

Guardrails inspect requests before they reach a model (pre-call) and responses before they reach your user (post-call). They can block, redact, warn, or simply log — letting you enforce safety and compliance without changing your application code.

Guardrail types

| Type | What it does |
| --- | --- |
| Presidio PII | Detects and anonymizes personally identifiable information using Microsoft Presidio |
| Regex | Filters content matching patterns you define |
| Keyword | Blocks content containing words on a blocklist |
| Prompt Injection | Detects and blocks attempts to hijack the model's instructions |
| Custom | Calls your own webhook to make the decision |

Modes and actions

Each guardrail runs in a mode, either pre-call (before the LLM request) or post-call (on the response), and takes an action when it triggers:

| Action | Effect |
| --- | --- |
| Block | Reject the request and return an error |
| Redact | Strip the sensitive content and continue |
| Warn | Allow the request through, but attach a warning header |
| Log | Allow through and record the event only |
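
To make the four actions concrete, here is a minimal Python sketch of a single regex guardrail. It is an illustration, not NemoRouter's implementation: the `Outcome` type, the `apply_guardrail` function, the SSN pattern, and the `X-Guardrail-Warning` header name are all assumptions made for this example.

```python
import re
from dataclasses import dataclass, field

# Hypothetical pattern for illustration: US-style SSNs such as 123-45-6789.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class Outcome:
    allowed: bool                                  # False only when the request is rejected
    text: str                                      # content passed downstream (possibly redacted)
    headers: dict = field(default_factory=dict)    # extra response headers, if any
    events: list = field(default_factory=list)     # what would be recorded in the logs


def apply_guardrail(text: str, action: str) -> Outcome:
    """Apply one regex guardrail to `text` using the configured action."""
    if not SSN.search(text):
        return Outcome(True, text)                 # rule did not trigger
    if action == "block":
        return Outcome(False, "", events=["blocked"])
    if action == "redact":
        return Outcome(True, SSN.sub("[REDACTED]", text), events=["redacted"])
    if action == "warn":
        # Header name is an assumption, not a documented NemoRouter header.
        return Outcome(True, text, headers={"X-Guardrail-Warning": "ssn"}, events=["warned"])
    if action == "log":
        return Outcome(True, text, events=["logged"])
    raise ValueError(f"unknown action: {action}")


print(apply_guardrail("my ssn is 123-45-6789", "redact").text)
```

Only Block stops the request; Redact changes the content, while Warn and Log leave it untouched and differ only in what they attach or record.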

Scope hierarchy

Guardrails apply at two scopes, combined per request:

  • Organization — a master kill-switch plus org-wide guardrails that apply to every key.
  • Key — guardrails assigned to specific virtual keys, layered on top of the org rules.

Manage org guardrails on the main Guardrails page and per-key assignments under Guardrails → Keys.
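
One way to picture how the two scopes combine per request is the sketch below. It assumes that turning the organization kill-switch off disables every guardrail and that org rules run before key rules; neither detail is stated on this page, and the `Scope` type and `effective_guardrails` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    guardrails: tuple = ()   # guardrail names attached at this scope


def effective_guardrails(org_enabled: bool, org: Scope, key: Scope) -> list:
    """Return the guardrails to run for one request, org rules first."""
    if not org_enabled:              # assumed meaning of the kill-switch: nothing runs
        return []
    combined = []
    for name in org.guardrails + key.guardrails:
        if name not in combined:     # avoid running the same rule twice
            combined.append(name)
    return combined


org = Scope(("pii-redaction",))
key = Scope(("pii-redaction", "keyword-blocklist"))
print(effective_guardrails(True, org, key))
```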

Testing before you ship

Expand any guardrail to find a Test tab: paste sample input and see exactly which action fires and how long it took. This lets you tune a rule against realistic content before it touches live traffic.
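
If you want to reason about a rule offline before pasting samples into the Test tab, a rough local equivalent might look like the following. This is a stand-in, not the Test tab's logic: the blocklist, `keyword_guardrail`, and `dry_run` are hypothetical, and the timings reflect only this toy function.

```python
import time

BLOCKLIST = {"password", "secret"}   # hypothetical blocklist for illustration


def keyword_guardrail(text: str) -> str:
    """Return "block" if any blocklisted word appears, else "allow"."""
    words = {w.strip(".,!?;:").lower() for w in text.split()}
    return "block" if words & BLOCKLIST else "allow"


def dry_run(samples):
    """Evaluate each sample and report (sample, action, elapsed milliseconds)."""
    results = []
    for sample in samples:
        start = time.perf_counter()
        action = keyword_guardrail(sample)
        elapsed_ms = (time.perf_counter() - start) * 1000
        results.append((sample, action, elapsed_ms))
    return results


for sample, action, ms in dry_run(["What is my password?", "What is the weather?"]):
    print(f"{action:5}  {ms:.3f} ms  {sample}")
```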

Versioning and rollback

Every change to a guardrail is snapshotted. The Versions tab lists each update, and you can roll back to any prior version with one click — useful if a tightened rule starts blocking legitimate requests.
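
The snapshot-and-rollback model can be sketched as an append-only history. The class below is illustrative only; in particular, treating a rollback as a new version (so the history is never rewritten) is an assumption, and `VersionedGuardrail` is a hypothetical name.

```python
import copy


class VersionedGuardrail:
    """Keeps every configuration ever applied; versions are numbered from 1."""

    def __init__(self, config: dict):
        self.versions = [copy.deepcopy(config)]

    @property
    def current(self) -> dict:
        return self.versions[-1]

    def update(self, **changes) -> int:
        """Snapshot a new version with the given changes; return its number."""
        self.versions.append({**copy.deepcopy(self.current), **changes})
        return len(self.versions)

    def rollback(self, version: int) -> int:
        """Restore a prior version by appending a copy, preserving history."""
        if not 1 <= version <= len(self.versions):
            raise ValueError(f"no such version: {version}")
        self.versions.append(copy.deepcopy(self.versions[version - 1]))
        return len(self.versions)


rule = VersionedGuardrail({"pattern": r"\d{16}", "action": "warn"})
rule.update(action="block")   # tightened rule starts blocking legitimate requests
rule.rollback(1)              # return to the original behavior
print(rule.current["action"])
```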

Templates

The Templates gallery offers one-click setups for common patterns — PII redaction, jailbreak/prompt-injection detection, and language blocklists — so you can stand up sensible defaults quickly, then customize.

Guardrail logs

The Guardrails → Logs view records every guardrail evaluation:

| Column | Meaning |
| --- | --- |
| Time | When the guardrail ran |
| Guardrail | Which rule evaluated the request |
| Mode | Pre-call or post-call |
| Action | Blocked, Redacted, Allowed, Logged, or Error |
| Latency | How long the check took (ms) |
| Request ID | Correlate with the request in observability logs |

Filter by guardrail or action, or search by name/request ID.
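
If you export or mirror these records, the same filters are easy to reproduce. The sketch below assumes a record shape that follows the columns above; the `LogEntry` field names and the `filter_logs` helper are hypothetical, not a NemoRouter API, and the sample rows are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    time: str
    guardrail: str
    mode: str          # "pre-call" or "post-call"
    action: str        # "Blocked", "Redacted", "Allowed", "Logged", or "Error"
    latency_ms: float
    request_id: str


def filter_logs(entries, guardrail=None, action=None, query=None):
    """Filter by exact guardrail or action, or search name/request ID by substring."""
    matches = []
    for entry in entries:
        if guardrail is not None and entry.guardrail != guardrail:
            continue
        if action is not None and entry.action != action:
            continue
        if query is not None:
            needle = query.lower()
            if needle not in entry.guardrail.lower() and needle not in entry.request_id.lower():
                continue
        matches.append(entry)
    return matches


# Made-up sample rows for illustration.
logs = [
    LogEntry("2024-01-01T10:00:00Z", "pii-redaction", "pre-call", "Redacted", 12.5, "req-001"),
    LogEntry("2024-01-01T10:00:01Z", "keyword-blocklist", "post-call", "Allowed", 0.4, "req-001"),
    LogEntry("2024-01-01T10:00:05Z", "keyword-blocklist", "pre-call", "Blocked", 0.3, "req-002"),
]
print([e.request_id for e in filter_logs(logs, action="Blocked")])
```

Searching by request ID returns every guardrail evaluation for that request, which is the set you would correlate with the observability logs.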
