v1.8 — Observability: logs, callbacks, alerts, data policy
Per-org logging callbacks (Langfuse, Datadog, S3, Slack), eight alert types, full request-log search, and four data-policy modes so you decide what request content we retain.
A gateway that can't show you what happened is a black box. v1.8 ships the four-pillar observability surface — request logs, external callbacks, alerts, and a data policy that puts you in control of retention.
Request logs
/{org}/observability is a paginated log viewer over LiteLLM_SpendLogs. Filter by model, status, virtual key, time range. Expand any row to see the request body, response body, token counts, cost, latency, and which guardrails ran. 90-day retention on Tier 1–3; configurable longer on Enterprise.
Free-text search hits prompt content (subject to your data policy — see below).
Callbacks — ship logs to your stack
/{org}/observability/callbacks configures where we forward request logs in addition to retaining them ourselves. Five sinks supported on day one:
- Langfuse — full trace, prompt template metadata, A/B variant included
- Datadog — structured logs with cost / latency / token metrics
- Amazon S3 — raw JSONL for your data lake, partitioned by date
- Slack — high-signal events only (errors, budget breaches), never per-request spam
- Generic HTTP webhook — your endpoint, your schema, we POST one batch every 10 seconds
Stored in nemo.logging_callbacks. RLS-scoped, secrets encrypted at rest.
Alerts — eight types, one toggle
/{org}/observability/alerts lets you switch on alerting for: LLM errors, budget threshold, budget exceeded, daily-spend anomaly, slow response, outage, key compromise heuristic, and guardrail block burst. Each alert routes through your configured notification channels (email / Slack / Teams / webhook — see notification settings).
Stored in nemo.alerting_settings. Defaults are conservative — we'd rather miss a soft signal than wake you up at 3 a.m. for nothing.
Data policy — your call, not ours
/{org}/observability/data-policy picks one of four modes:
- Zero logging — store no prompt or completion content. Metadata only (model, tokens, cost, status).
- Metadata only — same as zero, kept for naming consistency.
- Full logging — store prompt + completion verbatim. Default for new orgs.
- PII-redacted — store prompt + completion with Presidio redaction applied. Useful for healthcare / fintech orgs that can't justify "full" but want enough content to debug.
The policy applies before the log is persisted, not at read time — so even a compromised viewer role can't pull data the policy excluded.