Redacting PII From LLM Logs Without Losing Debuggability
You can keep useful request logs and still not warehouse customer PII. Here is how write-time redaction, structure-preserving placeholders, and metadata scrubbing keep LLM logs debuggable but safe.
The usual objection to redacting logs is "but then I can't debug." It's a false trade-off. Done right, redaction removes the values you can't afford to store while preserving the structure you need to reproduce a bug. This post covers the engineering behind that: where redaction has to happen, how to keep logs useful afterward, and the metadata leak everyone forgets.
Why read-time redaction is a lie
The tempting shortcut is to store everything and mask it in the UI. It does not work, for a simple reason: the data is already at rest. A read-time mask protects exactly one access path — the dashboard — while leaving the raw value exposed to:
- anyone with direct database access,
- every backup and replica,
- any future export, migration, or analytics job,
- a SQL injection or credential leak.
Read-time masking is a UI feature pretending to be a security control. The only redaction that counts happens before the write, so the sensitive substring never lands in durable storage in the first place.
request → [PII detection] → redacted content → log row written
                 │
                 └─ raw value discarded here, never persisted
Redact values, preserve structure
Naive redaction ("[REDACTED]" for the whole prompt) destroys debuggability — you can't tell what shape of request failed. Structure-preserving redaction replaces only the detected entities, in place:
before: "Email john.doe@acme.com about invoice 4471, card 4111 1111 1111 1111"
after:  "Email [EMAIL] about invoice 4471, card [CARD]"
The bug is still reproducible: you can see it was an email-plus-invoice-plus-card prompt, the token shape is intact, the non-sensitive 4471 survives. What's gone is precisely the data you couldn't keep. Typed placeholders ([EMAIL], [PHONE], [SSN], [CARD]) even tell you what kind of entity was there, which is often enough to debug without the value.
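As a rough illustration, here is a minimal sketch of in-place, typed-placeholder redaction. The regexes and the `PATTERNS` table are assumptions made for this example, not the detector any particular product ships; real detection usually adds checksum validation (for example Luhn for cards) and more entity types.

```python
import re

# Hypothetical detector table: (placeholder type, pattern).
# These patterns are deliberately simple and will miss edge cases.
PATTERNS = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    # 13 to 19 digits, optionally separated by single spaces or hyphens.
    ("CARD", re.compile(r"\b\d(?:[ -]?\d){12,18}\b")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
]


def redact(text: str) -> str:
    """Replace each detected entity with a typed placeholder, leaving the rest intact."""
    for kind, pattern in PATTERNS:
        text = pattern.sub(f"[{kind}]", text)
    return text


prompt = "Email john.doe@acme.com about invoice 4471, card 4111 1111 1111 1111"
print(redact(prompt))
# Email [EMAIL] about invoice 4471, card [CARD]
```

The short invoice number survives because it is too short to match the card pattern, which is exactly the property that keeps the log row debuggable.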
Redaction and guardrails share an engine
The same PII detection that powers request guardrails powers log redaction — one detector, two consumers. The guardrail redacts before the prompt reaches the provider; the log policy redacts before it reaches your log store. Same values, two exits, both covered.
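One way to picture "one detector, two consumers" is a request handler that calls the same function at both exits. This is a sketch only: `detect_and_redact`, `handle_request`, `call_provider`, and `write_log` are hypothetical names, and the detector is email-only to keep it short.

```python
import re
from typing import Callable

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def detect_and_redact(text: str) -> str:
    """The single shared detector (hypothetical, email-only for brevity)."""
    return EMAIL.sub("[EMAIL]", text)


def handle_request(
    prompt: str,
    call_provider: Callable[[str], str],
    write_log: Callable[[dict], None],
) -> str:
    # Guardrail exit: redact before the prompt reaches the provider.
    response = call_provider(detect_and_redact(prompt))
    # Log exit: redact before the row reaches the log store.
    write_log({"prompt": detect_and_redact(prompt)})
    return response
```

Calling the detector independently at each exit means the log row stays redacted even if the guardrail path is later changed or bypassed.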
The metadata leak everyone forgets
You can redact the prompt perfectly and still leak PII through the side door: request metadata. Headers and metadata blobs routinely carry the requester's IP address, which is personal data and often regulated. A system that scrubs the prompt body but ships the IP in metadata.requester_ip_address has redacted the obvious field and leaked the same class of data one key over.
The fix is to treat metadata as content for redaction purposes: strip requester IP (and similar identifying headers) for non-admin readers, in the same write-time pass. Redaction is only complete when every copy of the sensitive value is handled — body, headers, and the metadata sidecar.
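A minimal sketch of that metadata pass might look like the following. The key list is an assumption for illustration (only `requester_ip_address` comes from the example above), and the sketch drops the keys unconditionally rather than modeling an admin/non-admin distinction.

```python
# Hypothetical set of identifying keys, compared case-insensitively.
SENSITIVE_KEYS = {"requester_ip_address", "x-forwarded-for", "x-real-ip"}


def scrub_metadata(value):
    """Return a copy of value with identifying keys dropped at any nesting depth."""
    if isinstance(value, dict):
        return {
            key: scrub_metadata(inner)
            for key, inner in value.items()
            if str(key).lower() not in SENSITIVE_KEYS
        }
    if isinstance(value, list):
        return [scrub_metadata(item) for item in value]
    return value


row = {
    "model": "example-model",
    "metadata": {
        "requester_ip_address": "203.0.113.7",
        "headers": {"X-Forwarded-For": "203.0.113.7", "content-type": "application/json"},
    },
}
print(scrub_metadata(row))
# {'model': 'example-model', 'metadata': {'headers': {'content-type': 'application/json'}}}
```

Walking the structure recursively matters because the same IP often appears twice, once as a top-level metadata key and again inside a forwarded header.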
Don't re-leak in the event log
Redaction creates a second temptation: logging what you caught. "Redacted email john.doe@acme.com" in an audit line has just re-stored the exact thing you redacted. Record the event — which detector fired, what type, which scope, the action taken — never the raw matched value. You want to answer "how often did the card detector fire on this key" without any row containing a real card number.
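To make that concrete, here is a sketch of an event record that has no field capable of holding the matched value. `RedactionEvent`, `redact_with_event`, and the `count` field are hypothetical names chosen for this example, and the card regex is a simplification.

```python
import re
from dataclasses import asdict, dataclass
from typing import Optional, Tuple

CARD = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


@dataclass(frozen=True)
class RedactionEvent:
    detector: str     # which detector fired
    entity_type: str  # what type of entity it found
    scope: str        # where it fired, e.g. "prompt" or "metadata"
    action: str       # what was done about it
    count: int        # how many matches (an assumption; never the matches themselves)


def redact_with_event(text: str, scope: str) -> Tuple[str, Optional[RedactionEvent]]:
    """Redact card numbers and describe the event without keeping the raw value."""
    redacted, hits = CARD.subn("[CARD]", text)
    if hits == 0:
        return redacted, None
    return redacted, RedactionEvent("card_regex", "CARD", scope, "redacted", hits)


redacted, event = redact_with_event("card 4111 1111 1111 1111", scope="prompt")
print(redacted)       # card [CARD]
print(asdict(event))
```

Because the event type has no free-text field, a later code change cannot quietly start writing the matched substring into the audit trail.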
What you keep, what you drop
| Kept (useful, safe) | Dropped (sensitive) |
|---|---|
| Prompt structure + typed placeholders | Raw emails, phones, SSNs, cards |
| Model, tokens, cost, latency | Requester IP (non-admin) |
| Status codes, error types | Identifying headers |
| Which detector fired (event) | The matched value itself |
That's a log you can debug with and still hand to a privacy review without flinching.
The takeaway
Redaction isn't the enemy of debuggability — value-destroying redaction is. Redact at write time so nothing sensitive persists, replace entities in place with typed placeholders so structure survives, scrub the metadata side door as carefully as the body, and never re-leak the matched value into the event log. You end up with request logs that are genuinely useful and genuinely safe — not one at the expense of the other. The data-policy docs show how to set the level per org.