A Tamper-Evident Audit Log for AI Usage, in Python

Once you put an AI system into real use, someone eventually asks the awkward question. What did it actually do? Who asked it what, when, which model answered, and what did it say? When I wrote about what the EU AI Act means for risk and compliance teams, record-keeping was one of the recurring themes, and it’s the same story for a SOC 2 audit. You’re expected to have a trail.

Most teams do log their AI calls. The gap is subtler than “we have no logs.” It’s that a plain log file can be quietly edited. If someone can open the file and change a line, delete an embarrassing request, or soften a risk rating after the fact, then the log proves nothing. An auditor’s real question isn’t “do you have a log,” it’s “how do you know the log is telling the truth.”

You can close that gap with a small amount of Python and an idea borrowed from how blockchains stay honest. You chain each record to the one before it with a hash, so any later edit breaks the chain in a way you can detect.

The idea in one paragraph

Every entry in the log carries a fingerprint, a SHA-256 hash, of its own contents plus the fingerprint of the entry before it. Because each record’s hash depends on the previous one, the records are linked in a chain. Change anything in an old record and its hash no longer matches, and because the next record was built on that hash, everything after it breaks too. You can’t quietly edit the middle of the log without the damage showing.

This doesn’t stop someone editing the file. Nothing on the same disk can. What it does is make the edit obvious, which for an audit trail is the property that actually matters.

The script

Here’s the whole thing. Save it as ailog.py.

#!/usr/bin/env python3
"""ailog - a tamper-evident, append-only audit log for AI usage."""
import hashlib
import json
import time

GENESIS = "0" * 64


def _hash(prev_hash: str, payload: dict) -> str:
    body = prev_hash + json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class AuditLog:
    def __init__(self, path):
        self.path = path

    def _last_hash(self):
        try:
            with open(self.path) as fh:
                lines = [ln for ln in fh if ln.strip()]
            return json.loads(lines[-1])["hash"] if lines else GENESIS
        except FileNotFoundError:
            return GENESIS

    def append(self, user, model, prompt_redacted, response, risk):
        payload = {
            "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
            "user": user,
            "model": model,
            "prompt_redacted": prompt_redacted,
            "response_sha256": hashlib.sha256(response.encode("utf-8")).hexdigest(),
            "risk": risk,
        }
        prev = self._last_hash()
        record = {**payload, "prev": prev, "hash": _hash(prev, payload)}
        with open(self.path, "a") as fh:
            fh.write(json.dumps(record) + "\n")
        return record

    def verify(self):
        prev = GENESIS
        with open(self.path) as fh:
            for i, line in enumerate(fh):
                if not line.strip():
                    continue
                record = json.loads(line)
                payload = {k: record[k] for k in record if k not in ("prev", "hash")}
                if record["prev"] != prev or _hash(prev, payload) != record["hash"]:
                    return False, i
                prev = record["hash"]
        return True, None

A couple of details worth pointing out. The prompt is stored already redacted, which pairs neatly with the redaction gate I wrote about last time, so the audit trail itself never becomes a second copy of your customers’ personal data. And the response is stored as a hash rather than in full. That’s deliberate. You can prove a given answer is exactly the one that was logged by hashing it and comparing, but you’re not keeping every word the model ever said sitting in a log file forever.

Watching it catch a tampered record

The demo writes three honest entries, checks the chain, then edits a past record the way a person covering their tracks might, and checks again.

After 3 honest writes:  verify -> True
After tampering line 2: verify -> False  (breaks at record index 1)

The tamper in this case was changing a blocked, high-risk request to look like a harmless low-risk one, exactly the kind of after-the-fact edit you’d want to be impossible to hide. The first check passes cleanly. Then one character changes in the middle record, and verify doesn’t just say “something’s wrong,” it points at record index 1, the precise entry that was altered. That specificity is the difference between a vague worry and an actual finding.

Where this fits in a real setup

On its own this is a file on a disk, and a determined insider with write access could recompute the whole chain from the point they edited. The hash chain makes casual tampering visible, and you harden it from there with things you’d want anyway.

Write the log somewhere append-only, such as object storage with versioning or a write-once bucket, so old versions can’t simply be overwritten. Periodically take the latest hash, the tip of the chain, and copy it somewhere separate, even a second system or a printout, because once that fingerprint lives outside the file, nobody can rewrite history without contradicting the copy you already kept. Neither is much work, and together they turn “hard to edit unseen” into “practically impossible to edit unseen.”

Why an auditor cares

The reason this is worth the effort is that it changes what you can claim. “We log our AI usage” invites the follow-up “and how do I know nobody changed it.” With a chained log the answer is a command. You run verify, it walks the whole history, and it either confirms the record is intact or tells you exactly where it isn’t. For EU AI Act record-keeping, or SOC 2, that’s the difference between asserting integrity and demonstrating it.

Conclusion

An audit log is only as good as your confidence that it hasn’t been touched. Chaining each entry to the last, with a hash any tampering will break, gives you that confidence for a page of Python and no new infrastructure. Store the tip of the chain somewhere safe, keep the file append-only, and you’ve got a record you can actually stand behind when someone asks.

If you end up adapting this, I’d like to hear where you chose to anchor the chain tip. That’s the part where everyone’s setup differs, and it’s the part that makes it real.