Karpathy's LLM Wiki Is a Digital Librarian. What Axia Runs Is a Control Plane.

A stateful knowledge base that an AI maintains over time is the right idea. But when documentation governs autonomous execution rather than human recall, staleness stops being a retrieval problem and becomes a control problem. Here's the distinction that matters in production.

A concept called LLM Wiki crossed my feed last night. Andrej Karpathy’s framing: instead of RAG retrieving document snippets on demand, an AI agent maintains a living folder of Markdown files — synthesising new information, updating existing pages, gardening the knowledge base over time. Stateful, not stateless. The library reorganises itself.

It is the right idea.

It also describes a version of the problem I have been living inside for a year, framed in a way that misses the part of it that actually matters in production.

The difference between a digital librarian and a control plane

The Karpathy model is elegant. An AI that does not just retrieve knowledge but maintains it — noticing when Document A contradicts Document B, updating cross-references when new information arrives, keeping the whole thing coherent without manual filing.

In that model, if a wiki page goes stale, you get a slightly wrong answer.

In Axia, if a documentation rule goes stale, the build agent follows the wrong instruction on the next autonomous run — and you find out three weeks later when something silently no-ops in production.

That is not a retrieval problem. It is a control problem.

CLAUDE.md is not a reference document. It is the instruction contract the build agent reads before every session. Architecture decision records are not notes. They are the rationale that governs every subsequent decision in the same domain. The module graph is not a map. It is the dependency check that determines whether a build is safe to ship.

When documentation drifts in a system like this, it does not just confuse humans. It misdirects autonomous execution.

[Image: side-by-side comparison of two AI documentation maintenance approaches]
Digital librarian vs control plane — two ways to maintain AI documentation.

What the LLM Wiki framing gets right

The stateful insight is correct, and it is the right reframe of where retrieval-augmented generation has been stuck. RAG treats documentation as a passive corpus the model dips into. LLM Wiki treats it as a living artefact the agent maintains. That shift in posture — from snapshot retrieval to ongoing curation — is the move the field needed.

For human-facing knowledge bases, this is unambiguously an upgrade. A wiki that updates as the system changes is fundamentally more useful than a wiki that snapshots a moment and then ages out.

The reason it is not enough for the systems Axia actually runs sits upstream of the framing itself.

Where the framing stops short

The LLM Wiki concept does not address execution risk. When the knowledge base governs autonomous agents, the documentation is not for human recall. It is the substrate the agent reasons from. Every rule that drifts is a quiet instruction the agent will follow until somebody catches it.

The synthesis step a digital-librarian wiki does well — Document A contradicts Document B; flag for review — solves the wrong half of the problem. The harder half is upstream: the documentation says we deploy through this trigger, but production deploys through a different one. The drift is between the documentation and the system, not between two documents.

Catching that requires verification scripts that check whether documented state matches deployed state — after every branch operation, before any autonomous build session, on a cadence that matches the system’s own change rate. The wiki cannot self-detect this kind of drift because it cannot see the production behaviour. It can only see what other documents say.
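A check of that kind can be small. The sketch below is illustrative only: the file path, the `deploy-trigger:` convention, and the stubbed production lookup are assumptions for the example, not Axia's real interfaces. The shape is what matters: read what the documentation claims, read what the system actually does, and report any gap.

```python
"""Minimal sketch of a doc-vs-production drift check.

All names here are hypothetical. In a real system, fetch_deployed_trigger
would query the CI/CD provider's API rather than return a constant.
"""
import re
from pathlib import Path


def documented_trigger(doc_path: str):
    """Extract the deploy trigger the documentation claims (assumed convention)."""
    text = Path(doc_path).read_text()
    match = re.search(r"deploy-trigger:\s*(\S+)", text)
    return match.group(1) if match else None


def fetch_deployed_trigger() -> str:
    """Stub standing in for a query against the live deployment system."""
    return "on-merge-to-main"


def check_drift(doc_path: str) -> list:
    """Return human-readable drift findings; an empty list means docs match reality."""
    findings = []
    documented = documented_trigger(doc_path)
    deployed = fetch_deployed_trigger()
    if documented is None:
        findings.append(f"{doc_path}: no deploy trigger documented")
    elif documented != deployed:
        findings.append(
            f"drift: docs say {documented!r}, production uses {deployed!r}"
        )
    return findings
```

Run on the cadence the article describes: after every branch operation and before any autonomous build session, so stale instructions are caught before an agent reads them.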

In other words: a digital librarian keeps the catalogue tidy. A control plane verifies the catalogue against reality.

What Axia is, named directly

Axia’s documentation layer is closer to a living constitution than a wiki. Rules that govern execution, not just reference. Verification scripts that compare documented intent to production behaviour, not just documents to other documents. Architectural decision records that act as binding precedent for future decisions in the same domain, not searchable historical notes.

The discipline that holds it together is structural, not editorial. Drift gets caught by automated comparison, not by an agent’s heuristic for “this looks inconsistent.” Deferred cleanup gets named as its own scoped commit, not folded into the next feature. Retired vocabulary gets tombstoned with explicit guidance on what replaced it, so the next verification pass does not re-discover what this one already learned.

None of this is novel engineering doctrine. Stripe’s migration guidance describes a three-phase pattern: prepare, migrate, clean up. Airbnb treats cleanup as its own first-class commit. Webflow ran the same migration tool twice on the same codebase, a year apart, because the second pass caught what the first pass left behind. Mature engineering teams do not expect migrations to ship complete. They plan the cleanup phase explicitly.

What is novel — or at least underexplored — is what happens when the documentation those teams maintain is not just a reference for engineers but the contract an autonomous agent reads before every action it takes.

The forward look — autonomous knowledge gardening

The next layer Axia is building is the synthesis step that does not exist yet in either RAG or LLM Wiki framings. It looks something like this:

The system notices when production behaviour diverges from documented intent — without being asked. It surfaces the divergence to the operator with proposed reconciliations. It generates the cleanup commit, scoped and named. It tombstones the retired language. It updates the cross-references downstream. The operator approves, edits, or overrides; the system handles the execution.
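The loop above can be sketched as a single pass with the operator gate sitting between proposal and execution. Everything here is a stub: the detection, proposal, and execution steps are placeholders for the real machinery, and the function names are assumptions for illustration.

```python
"""Sketch of the detect -> propose -> approve -> execute gardening loop.

The callables are injected so the gate is explicit: nothing executes
without passing the operator's approval function.
"""
from typing import Callable, List


def gardening_pass(
    detect: Callable[[], List[str]],      # returns divergence findings
    propose: Callable[[str], str],        # drafts a reconciliation per finding
    approve: Callable[[str], bool],       # operator gate: approve or reject
    execute: Callable[[str], None],       # applies an approved reconciliation
) -> dict:
    """Run one pass; return what was applied and what the operator held back."""
    applied, held = [], []
    for finding in detect():
        proposal = propose(finding)
        if approve(proposal):
            execute(proposal)
            applied.append(proposal)
        else:
            held.append(proposal)
    return {"applied": applied, "held": held}
```

The design choice worth noting is that approval is per-proposal, not per-pass: the operator can accept one reconciliation and hold another from the same run.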

The right name for this is autonomous knowledge gardening. Not retrieval. Not curation. Not even self-updating documentation. The system maintaining its own contract with itself, with the operator holding the approval gate at every commercial decision.

The infrastructure for it is already in place. The verification scripts exist. The vocabulary enumeration patterns exist. The tombstone discipline exists. What is left is the unprompted-divergence-detection layer — the part where the system notices, rather than waits to be asked.

That is on the roadmap.

What this means for anyone running production AI agents

If you are operating AI agents at any scale — autonomous outreach, classification, drafting, scheduling, anything where the agent acts without per-step human approval — the question worth asking is not "how good is my prompt library?" It is "what verifies that the instructions my agents are following match the system they are operating in?"

That is the question Axia was built to answer at SME scale. The verification layer that catches the drift between what the documentation says and what the production system does, the operator approval gate that sits above every commercial decision, the discipline of treating documented rules as binding rather than aspirational. The control plane, not the digital librarian.

If you want the build conversation about how this applies to your specific operations: Axia is the production layer that runs this discipline at the scale SMEs actually need.


Alan Law is founder of V8 Global and architect of Axia. Leadership Insight posts cover the strategic infrastructure SME owners need to operate AI-native businesses without becoming the production layer themselves.
