Bi-temporal memory for AI coding agents — the 1990s database pattern that fixes "my agent forgot what we decided yesterday"
Most "memory" features for AI coding agents are flat key-value stores: when you update a memory, the old value is gone. That's the wrong abstraction for a codebase, where the team's beliefs change and the question "what did we think about auth at commit abc123?" is a real one. Sverklo borrows a 30-year-old database pattern — bi-temporal memory — and pins it to git SHAs. Here's how it works, why it matters, and the SQLite schema.
The failure mode
Last month I was debugging an auth flow with Claude Code. We made a deliberate decision: JWT verification happens in the middleware, not the route handler. The reason was specific — we wanted token validation to short-circuit before the route's body parser ran, which mattered for a CSRF protection we'd added two weeks earlier.
Three days later, in a fresh session, Claude proposed moving JWT verification into the route handler. It hadn't seen the previous decision. Context was compacted. The agent had no way to retrieve "we decided X for reason Y."
The conventional fix is a CLAUDE.md file with project invariants. Works for stable rules ("never use any in TypeScript") but not for evolving design decisions. The CLAUDE.md would either:
- Become a wall of every decision the team ever made (unreadable)
- Get manually pruned when decisions changed (loses history)
- Get auto-pruned by some rule (which wins on conflict?)
None of these answer "what did we believe at this specific commit?" — and that's the question that comes up when you're triaging a bug introduced in some past version.
Bi-temporal databases — the 1990s pattern
This problem isn't new. Database researchers solved a closely related version of it 30 years ago for transactional systems where you needed both "what is true now" and "what did we think was true on Date X." The textbook is Richard Snodgrass's Developing Time-Oriented Database Applications in SQL (1999), which covers bi-temporal modelling and is still readable and still applicable.
The core idea: every row in a memory table has two time dimensions:
- Valid time — when the fact was true in the world
- Transaction time — when we recorded it
Updating a fact doesn't overwrite — it inserts a new row with new valid-time bounds, and sets valid_until on the previous row. Queries can ask "current truth" (where valid_until IS NULL) or "what we believed at time T" (where valid_from <= T < COALESCE(valid_until, infinity)).
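A minimal sketch of that insert-instead-of-overwrite rule, using wall-clock integers before we swap in git SHAs. The facts table, its key column, and the helper names update_fact, current, and as_of are all assumptions made up for illustration; they are not sverklo's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE facts (
        id          INTEGER PRIMARY KEY,
        key         TEXT NOT NULL,
        value       TEXT NOT NULL,
        valid_from  INTEGER NOT NULL,  -- valid time: when the fact became true
        valid_until INTEGER,           -- NULL = still true
        recorded_at INTEGER NOT NULL   -- transaction time: when we wrote the row
    )
""")

def update_fact(key, value, now):
    """Close the open row for `key` (if any), then insert the new version."""
    with conn:  # single transaction
        conn.execute(
            "UPDATE facts SET valid_until = ? WHERE key = ? AND valid_until IS NULL",
            (now, key),
        )
        conn.execute(
            "INSERT INTO facts (key, value, valid_from, recorded_at) "
            "VALUES (?, ?, ?, ?)",
            (key, value, now, now),
        )

def current(key):
    """Current truth: the row whose interval is still open."""
    row = conn.execute(
        "SELECT value FROM facts WHERE key = ? AND valid_until IS NULL", (key,)
    ).fetchone()
    return row[0] if row else None

def as_of(key, t):
    """Half-open interval: valid_from <= t < valid_until (NULL = open-ended)."""
    row = conn.execute(
        "SELECT value FROM facts WHERE key = ? AND valid_from <= ? "
        "AND (valid_until IS NULL OR ? < valid_until)",
        (key, t, t),
    ).fetchone()
    return row[0] if row else None

update_fact("auth.jwt", "verify in middleware", now=100)
update_fact("auth.jwt", "verify in route handler", now=200)
```

After the two calls the table holds both rows: current("auth.jwt") sees only the open one, while as_of("auth.jwt", 150) still finds the middleware decision.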
The git-SHA substitution
Wall-clock time is the wrong axis for a codebase. Code changes don't happen continuously — they happen at commits. The right "valid time" axis for an AI coding agent's memory is the git commit graph, not the wall clock.
So sverklo's memory schema replaces valid_from and valid_until with valid_from_sha and valid_until_sha:
CREATE TABLE memories (
  id INTEGER PRIMARY KEY,
  category TEXT NOT NULL,          -- decision, preference, pattern, ...
  content TEXT NOT NULL,
  tier TEXT NOT NULL,              -- core, project, archive
  kind TEXT,                       -- episodic, semantic, procedural

  -- Bi-temporal columns
  valid_from_sha TEXT NOT NULL,    -- commit at which this memory was created
  valid_until_sha TEXT,            -- commit at which this memory was superseded; NULL = current
  superseded_by INTEGER REFERENCES memories(id),  -- pointer to the new memory

  -- Provenance
  created_at INTEGER NOT NULL,
  last_accessed INTEGER NOT NULL,
  access_count INTEGER NOT NULL DEFAULT 0,
  confidence REAL NOT NULL DEFAULT 1.0,

  -- Retrieval
  embedding_id INTEGER REFERENCES memory_embeddings(rowid),
  pins TEXT,                       -- JSON array of file paths / symbols
  tags TEXT                        -- JSON array
);

CREATE INDEX idx_memories_current ON memories(valid_until_sha)
  WHERE valid_until_sha IS NULL;
The partial index on valid_until_sha IS NULL makes "current truth" queries fast — most queries only want active memories.
How updates work
When the agent calls sverklo_remember and the memory already exists (matched by category + similar embedding), sverklo doesn't update the row. It:
- Inserts a new row with the new content, valid_from_sha = HEAD
- Sets valid_until_sha = HEAD on the old row
- Sets superseded_by on the old row to point at the new row's id
This preserves the lineage. sverklo_recall queries naturally filter to valid_until_sha IS NULL by default — but the --timeline flag opens the supersession history.
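The three steps can be sketched against a trimmed-down copy of the schema. This is an illustration under assumptions, not sverklo's code: remember and recall_current are hypothetical helpers, and the supersedes_id parameter stands in for the category-plus-embedding match, which is out of scope here.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE memories (
        id              INTEGER PRIMARY KEY,
        category        TEXT NOT NULL,
        content         TEXT NOT NULL,
        valid_from_sha  TEXT NOT NULL,
        valid_until_sha TEXT,
        superseded_by   INTEGER REFERENCES memories(id),
        created_at      INTEGER NOT NULL
    )
""")

def remember(category, content, head_sha, supersedes_id=None):
    """Insert a memory; if it supersedes an older one, close that row.

    `supersedes_id` is hypothetical: the caller is assumed to have already
    matched the old memory (by category + embedding similarity).
    """
    with conn:  # one transaction: both rows change or neither does
        cur = conn.execute(
            "INSERT INTO memories (category, content, valid_from_sha, created_at) "
            "VALUES (?, ?, ?, ?)",
            (category, content, head_sha, int(time.time())),
        )
        new_id = cur.lastrowid
        if supersedes_id is not None:
            # The old row keeps its content; only its validity bounds change.
            conn.execute(
                "UPDATE memories SET valid_until_sha = ?, superseded_by = ? "
                "WHERE id = ? AND valid_until_sha IS NULL",
                (head_sha, new_id, supersedes_id),
            )
    return new_id

def recall_current(category):
    """Default recall: only rows that have not been superseded."""
    rows = conn.execute(
        "SELECT content FROM memories "
        "WHERE category = ? AND valid_until_sha IS NULL ORDER BY id",
        (category,),
    )
    return [r[0] for r in rows]

old_id = remember("decision", "JWT in middleware", "abc123")
new_id = remember("decision", "JWT in route handler", "def456", supersedes_id=old_id)
```

Wrapping the insert and the update in one transaction matters: a crash between them would otherwise leave two "current" memories for the same decision.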
The query that justifies the design
The reason this matters is the query "what did we believe at commit abc123?" Conventional flat-overwrite memory can't answer this. Bi-temporal memory can:
-- Memories that were active at commit abc123:
-- created at abc123 or one of its ancestors, AND not superseded
-- at abc123 or any of its ancestors.
-- Assumes commit_graph is a closure table with one row per
-- (sha, ancestor_sha) pair, including the self-pair at depth 0,
-- precomputed from `git rev-list abc123`.
WITH commit_ancestry AS (
  SELECT ancestor_sha AS sha, depth
  FROM commit_graph
  WHERE sha = 'abc123'
)
SELECT m.*
FROM memories m
WHERE m.valid_from_sha IN (SELECT sha FROM commit_ancestry)
  AND (m.valid_until_sha IS NULL
       OR m.valid_until_sha NOT IN (SELECT sha FROM commit_ancestry));
It's messier in SQL than in pseudocode because git's commit graph is a DAG, not a line. The trick: precompute commit ancestry once as a closure table, store as commit_graph, then the query is straightforward set-membership.
For the auth-flow example earlier: at commit abc123 (when the JWT-in-middleware decision was made), the relevant memory says "JWT verification in middleware, reason: CSRF protection ordering." At commit def456 (after we revisited and changed our minds), the relevant memory says something different. sverklo_recall --at abc123 returns the first; sverklo_recall alone returns the second.
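To see the set-membership logic on a history with a merge, here is a small runnable sketch. Everything in it is assumed for illustration: the commit names (a1, a2, a3, b1, m), the hand-written closure, the (sha, ancestor_sha, depth) layout of commit_graph, and the recall_at helper. The closure is typed in by hand so the block stays focused on the query.

```python
import sqlite3

# Hypothetical history: a1 <- a2 <- a3 <- m, plus a side branch a2 <- b1 <- m.
# Each commit maps to its ancestors (itself included) and their distance.
CLOSURE = {
    "a2": {"a2": 0, "a1": 1},
    "a3": {"a3": 0, "a2": 1, "a1": 2},
    "b1": {"b1": 0, "a2": 1, "a1": 2},
    "m":  {"m": 0, "a3": 1, "b1": 1, "a2": 2, "a1": 3},
}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE memories (
        id INTEGER PRIMARY KEY, content TEXT NOT NULL,
        valid_from_sha TEXT NOT NULL, valid_until_sha TEXT
    );
    CREATE TABLE commit_graph (
        sha TEXT NOT NULL, ancestor_sha TEXT NOT NULL, depth INTEGER NOT NULL,
        PRIMARY KEY (sha, ancestor_sha)
    );
""")
conn.executemany(
    "INSERT INTO commit_graph VALUES (?, ?, ?)",
    [(sha, anc, d) for sha, ancs in CLOSURE.items() for anc, d in ancs.items()],
)
conn.executemany(
    "INSERT INTO memories (content, valid_from_sha, valid_until_sha) VALUES (?, ?, ?)",
    [
        ("JWT in middleware", "a1", "a3"),        # superseded at a3
        ("JWT in route handler", "a3", None),     # current on the main line
        ("Rate limit per API key", "b1", None),   # decided on the side branch
    ],
)

def recall_at(sha):
    """Memories whose start is in sha's ancestry and whose end is not."""
    rows = conn.execute("""
        WITH ancestry AS (SELECT ancestor_sha FROM commit_graph WHERE sha = ?)
        SELECT content FROM memories
        WHERE valid_from_sha IN (SELECT ancestor_sha FROM ancestry)
          AND (valid_until_sha IS NULL
               OR valid_until_sha NOT IN (SELECT ancestor_sha FROM ancestry))
        ORDER BY id
    """, (sha,))
    return [r[0] for r in rows]
```

The side branch is the interesting case: recall_at("b1") still returns the middleware decision, because a3 (where it was superseded) is not an ancestor of b1. After the merge, recall_at("m") sees the supersession and the branch's own memory together.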
Why not just use git's own log?
Two reasons.
First, git's log is a record of code changes, not a record of beliefs about code. Many decisions are made without code changes ("we decided to migrate to Postgres next quarter, but we haven't started"). Git doesn't capture those. Bi-temporal memory does.
Second, git commit messages are unstructured prose. Searching them for "what did we decide about X?" is a needle-in-haystack problem that grep handles badly (which is what got us into this mess in the first place — see /bench/). Sverklo's memory layer is structured: each memory has a category, a confidence, a tier, embeddings for semantic search, and the lineage. Retrieval is hybrid (FTS5 + cosine over an ONNX embedding) and pinned to the git SHA where the question is being asked.
Lineage matters more than current truth
The pattern that's underrated in conventional memory systems: knowing why a previous decision was overturned is often more useful than knowing the current decision.
Concrete example: an engineer joins the team six months from now. They look at the auth code and see JWT verification in the route handler. Why? They run sverklo_recall --timeline auth and see:
2026-03-15 abc123 "JWT in middleware — CSRF ordering"
2026-04-22 def456 "JWT in route handler — middleware ran for SSE endpoints unnecessarily,
causing 12% latency overhead. CSRF ordering moved to a separate guard."
The current decision (JWT in route handler) makes sense only when paired with the previous decision and the reason for the change. Flat-overwrite memory deletes the why.
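One way to produce a timeline like that is to walk superseded_by backwards with a recursive CTE. This is a sketch under assumptions: the timeline helper and the two sample rows are hypothetical, and the real --timeline output may be assembled differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE memories (
        id INTEGER PRIMARY KEY, content TEXT NOT NULL,
        valid_from_sha TEXT NOT NULL, valid_until_sha TEXT,
        superseded_by INTEGER REFERENCES memories(id)
    );
    INSERT INTO memories VALUES
        (1, 'JWT in middleware: CSRF ordering', 'abc123', 'def456', 2),
        (2, 'JWT in route handler: SSE latency, CSRF moved to a guard',
            'def456', NULL, NULL);
""")

def timeline(current_id):
    """Return (sha, content) pairs, oldest first, ending at `current_id`."""
    rows = conn.execute("""
        WITH RECURSIVE chain(id, content, valid_from_sha, hops) AS (
            -- Start at the memory we were asked about...
            SELECT id, content, valid_from_sha, 0 FROM memories WHERE id = ?
            UNION ALL
            -- ...then step to whichever memory it superseded.
            SELECT m.id, m.content, m.valid_from_sha, chain.hops + 1
            FROM memories m JOIN chain ON m.superseded_by = chain.id
        )
        SELECT valid_from_sha, content FROM chain ORDER BY hops DESC
    """, (current_id,))
    return rows.fetchall()
```

Calling timeline(2) returns the abc123 entry followed by the def456 entry, so the reason for the change sits right next to the decision it replaced.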
Constraints and tradeoffs
The DAG problem
Git is a DAG, not a chain. Branches diverge, merges create commits with multiple parents, cherry-picks copy commits across branches. "What did we believe at commit X?" requires walking the ancestry of X, which is multiple ancestor paths if there were merges.
Sverklo handles this by precomputing the ancestry closure for any commit you query. That is typically a small operation (<10ms even on repos with 100k commits) because git itself stores the ancestry compactly. Each membership check is then an indexed lookup against the precomputed table, not a fresh O(commits) walk of the graph per query.
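A sketch of how such a closure could be built. It assumes the parent links come from `git rev-list --parents <sha>`, which prints each commit followed by its parents; the function names and the sample history are made up, and sverklo's actual precomputation may work differently.

```python
from collections import deque

def parse_rev_list_parents(output):
    """Parse `git rev-list --parents` output: 'commit parent1 parent2 ...' per line."""
    parents = {}
    for line in output.splitlines():
        fields = line.split()
        if fields:
            parents[fields[0]] = fields[1:]
    return parents

def ancestry_closure(head, parents):
    """Map every commit reachable from `head` (itself included) to its
    shortest distance from `head`, following all parents of merge commits."""
    depth = {head: 0}
    queue = deque([head])
    while queue:
        sha = queue.popleft()
        for parent in parents.get(sha, []):
            if parent not in depth:  # a commit reachable via two paths is visited once
                depth[parent] = depth[sha] + 1
                queue.append(parent)
    return depth

# Hypothetical output for a merge commit m with parents a3 and b1.
SAMPLE = """\
m a3 b1
a3 a2
b1 a2
a2 a1
a1
"""
closure = ancestry_closure("m", parse_rev_list_parents(SAMPLE))
```

The breadth-first walk is what keeps merges cheap: a2 is reachable through both a3 and b1, but it is recorded once, at its shortest distance.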
Storage cost
Bi-temporal storage is more expensive than flat storage by a factor of ~2-5×, depending on how often memories supersede. In practice, sverklo's per-project memory database is a few megabytes for active projects — the storage cost is irrelevant compared to the query power.
Confidence decay
Memories don't stay equally trustworthy as they age. A 6-month-old "we decided X" memory should be flagged for review when retrieved. Sverklo applies a decay function based on access patterns and commit-distance from HEAD; staleness shows up in the recall output as [STALE] for memories that haven't been accessed in 90+ days.
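The exact decay function isn't spelled out here, so the following is only a sketch of the shape it could take. The exponential form, the 500-commit half-life, and the names Memory, effective_confidence, is_stale, and render are all assumptions; only the 90-day threshold and the [STALE] tag come from the description above.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400
STALE_AFTER_DAYS = 90       # threshold from the text above
HALF_LIFE_COMMITS = 500     # hypothetical: confidence halves every 500 commits

@dataclass
class Memory:
    content: str
    confidence: float
    last_accessed: int        # Unix seconds
    commits_behind_head: int  # distance from valid_from_sha to HEAD

def effective_confidence(m):
    """Stored confidence, discounted by how far HEAD has moved since."""
    return m.confidence * 0.5 ** (m.commits_behind_head / HALF_LIFE_COMMITS)

def is_stale(m, now):
    """True when the memory has not been accessed for 90 days or more."""
    return (now - m.last_accessed) >= STALE_AFTER_DAYS * SECONDS_PER_DAY

def render(m, now):
    """Format one recall line, prefixing stale memories with [STALE]."""
    prefix = "[STALE] " if is_stale(m, now) else ""
    return f"{prefix}{m.content} (confidence {effective_confidence(m):.2f})"

now = 1_000 * SECONDS_PER_DAY
fresh = Memory("JWT in route handler", 1.0, now - 1 * SECONDS_PER_DAY, 0)
old = Memory("JWT in middleware", 1.0, now - 120 * SECONDS_PER_DAY, 500)
```

Keeping the stale flag separate from the confidence score means a memory can be old in commit terms yet recently confirmed by access, or the reverse.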
What this enables that you couldn't do before
- "Why did we change this?" Walk the supersession chain from current memory to its origin.
- "What did the team believe at commit X?" Single recall query parameterized by SHA.
- "Show me the team's evolving understanding of authentication." Filter by tag, return the timeline view.
- "What memories are stale and should be reviewed?" Filter by access pattern + commit distance.
- "This memory contradicts current code. Which is wrong?" Sverklo's stale-memory detector flags drift between memory content and the file's current state.
Borrowed from databases, applied to LLMs
Bi-temporal modelling isn't novel. The novelty is the application surface: AI coding agents have an ephemeral context window that compacts every few hours, and they need a memory layer that survives compaction and tracks belief change.
Most existing memory MCPs (the ones I've benchmarked, see comparison) are wrappers around an external vector DB. They treat memory as a flat searchable bag. The bi-temporal pattern adds a second axis (time), and pinning that axis to git SHAs (rather than wall-clock time) aligns with how engineering teams actually think about decisions.
It's an old idea applied to a new problem. The implementation lives in src/storage/memory-store.ts and src/memory/prune.ts — about 800 lines of TypeScript over SQLite.
Try it
Sverklo is MIT-licensed and runs on your laptop. Memory is local — no cloud, no external services, no API keys.
npm install -g sverklo
cd your-project
sverklo init
Memory tools available immediately: sverklo_remember, sverklo_recall, sverklo_memories, sverklo_pin, sverklo_promote, sverklo_demote. The agent calls them; you don't have to.
GitHub: sverklo/sverklo · 60-task retrieval benchmark · Comparisons
Cite this
@misc{sverklo_bench_primitives_2026,
title = {Sverklo bench:primitives — a 60-task retrieval evaluation for AI coding agents},
author = {Groshin, Nikita},
year = {2026},
doi = {10.5281/zenodo.19802051},
url = {https://sverklo.com/bench/}
}
References
- Snodgrass, R. T. (1999). Developing Time-Oriented Database Applications in SQL. Morgan Kaufmann.
- Sverklo memory implementation: src/storage/memory-store.ts
- Sverklo memory pruning: src/memory/prune.ts