Reverse-chronological feed of what shipped in sverklo. Releases, blog posts, bench refreshes, methodology fixes, and the negative results we publish alongside the wins. Updated weekly.
2026-05-09
fix
v0.20.6
Audit JSON gets structured fields (format 1.0.0)
sverklo audit --format json now emits structured grade, numeric_score, and dimensions: [{name, grade, score, detail}] fields directly. The 0.4.0 schema only had a markdown content blob; consumers had to parse the headline and table to extract grades. Discovered while writing the dogfood workflow, when the published GitHub Action's PR-comment builder was posting "Overall: ?" because the structured fields were missing. The content field stays for backwards compatibility.
commit 65806c0·bin/sverklo.ts
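A minimal sketch of what a consumer of the 1.0.0 format can now do. The field names (grade, numeric_score, dimensions, content) come from the entry above; the grade values, scores, and dimension names in the sample payload are invented for illustration.

```python
import json

# Hypothetical audit payload illustrating the 1.0.0 shape described above;
# real output comes from `sverklo audit --format json`. Values are made up.
payload = json.loads("""
{
  "grade": "B+",
  "numeric_score": 87,
  "dimensions": [
    {"name": "docs",  "grade": "A", "score": 94, "detail": "README covers install + usage"},
    {"name": "tests", "grade": "C", "score": 71, "detail": "no coverage gate"}
  ],
  "content": "## Overall: B+ ..."
}
""")

# Grades are now direct field reads -- no markdown parsing of `content`.
overall = payload["grade"]
per_dimension = {d["name"]: d["grade"] for d in payload["dimensions"]}
print(overall, per_dimension)
```

Under the old 0.4.0 schema, the equivalent consumer had to regex the headline and table out of the markdown blob.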
2026-05-09
distrib
Sverklo Audit GitHub Action published to Marketplace
Drop one line into your workflow and get a graded PR comment without uploading code: - uses: sverklo/sverklo@main. Local-first by design — the audit runs on your own GitHub Actions runner; no SaaS round-trip. Listing went live this afternoon.
marketplace listing·action.yml
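A minimal workflow wiring this up might look like the following sketch. Only the uses: sverklo/sverklo@main line comes from the listing; the workflow name, trigger, checkout step, and permissions block are assumptions (a comment-posting Action typically needs pull-requests: write).

```yaml
# Illustrative workflow; only the final `uses:` line is from the listing above.
name: sverklo-audit
on: [pull_request]
permissions:
  pull-requests: write   # assumed: required for the graded PR comment
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: sverklo/sverklo@main
```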
2026-05-09
ship
Self-audit dogfood workflow on every PR
New .github/workflows/audit-self.yml runs sverklo's own audit against the local PR build (not the published npm version) and posts a sticky comment with the grade. The marketplace Action installs from npm, so PRs that change audit logic would grade themselves with stale code; the dogfood workflow uses the freshly-built binary. First PR opened against main gets the first comment.
commit 08405e4
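The shape of the dogfood idea, as a sketch rather than the real audit-self.yml: build the PR's own code, then run that binary instead of the npm release. The build and run commands below are assumptions, not the actual workflow contents.

```yaml
# Sketch of the dogfood workflow: grade the PR with the freshly-built binary,
# not the published npm version. Build/run commands are illustrative.
name: audit-self
on: [pull_request]
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run build            # assumed build step
      # assumed invocation: the point is running the local build, not npm
      - run: node dist/sverklo.js audit --format json > audit.json
```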
2026-05-08
post
Q2 2026's MCP discourse landed on tool-list bloat. Cloudflare cut a 1.17M-token spec to ~1K with Code Mode; Anthropic shipped Tool Search lazy-load. Sverklo's SVERKLO_PROFILE env var has shipped the same idea for months; this is the first time we measured it publicly: 8,016 → 1,522 tokens, an 81% reduction with one env var. Per-profile breakdown table.
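In an MCP client config, the env var from the post slots into the standard per-server env block. Only SVERKLO_PROFILE itself comes from the post; the command, args, and the profile value "lean" are placeholders, not documented names.

```json
{
  "mcpServers": {
    "sverklo": {
      "command": "npx",
      "args": ["sverklo"],
      "env": { "SVERKLO_PROFILE": "lean" }
    }
  }
}
```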
2026-05-08
ship
Hub schemas on /vs/ and /blog/ index
Added ItemList JSON-LD enumerating all 13 comparison pages and all 14 blog posts. Pages were SEO orphans — child pages had no canonical hub feeding internal PageRank. Compounding lift expected across every /vs/* and /blog/* URL.
vs hub·blog hub
2026-05-08
ship
Drop-in subagent for Claude Code users
agents/sverklo-explore.md ships as a curl-installable replacement for Claude Code's built-in Explore subagent. The default uses a Read + Grep cascade (~14,200 tokens to find one function); this version uses sverklo's typed MCP tools (~150-800 tokens, single tool call). Tools-per-task on the bench: sverklo 1.0 vs naive grep 6.1.
subagent definition·agents/ directory
2026-05-07
bench
5 baselines (sverklo, smart-grep, jcodemunch-mcp, naive-grep, gitnexus) compared on 120 hand-verified retrieval tasks across 4 OSS codebases (express, lodash, requests, sverklo). Sverklo is the overall leader at F1=0.60; jcodemunch wins P1 def-lookup outright at 0.78, and we publish that loss visibly. The methodology repo at sverklo-bench accepts new baseline submissions via PR.
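For readers new to the metric, a bench F1 like 0.60 reduces to set-based precision and recall per task. The sketch below macro-averages over tasks; the averaging scheme is an assumption here, and sverklo-bench documents the real methodology.

```python
# Minimal sketch: set-based F1 per retrieval task, macro-averaged.
# The averaging scheme is an assumption; see sverklo-bench for the real one.
def f1(retrieved: set, relevant: set) -> float:
    if not retrieved or not relevant:
        return 0.0
    tp = len(retrieved & relevant)
    if tp == 0:
        return 0.0
    precision = tp / len(retrieved)
    recall = tp / len(relevant)
    return 2 * precision * recall / (precision + recall)

tasks = [
    ({"a.ts", "b.ts"}, {"a.ts"}),   # one false positive: P=0.5, R=1.0
    ({"c.ts"}, {"c.ts", "d.ts"}),   # one miss: P=1.0, R=0.5
]
macro_f1 = sum(f1(r, g) for r, g in tasks) / len(tasks)
print(round(macro_f1, 2))  # 0.67
```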
2026-05-07
fix
v0.20.3
Cascade bug — dependency-graph data integrity (sv-p4-04)
FileStore.upsert was using INSERT OR REPLACE, which on conflict deleted the row before re-inserting — triggering ON DELETE CASCADE on every dependency edge involving that file (both as source and target). buildGraph only restored outgoing edges, so incoming edges from cached source files were silently lost on every modification. Fix: INSERT … ON CONFLICT(path) DO UPDATE. Migration v8→v10 repairs corrupted DBs. Bench impact: sverklo P4 0.51 → 0.72.
issue thread·commit b3458c5
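The failure mode reproduces in plain SQLite. The sketch below uses an illustrative two-table schema, not sverklo's actual one: with foreign keys on, INSERT OR REPLACE resolves the conflict by deleting the old row, which fires ON DELETE CASCADE on every edge touching it, while the ON CONFLICT ... DO UPDATE upsert updates in place and leaves the edges intact.

```python
# Reproduces the cascade bug described above. Table and column names
# are illustrative, not sverklo's real schema.
import sqlite3

def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")
    db.execute("CREATE TABLE files (path TEXT PRIMARY KEY, mtime INTEGER)")
    db.execute("""CREATE TABLE deps (
        src TEXT REFERENCES files(path) ON DELETE CASCADE,
        dst TEXT REFERENCES files(path) ON DELETE CASCADE)""")
    db.execute("INSERT INTO files VALUES ('a.ts', 1), ('b.ts', 1)")
    db.execute("INSERT INTO deps VALUES ('a.ts', 'b.ts')")  # incoming edge for b.ts
    return db

# Buggy path: on conflict, REPLACE deletes the old row first, so
# ON DELETE CASCADE silently drops every edge touching b.ts.
buggy = make_db()
buggy.execute("INSERT OR REPLACE INTO files VALUES ('b.ts', 2)")
print(buggy.execute("SELECT COUNT(*) FROM deps").fetchone()[0])  # 0

# Fixed path: a true upsert updates in place; the edge survives.
fixed = make_db()
fixed.execute(
    "INSERT INTO files VALUES ('b.ts', 2) "
    "ON CONFLICT(path) DO UPDATE SET mtime = excluded.mtime")
print(fixed.execute("SELECT COUNT(*) FROM deps").fetchone()[0])  # 1
```

Note that the cascade only fires when PRAGMA foreign_keys is on, which is why the loss was silent rather than an error.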
2026-05-05
post
Negative-result writeup. Wired a poor-man's late-interaction reranker into sverklo_lookup and sverklo_refs, then ran the A/B against the bench three times deterministically; F1 dropped from 0.5847 to 0.5551 (-3pp overall; -7.5pp on P1). Diagnosis: SQL match-quality (exact > prefix > substring) is already optimal for symbol-name queries; semantic alignment dilutes the signal instead of sharpening it. Promotion gate published for the next ColBERT v2 attempt.
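The match-quality ordering the diagnosis credits can be sketched as a tiered key: exact matches first, then prefix, then substring. The function name, tier values, and candidate symbols below are illustrative, not sverklo's implementation.

```python
# Sketch of the exact > prefix > substring ordering from the diagnosis above.
# Names and tier values are illustrative only.
def match_tier(query: str, symbol: str) -> int:
    if symbol == query:
        return 0          # exact symbol match ranks first
    if symbol.startswith(query):
        return 1          # then prefix matches
    if query in symbol:
        return 2          # then substring matches
    return 3              # non-matches last

candidates = ["buildGraphNode", "buildGraph", "rebuildGraph", "parse"]
ranked = sorted(candidates, key=lambda s: match_tier("buildGraph", s))
print(ranked)  # ['buildGraph', 'buildGraphNode', 'rebuildGraph', 'parse']
```

For symbol-name queries this discrete ordering is already near-lossless, which is why blending in a continuous semantic score pushed F1 down rather than up.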
2026-05-03
post
Instrumented a week of Claude Code sessions across 312 tasks. Grep accounts for 41% of input-token spend. Sessions with grep results over 8K tokens hallucinate 31% of the time vs 4% under 2K. The fix is hybrid retrieval exposed as MCP tools, with a measured ~60% token reduction.
2026-05-03
post
The honest "best of" landscape doc. 12 MCP servers compared on license, hosting, language coverage, tool count, and retrieval substrate. Includes sverklo's own gaps. Updated to current bench numbers.