
Sverklo vs jcodemunch-mcp

Both are MIT-licensed, local-first, MCP-shaped code-intelligence servers. Both appear on the same 180-task benchmark. Both shipped lodash P1 fixes inside 36 hours of the original bench publication. Different retrieval substrates, different wins. The honest version is below.

Side by side

| | Sverklo | jcodemunch-mcp |
| License | MIT | MIT |
| Runtime | Node 25, embedded SQLite, ONNX (~90 MB) | Python (uvx-installed) |
| Install | npm install -g sverklo | uvx jcodemunch-mcp |
| Retrieval substrate | Hybrid BM25 + ONNX embeddings + PageRank | Tree-sitter symbol indexing + call-graph fallback |
| MCP tools shipped | 37 (compact core profile by default) | (see jcodemunch docs) |
| Bench F1: overall | 0.58 | 0.29 |
| Bench F1: P1 (def lookup) | 0.63 | 0.52 |
| Bench F1: P2 (refs) | 0.27 | 0.01 |
| Bench F1: P4 (file deps) | 0.84 | 0.33 |
| Bench F1: P5 (dead code) | 0.83 | 0.34 |
| Avg input tokens / task | 652 | 1,907 |
| Avg tool calls / task | 1.0 | 1.2 |
| Memory (bi-temporal, git-pinned) | Yes | No |
| Diff-aware PR review | Yes | No |
| MCP clients supported | Claude Code, Cursor, Windsurf, Zed, VS Code, JetBrains | Claude Code, Cursor, Windsurf, Zed |

Bench data from sverklo.com/bench — May 13, 2026 run, 180 tasks across 6 OSS codebases. Reproducible: npm run bench:quick.

The bench-loop story

2026-04-28 → 2026-05-04 · what happened

The original April bench had jcodemunch-mcp at P1 0.65 (definition lookup leader) but P5 0.00 on Express — the CommonJS module.exports = X re-export chain wasn't modeled as a use site, so the only export of the entire module appeared to have no callers.

Within hours of the bench going live, jcodemunch-mcp's maintainer @jgravelle started shipping fixes: v1.80.7, 1.80.8, and 1.80.9 all landed inside 36 hours. The fixes covered CommonJS re-export modeling, a 500 KB per-file size cap (lodash.js is 548 KB), and a monolithic-IIFE call-graph fallback. P5 went 0.00 → 1.00; lodash P1 went 0/10 → 9/10.

Adding lodash to sverklo's bench then exposed the symmetric blind spot in sverklo's own parser. The regex brace counter miscounted braces inside string literals: line 6301 of lodash.js has '{\n/* [wrapped with ' inside a string, and the unbalanced { caused every function declaration after that line to be absorbed into one ~11 K-line chunk. Sverklo v0.20.2 ships the fix; P1 went 0.30 → 0.73, overall F1 went 0.45 → 0.56.
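The brace-counting bug is easy to reproduce in miniature. The sketch below contrasts a naive counter with one that skips string literals. It is an illustration under stated assumptions, not sverklo's actual fix: the function names are hypothetical, and the string-aware version deliberately ignores comments, regex literals, and `${}` interpolation inside template strings.

```javascript
// Naive counter: every brace counts, even one inside a string literal.
function naiveBraceDepth(src) {
  let depth = 0;
  for (const ch of src) {
    if (ch === "{") depth++;
    else if (ch === "}") depth--;
  }
  return depth;
}

// String-aware counter: skips quoted and backtick strings and honors backslash escapes.
// Simplification (assumption): comments, regex literals, and template interpolation are not handled.
function stringAwareBraceDepth(src) {
  let depth = 0;
  let quote = null; // active quote character, or null when outside a string
  for (let i = 0; i < src.length; i++) {
    const ch = src[i];
    if (quote) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === quote) quote = null;
    } else if (ch === "'" || ch === '"' || ch === "`") {
      quote = ch;
    } else if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      depth--;
    }
  }
  return depth;
}

// Sample modeled on the string quoted above from lodash.js; the wrapper function is made up.
const sample = "function f() { return '{\\n/* [wrapped with '; }";

console.log(naiveBraceDepth(sample));       // 1: the function never appears to close
console.log(stringAwareBraceDepth(sample)); // 0: balanced
```

A depth that never returns to zero is what lets one declaration swallow everything after it, since a chunker that closes a function only at depth zero keeps extending the current chunk.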

Both projects landed lodash P1 fixes inside 36 hours of the original bench publication. Different parsers, different bugs, same effect — and the public benchmark made each side's blind spot visible to the other in a way no internal eval would have. The full timeline lives on sverklo.com/bench.

When to use which

Choose Sverklo

  • You want the overall F1 leader on the published benchmark (0.58 vs 0.29) at lower input tokens (652 vs 1,907).
  • File-dependency analysis matters to your workflow — sverklo wins P4 at 0.84 vs jcodemunch's 0.33.
  • You want bi-temporal memory pinned to git SHAs (jcodemunch has no memory layer).
  • Your repo is TypeScript- or TSX-heavy, where the regex parser and the tree-sitter fallback both matter.
  • You want diff-aware PR review (sverklo review) alongside retrieval.
  • Node ecosystem fits your install constraint better than uvx / Python.

Choose jcodemunch-mcp

  • You care most about P5 recall and can tolerate a high false-positive rate: jcodemunch reaches P5 recall 1.00 with F1 0.34.
  • Your stack is Python-native and uvx is the install path you already use.
  • You want the most active single-purpose MCP tool in this category — jcodemunch shipped three releases in 36 hours against bench findings, which is unusually responsive.
  • Tree-sitter call-graph is the substrate you trust over hybrid BM25 + embeddings.
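To put a rough number on the P5 trade-off mentioned in the first bullet, F1 can be inverted to recover the precision implied by a given recall. This is a back-of-envelope sketch, assuming F1 and recall are computed over the same pooled set of predictions. If the bench averages per-task scores instead, the result is only indicative. The function name `impliedPrecision` is hypothetical.

```javascript
// F1 = 2PR / (P + R)  =>  P = F1 * R / (2R - F1)
function impliedPrecision(f1, recall) {
  const denom = 2 * recall - f1;
  if (denom <= 0) throw new RangeError("F1 must be smaller than 2 * recall");
  return (f1 * recall) / denom;
}

// jcodemunch P5 figures quoted above: recall 1.00, F1 0.34.
const p = impliedPrecision(0.34, 1.0);
console.log(p.toFixed(2)); // "0.20"
```

Under that pooling assumption, a precision of about 0.20 would mean roughly one in five flagged symbols is actually dead code, so every true hit is found but most flags need a manual check.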

Where neither wins

P2 reference-finding remains the weak slice across the field: sverklo 0.27, smart-grep 0.20, jcodemunch 0.01, GitNexus 0.00 on the 180-task run. The bench page's honesty section calls this out at the per-task level. Different retrieval substrates, different jobs.

Try Sverklo

MIT-licensed. Local-first. 37 MCP tools. Single install command: npm install -g sverklo
