/* comparison */

Sverklo vs Smart-Grep at Scale

Smart-grep is sverklo's closest competitor on raw token cost: sverklo averages 652 input tokens per task to smart-grep's 714, a ~9% edge for sverklo. On every other axis the gap is wider, and on F1 specifically it grew from 7 points at 90-task scale to 24 points at 180-task scale (0.58 vs 0.34). The honest story: when is a tuned grep good enough, and when isn't it? Numbers are from the public 180-task bench across express, lodash, sverklo, requests, flask, and fastapi.

Bench numbers — 180-task / 6-codebase run (2026-05-13)

Metric | Sverklo | Smart-grep | Gap
Overall F1 | 0.58 | 0.34 | +24 pts (was +7 at 90-task)
P1 definition lookup | 0.63 | 0.20 | +43 pts
P2 reference finding | 0.27 | 0.20 | +7 pts
P4 file dependencies | 0.84 | 0.40 | +44 pts
P5 dead-code detection | 0.83 | 0.83 | tie (v0.20.19 fix landed)
Input tokens / task | 652 | 714 | −62 tokens (~9% sverklo win)
Tool calls / task | 1.0 | 3.2 | 3.2× fewer
Wall time / task (warm) | 64 ms | 1,130 ms | 18× faster
Cold-start cost | ~26 s (index build, M-series) | 0 | smart-grep wins on cold-start
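For readers unfamiliar with the headline metric: F1 here is the harmonic mean of precision and recall over retrieved vs expected results. A minimal sketch (hypothetical helper, not sverklo-bench's actual scorer) shows how a single task's score is computed:

```typescript
// Hypothetical scorer sketch: F1 as the harmonic mean of precision and
// recall over retrieved vs expected result sets. Not sverklo-bench's
// actual implementation; it only illustrates how the table's F1 reads.
function f1(retrieved: Set<string>, expected: Set<string>): number {
  const hits = Array.from(retrieved).filter((x) => expected.has(x)).length;
  if (hits === 0) return 0;
  const precision = hits / retrieved.size; // how much of what we returned was right
  const recall = hits / expected.size; // how much of the truth we found
  return (2 * precision * recall) / (precision + recall);
}
```

So a tool that returns 3 files of which 2 are correct, against 4 expected files, scores precision 2/3, recall 1/2, F1 ≈ 0.57; the per-task scores are then averaged into the table above.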

Feature comparison

Feature | Sverklo | Smart-grep
Implementation | MCP server + tree-sitter + ONNX embeddings + PageRank + SQLite | Shell baseline: rg with language filters, definition-shaped patterns, ±10-line context reads
Setup | npm i -g sverklo && sverklo init (one-time index build) | Install ripgrep; nothing else
Cold-start cost | ~180 s on a ~4000-file repo (index build) | None
Warm wall time | ~64 ms per task | ~1,130 ms per task (multi-grep cascade)
Cross-file graph awareness | Yes: symbol graph, PageRank-ranked retrieval, file deps | No: each grep call is stateless
Memory across sessions | Bi-temporal SHA-pinned memory | None
Languages | 10 first-class (TS/JS, Python, Go, Rust, C#, Vue, Markdown, Jupyter, etc.) | Any language grep can match; no symbol-level semantics
License | MIT | MIT/Unlicense dual license (ripgrep); the smart-grep wrapper used here, shipped in benchmark/src/baselines/smart-grep.ts, is MIT

When to use each

Choose Sverklo when

  • Your repo is large enough that "find every reference" or "find every dependency" actually matters — P2 and P4 are where the F1 gap is widest
  • You want one tool to answer def-lookup, ref-finding, and import-graph questions instead of three separate workflows
  • You need memory that survives /compact — past decisions, file-level notes, pinned context
  • You're paying for tokens and care about the 652-vs-714 average (small but compounds across long sessions)
  • You want benchmark-validated F1 numbers, not vibes

Choose smart-grep when

  • The repo is small enough that ripgrep + a few language filters already finds everything in one pass
  • You want essentially no install step: smart-grep is a thin wrapper around the ripgrep binary you may already have
  • You don't want a cold-start budget at all — every query is fresh-shell, no daemon, no index
  • Flat namespaces dominate (config files, scripts, single-package repos) where def-shaped regex patterns hit cleanly
  • You're fine treating "5 grep results with 10 lines of context" as the final answer instead of a retrieval input

How smart-grep works internally

The "smart-grep" baseline in sverklo-bench is not naïve ripgrep. The implementation in benchmark/src/baselines/smart-grep.ts applies language-aware filters (file globs by extension), constructs def-shaped patterns per task type (e.g., function\s+name, class\s+name, def\s+name), reads ±10 lines of context around each hit, and de-duplicates. It is, in practice, the best version of "grep + a thoughtful prompt" — which is why we use it as the strong text-search baseline instead of a strawman.
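The pattern-construction step described above can be sketched roughly like this. This is a hypothetical simplification for illustration, not the actual code in benchmark/src/baselines/smart-grep.ts; the shape names and globs are assumptions:

```typescript
// Hypothetical sketch of smart-grep's def-shaped pattern construction:
// per-language file globs plus definition-shaped regexes, fed to rg with
// ±10 lines of context. Not the real baseline implementation.
const DEF_SHAPES: Record<string, { glob: string; patterns: string[] }> = {
  typescript: {
    glob: "*.{ts,tsx,js,jsx}",
    patterns: ["(function|class|interface)\\s+NAME\\b", "const\\s+NAME\\s*="],
  },
  python: {
    glob: "*.py",
    patterns: ["(def|class)\\s+NAME\\b"],
  },
};

// Build the rg argv for a definition lookup of `symbol` in `lang`.
function rgArgs(symbol: string, lang: keyof typeof DEF_SHAPES): string[] {
  const { glob, patterns } = DEF_SHAPES[lang];
  const regex = patterns.map((p) => p.replace("NAME", symbol)).join("|");
  // -C 10 reads ±10 lines of context around each hit, as described above.
  return ["rg", "--glob", glob, "-C", "10", "-e", regex];
}
```

For example, `rgArgs("get_user", "python")` produces a single rg invocation matching `(def|class)\s+get_user\b` over `*.py` files; de-duplication of overlapping context windows happens in a later pass.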

Sverklo uses a hybrid pipeline: SQLite + sqlite-vec for storage, tree-sitter + regex fallback for parsing, BM25 + ONNX (all-MiniLM-L6-v2) embeddings for ranking, and PageRank over the import/symbol graph for centrality. The graph is what lets P4 ("which files depend on X?") return 0.84 instead of the 0.40 a grep wrapper can manage. Everything runs in-process from a single npm install; no daemons, no separate LSP processes.
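The centrality step can be illustrated with a tiny power-iteration PageRank over an import graph. This is a hand-rolled sketch under assumed data shapes (file-to-imports adjacency), not sverklo's actual graph code:

```typescript
// Minimal power-iteration PageRank over a file-import graph.
// Illustrative only: sverklo's real graph construction and ranking differ.
function pagerank(
  edges: Record<string, string[]>, // file -> files it imports
  damping = 0.85,
  iters = 50,
): Record<string, number> {
  const nodes = Object.keys(edges);
  const n = nodes.length;
  let rank: Record<string, number> = {};
  for (const f of nodes) rank[f] = 1 / n;
  for (let i = 0; i < iters; i++) {
    const next: Record<string, number> = {};
    for (const f of nodes) next[f] = (1 - damping) / n; // teleport term
    for (const src of nodes) {
      const outs = edges[src];
      if (outs.length === 0) {
        // Dangling file (imports nothing): spread its rank uniformly.
        for (const f of nodes) next[f] += (damping * rank[src]) / n;
      } else {
        for (const dst of outs) next[dst] += (damping * rank[src]) / outs.length;
      }
    }
    rank = next;
  }
  return rank;
}

// A leaf file imported by everything ends up most central:
const graph: Record<string, string[]> = {
  "utils.ts": [],
  "api.ts": ["utils.ts"],
  "cli.ts": ["utils.ts", "api.ts"],
};
```

In this toy graph, "utils.ts" outranks its importers, which is exactly the property that lets a graph-aware tool answer "which files depend on X?" with high recall where a stateless grep cannot.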

Where sverklo lost in earlier runs: P5 (cross-file dead-code) used to show smart-grep at 0.83 vs sverklo at 0.67 because the audit's DECORATOR_ENTRY_POINT regex was TS/NestJS-only and flagged every FastAPI/Flask route as dead. Fixed in v0.20.19; the 2026-05-13 rerun shows both baselines tied at 0.83. We left the historical context here so the bench-loop arc is visible — the public losses page lists every task where sverklo still loses to at least one baseline.
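The class of bug is easy to reproduce: an entry-point check that only recognizes TS/NestJS decorators misses Python route decorators, so every FastAPI/Flask handler looks unreferenced. A hedged sketch of the broadened check (regex list and names are hypothetical, not the actual v0.20.19 diff):

```typescript
// Hypothetical sketch: treat decorated route handlers as entry points so
// cross-file dead-code detection does not flag them. These patterns are
// illustrative, not sverklo's actual DECORATOR_ENTRY_POINT list.
const ENTRY_POINT_DECORATORS: RegExp[] = [
  /@(Controller|Get|Post|Put|Delete|Patch)\b/, // NestJS-style (TS)
  /@(app|router|bp)\.(get|post|put|delete|patch|route)\(/, // FastAPI/Flask
];

function isEntryPoint(decoratorLine: string): boolean {
  return ENTRY_POINT_DECORATORS.some((re) => re.test(decoratorLine));
}
```

With only the first pattern, `@app.get("/users")` handlers register as dead code; adding the Python-style pattern is what a fix of this shape looks like.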

Try Sverklo

MIT licensed. One command. Works with every MCP client.
