Press kit
Logos, screenshots, hero numbers, and contact for journalists, podcasters, and bloggers covering Sverklo.
One-line description
Sverklo is a local-first Model Context Protocol (MCP) server that gives AI coding agents — Claude Code, Cursor, Windsurf, Zed, Codex CLI — semantic understanding of a codebase, so the agent stops inventing function names that don't exist. MIT-licensed, 37 retrieval tools, runs entirely on a developer's laptop with no cloud, no API keys, no telemetry by default.
Hero numbers
- 37 MCP retrieval tools
- 11+ supported languages
- 62× fewer tokens than naive grep (bench:primitives)
- F1 = 0.58 on the 60-task hand-verified retrieval benchmark

Important: the bench also reports the slice where sverklo loses to grep on F1 (P5 dead-code detection, F1 = 0.02). The numbers above are the wins; the losses are on the same page.
Two-paragraph description
AI coding agents like Claude Code, Cursor, and Codex CLI hallucinate function names that don't exist in the user's codebase because they generate from training-data patterns rather than authoritative retrieval over the user's symbol graph. An agent will write getUserByEmail() when the codebase has findByEmail(); tests pass because the test mocks the dependency; production breaks. Sverklo solves this by exposing a 37-tool MCP retrieval layer the agent calls before writing code: sverklo_lookup resolves a name to its definition, sverklo_refs proves whether a symbol exists with caller context, sverklo_verify lets the agent re-check a quoted span against a cited git SHA.
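The lookup-before-write flow can be sketched in miniature. The tool names above are from sverklo itself; the toy index and the resolve() helper below are illustrative assumptions, not sverklo's actual API:

```typescript
// A toy symbol index standing in for the real symbol graph.
// (Hypothetical data: file paths and line numbers are made up for illustration.)
const symbolIndex = new Map<string, { file: string; line: number }>([
  ["findByEmail", { file: "src/users/repo.ts", line: 42 }],
]);

// What an agent does before emitting a call: resolve the name first,
// instead of generating it from training-data patterns.
function resolve(name: string): { file: string; line: number } | null {
  return symbolIndex.get(name) ?? null;
}

resolve("getUserByEmail"); // null: the hallucinated name does not exist
resolve("findByEmail");    // { file: "src/users/repo.ts", line: 42 }
```

The structural point: a null result is a signal the agent can act on before the wrong name ever reaches the diff.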
The architecture combines BM25 lexical retrieval, ONNX-embedded vector retrieval, and PageRank computed on the file-import graph. The novel piece is channelized Reciprocal Rank Fusion: instead of running RRF once over fts ∪ vector, sverklo runs RRF per channel (FTS, vector, doc-section, path, symbol-name) and fuses with channel-specific weights. The path channel is weighted 1.5× because filename matches are precision-skewed. Sverklo's other surfaces include a bi-temporal memory layer pinned to git SHAs (with valid_until_sha and superseded_by lineage), a diff-aware risk-scored PR review tool, and a 60-task hand-verified retrieval benchmark with reproducible harness.
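The channelized fusion step can be sketched as follows. The channel names and the 1.5× path weight come from the description above; the k constant, function names, and data shapes are illustrative assumptions, not sverklo's implementation:

```typescript
type Ranking = string[]; // document IDs, best first

// Standard Reciprocal Rank Fusion score for one ranked list: 1 / (k + rank).
function rrfScores(ranking: Ranking, k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  ranking.forEach((id, i) => scores.set(id, 1 / (k + i + 1)));
  return scores;
}

// Run RRF per channel and fuse with channel-specific weights,
// instead of one RRF pass over the union of all retrievers.
function channelizedRRF(
  channels: Record<string, Ranking>,
  weights: Record<string, number>,
): string[] {
  const fused = new Map<string, number>();
  for (const [channel, ranking] of Object.entries(channels)) {
    const w = weights[channel] ?? 1.0;
    for (const [id, s] of rrfScores(ranking)) {
      fused.set(id, (fused.get(id) ?? 0) + w * s);
    }
  }
  return [...fused.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}

// Toy example: a filename hit in the precision-skewed path channel
// outranks mid-rank hits in the broader channels.
const result = channelizedRRF(
  {
    fts: ["a.ts", "b.ts"],
    vector: ["b.ts", "c.ts"],
    path: ["c.ts"],
  },
  { fts: 1.0, vector: 1.0, path: 1.5 },
);
// result: ["c.ts", "b.ts", "a.ts"]
```

The design choice the weighting encodes: channels differ in precision, so a hit's rank only means something relative to the channel that produced it.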
Founder + maintainer
Nikita Groshin — solo founder, full-time maintainer.
- Email: nikita@groshin.com
- GitHub: github.com/nike-17
- For interview requests, podcast bookings, or quote requests, email; replies within 24 hours.
Logos and brand assets
Brand colour: #E85A2A (orange). Background: #0E0D0B. Text: #EDE7D9.
Screenshots
Verifiable claims (every number is reproducible)
| Claim | How to verify |
|---|---|
| 37 MCP tools | List in README or call sverklo --tools |
| 11+ languages | SUPPORTED_LANGUAGES in src/types/index.ts |
| 62× fewer tokens than naive grep | Run npm run bench:primitives against the repo |
| F1 = 0.58 on 60-task bench | /bench/ — same page reports F1 = 0.02 on dead-code slice |
| MIT licensed | LICENSE |
| Local-first / no cloud | Network calls inspected: only ONNX model download on first run, then offline. TELEMETRY.md |
| ~90 MB embedding model | all-MiniLM-L6-v2 ONNX, downloaded from HuggingFace, cached at ~/.sverklo/models/ |
| Bi-temporal memory schema | migration in src/storage/memory-store.ts |
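As a rough sketch of the bi-temporal lineage mechanics: the field names valid_until_sha and superseded_by appear above; the record shape, the valid_from_sha field, and the supersede helper are illustrative assumptions, not the actual schema in src/storage/memory-store.ts:

```typescript
interface MemoryRecord {
  id: string;
  content: string;
  valid_from_sha: string;         // commit where this fact became true (assumed field)
  valid_until_sha: string | null; // null while still current
  superseded_by: string | null;   // id of the record that replaced it
}

// A record is never deleted: it is closed out at a git SHA and
// linked forward to its successor, preserving lineage.
function supersede(
  old: MemoryRecord,
  newContent: string,
  newId: string,
  sha: string,
): [MemoryRecord, MemoryRecord] {
  const next: MemoryRecord = {
    id: newId,
    content: newContent,
    valid_from_sha: sha,
    valid_until_sha: null,
    superseded_by: null,
  };
  const closed: MemoryRecord = { ...old, valid_until_sha: sha, superseded_by: newId };
  return [closed, next];
}

// Toy example with made-up SHAs and content.
const [closed, next] = supersede(
  { id: "m1", content: "auth uses sessions", valid_from_sha: "aaa111",
    valid_until_sha: null, superseded_by: null },
  "auth uses JWTs",
  "m2",
  "bbb222",
);
```

Queries "as of" any SHA then reduce to filtering on the valid_from/valid_until interval, which is the part borrowed from the bi-temporal database literature.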
Boilerplate quotes (use as-is in articles)
On the hallucination problem
"Claude Code writes getUserByEmail() when your code has findByEmail(). The root cause isn't Claude — it's that the agent generates from training-data patterns when it doesn't have authoritative retrieval against your actual symbol graph. Sverklo exposes that retrieval as 37 MCP tools the agent calls before writing code." — Nikita Groshin, maintainer
On benchmark honesty
"A tuned grep beats sverklo on F1 by 9 points. The point of publishing the result is the second-order finding: for AI agents inside bounded context windows, tokens-per-correct-answer is more load-bearing than F1, and on that axis sverklo wins by 62× over naive grep." — Nikita Groshin
On the local-first deployment model
"The whole reason this matters is privacy and operational simplicity. Sverklo runs on a laptop with embedded SQLite and a local ONNX model. No cloud, no API keys, no signup. For enterprises with compliance constraints that block cloud retrieval — finance, healthcare, defence — that's the load-bearing property, not the feature list." — Nikita Groshin
On open-source business model
"Sverklo is MIT-licensed and free. Anything in the open-source server today stays in the open-source server forever. A future Pro tier will add capabilities the OSS doesn't have — but the rule is 'Pro adds new things, never gates current things.' Open-core, not feature-locked open-source." — Nikita Groshin
Common questions journalists ask
Why "sverklo" — is that a real word?
It's the Russian word for "drill" (сверло), borrowed because that is structurally what the tool does: it drills into the imports, the symbols, the call graph, until the agent has authoritative answers. Pronounced "SVERK-low".
Who's the company behind it?
Solo project, no VC, no investors. Maintained by Nikita Groshin. The plan is to build traction first and decide on company structure later; there's no monetisation pressure today.
Why MIT and not source-available?
Source-available licenses (Polyform, BSL, Elastic License) work for hosted services where the maintainer captures the cloud revenue. Sverklo is local-first; there's no cloud to monetize. MIT is the honest license for the deployment model.
How does it compare to Cursor's @codebase / Sourcegraph Cody / Greptile?
Detailed comparisons at sverklo.com/vs/ — including the slices where each tool wins. Headline: Cursor's @codebase is cloud + editor-bound; Cody is enterprise-priced source-available; Greptile is hosted PR-review. Sverklo is local-first MIT, MCP-native (works with every agent), and bundles retrieval + impact + review + memory in one install.
What's the business model?
Today: free, MIT, no monetisation. The intended path is open-core: Pro tier adds capabilities the OSS doesn't have (smart auto-capture, larger embedding models, hosted team memory). The OSS server stays MIT and stays full-featured for the use cases it covers today. No bait-and-switch.
Citation
```bibtex
@misc{sverklo_bench_primitives_2026,
  title  = {Sverklo bench:primitives — a 60-task retrieval evaluation for AI coding agents},
  author = {Groshin, Nikita},
  year   = {2026},
  doi    = {10.5281/zenodo.19802051},
  url    = {https://sverklo.com/bench/}
}
```
If you're writing about sverklo for an academic publication, the citable artifact is the Zenodo deposit (DOI 10.5281/zenodo.19802051) and the bench-primitives evaluation.
Direct links
- Homepage: sverklo.com
- GitHub: github.com/sverklo/sverklo
- npm: npmjs.com/package/sverklo
- Bench: sverklo.com/bench/
- Comparisons: sverklo.com/vs/
- Playground: sverklo.com/playground/
- Reports: sverklo.com/report/
- Research paper: sverklo.com/research/
- Blog: sverklo.com/blog/
- DOI: 10.5281/zenodo.19802051
- llms.txt for AI engines: sverklo.com/llms.txt
For podcast bookings
Available topics:
- The hallucination problem in AI coding agents and the structural fix
- Designing reproducible benchmarks for AI tooling — methodology, honesty sections, "tokens per correct answer" as an evaluation axis
- Channelized Reciprocal Rank Fusion — what we learned from running 5 retrievers in parallel
- Bi-temporal memory pinned to git SHAs — borrowing from 1990s database literature for AI agent context
- The economics of local-first vs hosted developer tooling in the AI agent era
- What it actually takes to build a 37-tool MCP server alone
Booking: email nikita@groshin.com with show name, audience size estimate, and proposed timeframe. Response within 24 hours.