
.sverklo.yaml — project configuration

Drop a .sverklo.yaml in your repo root to tune how sverklo indexes and ranks your code. The file is optional; sverklo works out of the box without one. Use it when you need to exclude directories, boost certain paths, swap embedding providers, or limit indexing to part of a monorepo.

Updated 2026-05-18 · Schema v1 (sverklo ≥ 0.22.0)

Quick start: exclude directories in a monorepo

The most common need. Drop this in your repo root as .sverklo.yaml:

ignore:
  - "vendor/**"
  - "third_party/**"
  - "**/__generated__/**"
  - "fixtures/**"
  - "packages/legacy-app/**"

Then re-run sverklo audit to rebuild the index. Anything matching these globs is skipped at indexing time: it is never parsed, never embedded, and never appears in search results.

Standard ignores (node_modules/, dist/, .git/, .sverklo/) are always applied. Your ignore entries are additive on top.
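To make the "additive" behavior concrete, here is a minimal Python sketch of how the default and user ignore lists could combine. It is an illustration, not sverklo's implementation: sverklo runs on Node.js and matches with picomatch, while this sketch uses Python's `fnmatch` as a rough stand-in (its `*` also crosses `/`, so edge cases differ). The `DEFAULT_IGNORES` list and the `is_ignored` helper are hypothetical names, and the defaults are written as `dir/**` globs by assumption.

```python
from fnmatch import fnmatchcase

# Assumption: the standard ignores named above, expressed as globs.
DEFAULT_IGNORES = ["node_modules/**", "dist/**", ".git/**", ".sverklo/**"]


def is_ignored(path: str, user_ignores: list) -> bool:
    """Return True if a repo-relative POSIX path matches any ignore glob.

    fnmatch only approximates picomatch: here `**` behaves like `*`.
    """
    patterns = DEFAULT_IGNORES + list(user_ignores)  # user entries add, never replace
    return any(fnmatchcase(path, pattern) for pattern in patterns)


user_ignores = [
    "vendor/**",
    "third_party/**",
    "**/__generated__/**",
    "fixtures/**",
    "packages/legacy-app/**",
]

for path in [
    "vendor/lib/a.go",
    "src/__generated__/types.ts",
    "node_modules/x/index.js",
    "src/api/user.ts",
]:
    print(path, is_ignored(path, user_ignores))
```

Only `src/api/user.ts` survives in this sketch; `node_modules/x/index.js` is dropped by the defaults even though the user list never mentions it.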

Full schema

# .sverklo.yaml (all keys optional)
ignore:
  - "vendor/**"          # glob (picomatch syntax)
  - "fixtures/**"

weights:
  - glob: "src/core/**"
    weight: 1.5          # 0.0–10.0, 1.0 = neutral
  - glob: "scripts/**"
    weight: 0.3          # downweight one-off scripts

search:
  defaultTokenBudget: 2000
  maxResults: 30
  budgets:
    search: 1500
    overview: 800

indexing:
  extensions:
    .pyi: "python"       # treat .pyi as python for parsing
    .erb: "html"

embeddings:
  provider: onnx         # onnx (default) | ollama
  model: bge-small-en    # provider-specific
  dimensions: 384        # auto-detected if omitted
  ollama:
    baseUrl: "http://localhost:11434"
    model: "nomic-embed-text"
  onnx:
    modelPath: "/abs/path/to/model.onnx"

Field reference

`ignore` — array of globs

Glob patterns (picomatch syntax) for paths to skip during indexing. Additive on top of sverklo's defaults. Examples:

"vendor/**" — skip everything under vendor/
"**/*.test.ts" — skip test files anywhere in the tree
"packages/legacy-*/**" — skip multiple sibling packages

`weights` — array of `{ glob, weight }`

Multipliers applied to PageRank importance for files matching each glob. Range is 0.0–10.0 (clamped). 1.0 is neutral. Use this to:

Boost domain code: {glob: "src/domain/**", weight: 2.0}
Downweight generated code: {glob: "**/generated/**", weight: 0.2}
De-emphasize scripts that look central but aren't: {glob: "scripts/**", weight: 0.3}

Resolution when multiple globs match: last match wins. Sverklo walks weights top-to-bottom and the final matching entry sets the file's weight (the others are discarded — weights are not multiplied or stacked). Put broader rules first and more specific overrides after:

weights:
  - glob: "tests/**"
    weight: 0.8        # all tests, downweight
  - glob: "tests/fixtures/**"
    weight: 0.5        # fixtures specifically, even more

A file at tests/fixtures/sample.json matches both entries; the effective weight is 0.5 (the last one). If you reverse the order, tests/fixtures/sample.json ends up at 0.8 because tests/** would be the last match. Order matters.
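The last-match-wins rule can be sketched in a few lines of Python. This is an illustration of the rule as described here, not sverklo's code: `effective_weight` is a hypothetical helper, `fnmatch` stands in for picomatch, and treating unmatched files as weight 1.0 is an assumption based on 1.0 being neutral.

```python
from fnmatch import fnmatchcase


def effective_weight(path: str, weights: list) -> float:
    """Last matching entry wins; unmatched files keep the neutral 1.0 (assumed)."""
    result = 1.0
    for entry in weights:  # walk top-to-bottom, in file order
        if fnmatchcase(path, entry["glob"]):
            result = entry["weight"]  # overwrite, never multiply or stack
    return result


broad_first = [
    {"glob": "tests/**", "weight": 0.8},
    {"glob": "tests/fixtures/**", "weight": 0.5},
]
specific_first = list(reversed(broad_first))

print(effective_weight("tests/fixtures/sample.json", broad_first))     # 0.5
print(effective_weight("tests/fixtures/sample.json", specific_first))  # 0.8
print(effective_weight("src/main.ts", broad_first))                    # 1.0
```

Because each match overwrites the previous one, the only way to express "fixtures lower than other tests" is to list the fixtures rule after the broader one.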

`search` — ranking budgets

Key	Type	Default	Effect
defaultTokenBudget	integer	2000	Cap on tokens returned by search before truncation
maxResults	integer	30	Hard cap on result count regardless of token budget
budgets	map<tool, int>	—	Per-tool overrides (e.g., `search: 1500`)
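One plausible reading of the table is that a tool uses its own entry in budgets when one exists and falls back to defaultTokenBudget otherwise. The Python sketch below encodes that reading; the fallback order is an assumption, and `token_budget` and the tool name `"impact"` are made up for illustration.

```python
def token_budget(tool: str, search_cfg: dict) -> int:
    """Per-tool override if present, else defaultTokenBudget, else 2000."""
    default = search_cfg.get("defaultTokenBudget", 2000)
    return search_cfg.get("budgets", {}).get(tool, default)


cfg = {
    "defaultTokenBudget": 2000,
    "maxResults": 30,
    "budgets": {"search": 1500, "overview": 800},
}

print(token_budget("search", cfg))    # 1500
print(token_budget("overview", cfg))  # 800
print(token_budget("impact", cfg))    # 2000 ("impact" is a hypothetical tool name)
print(token_budget("search", {}))     # 2000 (no config at all)
```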

`indexing.extensions` — treat new extensions as a known language

Map a file extension to one of sverklo's supported parsers. Useful if your project uses non-standard extensions (e.g., .pyi as python).
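Conceptually this is a suffix lookup with your entries layered over the built-in table. The Python sketch below shows that idea under two assumptions: that config entries take precedence over built-ins, and that only the final suffix is considered. `BUILTIN` is a hypothetical stand-in for sverklo's real table, which is not listed here.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical built-in table, for illustration only.
BUILTIN = {".py": "python", ".ts": "typescript", ".html": "html"}


def parser_for(path: str, extensions: dict) -> Optional[str]:
    """Pick a parser by final file suffix; config entries override built-ins."""
    suffix = PurePosixPath(path).suffix
    return {**BUILTIN, **extensions}.get(suffix)


extensions = {".pyi": "python", ".erb": "html"}

print(parser_for("stubs/requests.pyi", extensions))    # python
print(parser_for("views/index.html.erb", extensions))  # html
print(parser_for("stubs/requests.pyi", {}))            # None
```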

`embeddings` — pick a provider

Default is the bundled ONNX provider (no setup needed). Switch to Ollama for larger models or custom dimensions:

embeddings:
  provider: ollama
  ollama:
    baseUrl: "http://localhost:11434"
    model: "nomic-embed-text"

ONNX vs Ollama: performance tradeoff

ONNX is materially faster than Ollama for indexing on most projects. The reason is structural, not a sverklo integration bug:

Aspect	ONNX (bundled)	Ollama
Where it runs	In-process (same Node.js)	Separate process over HTTP
Per-call overhead	~0 (direct function call)	HTTP roundtrip per batch
Model loading	Once at startup	Loaded by Ollama; unloads after ~5 min idle
Batching	Yes, internal	Yes, 128 inputs per HTTP request (Ollama 0.4+ `/api/embed`)
Indexing speed (200-file project)	seconds	tens of seconds to minutes (model-dependent)

Sverklo v0.23.1+ sends keep_alive: "10m" with every Ollama embedding call, so the model stays resident between batches instead of unloading after Ollama's default idle timeout. With that fix in place, the remaining gap is structural: an in-process call skips the HTTP round trip and JSON serialization that every Ollama batch pays.
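To see what the batching and keep_alive behavior looks like on the wire, here is a minimal Python sketch that builds request bodies for Ollama's `/api/embed` endpoint in groups of 128 and posts them. It mirrors the behavior described above but is not sverklo's code; `build_embed_payloads` and `embed_all` are hypothetical helpers, and `embed_all` needs a running Ollama server with the model pulled.

```python
import json
import urllib.request


def build_embed_payloads(texts, model, batch_size=128, keep_alive="10m"):
    """Split texts into /api/embed request bodies of at most batch_size inputs."""
    return [
        {"model": model, "input": texts[i:i + batch_size], "keep_alive": keep_alive}
        for i in range(0, len(texts), batch_size)
    ]


def embed_all(texts, model="nomic-embed-text", base_url="http://localhost:11434"):
    """POST each batch to Ollama and collect the vectors in input order."""
    vectors = []
    for payload in build_embed_payloads(texts, model):
        request = urllib.request.Request(
            f"{base_url}/api/embed",
            data=json.dumps(payload).encode("utf-8"),  # data present => POST
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            vectors.extend(json.load(response)["embeddings"])
    return vectors


payloads = build_embed_payloads([f"chunk {n}" for n in range(300)], "nomic-embed-text")
print([len(p["input"]) for p in payloads])  # [128, 128, 44]
```

Three hundred chunks cost three HTTP round trips here, which is the per-batch overhead the table refers to; the in-process ONNX path pays none of it.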

When to pick which:

ONNX (default): pick this unless you have a specific reason to use Ollama. Faster, zero-setup, ships with sverklo.
Ollama: pick this if you want a model larger than the bundled one, or you want to share an embedding model across multiple local tools.

If Ollama is unexpectedly slow, first confirm you're on sverklo ≥ v0.23.1 (which sends keep_alive), then check whether your chosen model fits in available VRAM. If it doesn't, Ollama falls back to CPU without warning, which can be 10–100× slower per request.

Recipes

Monorepo with one indexed package, rest excluded

ignore:
  - "packages/!(api)/**"
  - "apps/**"
  - "tools/**"

Boost active code; downweight legacy

weights:
  - glob: "src/**"
    weight: 1.2
  - glob: "legacy/**"
    weight: 0.2

Tight token budgets for small repos on slow machines

search:
  defaultTokenBudget: 1000
  maxResults: 10

Validation behavior

Sverklo loads .sverklo.yaml at startup and validates the structure. Invalid entries are logged and discarded, out-of-range weights are clamped to 0.0–10.0, and valid entries are kept. The server never crashes on a bad config; instead you'll see warnings on stderr like:

[config] .sverklo.yaml: invalid weight entry {…}, skipping
[config] .sverklo.yaml: weight for "vendor/**" clamped from -1 to 0.0
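The keep-what-is-valid policy can be sketched for the weights key as follows. This Python version is illustrative only: `validate_weights` is a hypothetical helper, the type checks are assumptions about what counts as invalid, and the message text is modelled on the samples above rather than copied from sverklo.

```python
import sys


def validate_weights(raw):
    """Keep well-formed entries, clamp out-of-range weights, skip the rest."""
    valid = []
    for entry in raw if isinstance(raw, list) else []:
        well_formed = (
            isinstance(entry, dict)
            and isinstance(entry.get("glob"), str)
            and isinstance(entry.get("weight"), (int, float))
            and not isinstance(entry.get("weight"), bool)  # bool is an int in Python
        )
        if not well_formed:
            print(f"[config] .sverklo.yaml: invalid weight entry {entry!r}, skipping",
                  file=sys.stderr)
            continue
        weight = min(10.0, max(0.0, float(entry["weight"])))  # clamp to 0.0-10.0
        if weight != entry["weight"]:
            print(f'[config] .sverklo.yaml: weight for "{entry["glob"]}" '
                  f'clamped from {entry["weight"]} to {weight}', file=sys.stderr)
        valid.append({"glob": entry["glob"], "weight": weight})
    return valid


cleaned = validate_weights([
    {"glob": "src/core/**", "weight": 1.5},
    {"glob": "vendor/**", "weight": -1},   # out of range: clamped to 0.0
    {"glob": "scripts/**"},                # missing weight: discarded
])
print(cleaned)
```

One bad entry costs only that entry: the other rules in the list still apply, which matches the "never crashes" guarantee described above.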

To verify your config is being picked up, run sverklo doctor — it lists the resolved config file path.