swarmvault

by swarmclawai · swarmclawai/swarmvault

Turn raw research — PDFs, transcripts, code, audio — into a local markdown wiki + knowledge graph your AI can query forever.

SwarmVault compiles mixed inputs (30+ formats) into an Obsidian-compatible wiki with a typed knowledge graph and hybrid search. It exposes an MCP server for Claude Code, Codex, OpenCode, and other clients. Every edge is tagged extracted, inferred, or ambiguous for provenance. It is local-first, with an offline heuristic provider that needs no API keys.

Install

Pick your client

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "swarmvault": {
      "command": "npx",
      "args": [
        "-y",
        "swarmvault"
      ],
      "_inferred": true
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.
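
Hand-editing the file risks dropping servers that are already configured. The following is a minimal Python sketch of a safer merge, assuming the config is plain JSON with a top-level mcpServers map as shown above. The helper names add_mcp_server and update_config_file are hypothetical and are not part of swarmvault or Claude Desktop.

```python
import json
from pathlib import Path


def add_mcp_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of `config` with one server entry added or replaced."""
    merged = dict(config)
    # Copy the existing map so other servers and the caller's dict stay untouched.
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the JSON config at `path` (if present), add swarmvault, write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    updated = add_mcp_server(existing, "swarmvault", "npx", ["-y", "swarmvault"])
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

Calling update_config_file with the path for your platform leaves unrelated keys and servers in place; a restart is still needed afterwards.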

Global: ~/.cursor/mcp.json · Project: .cursor/mcp.json
{
  "mcpServers": {
    "swarmvault": {
      "command": "npx",
      "args": [
        "-y",
        "swarmvault"
      ],
      "_inferred": true
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config wins over global.
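
To make "project wins over global" concrete, here is a small Python sketch of one plausible reading: entries are resolved per server name, and a project entry replaces the global entry of the same name. This is an assumption for illustration; the text above does not say whether Cursor overrides per server or replaces the whole map, and effective_servers is a hypothetical helper.

```python
def effective_servers(global_cfg: dict, project_cfg: dict) -> dict:
    """Merge two mcpServers maps; project entries override global ones by name."""
    merged = dict(global_cfg.get("mcpServers", {}))
    merged.update(project_cfg.get("mcpServers", {}))
    return merged


# Illustrative inputs: "notes" is a made-up second server.
global_cfg = {
    "mcpServers": {
        "swarmvault": {"command": "npx", "args": ["-y", "swarmvault"]},
        "notes": {"command": "node", "args": ["notes.js"]},
    }
}
project_cfg = {
    "mcpServers": {
        "swarmvault": {"command": "npx", "args": ["-y", "swarmvault@latest"]},
    }
}

servers = effective_servers(global_cfg, project_cfg)
# The project-level swarmvault entry is used; the global-only "notes" entry survives.
```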

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "swarmvault": {
      "command": "npx",
      "args": [
        "-y",
        "swarmvault"
      ],
      "_inferred": true
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "swarmvault": {
      "command": "npx",
      "args": [
        "-y",
        "swarmvault"
      ],
      "_inferred": true
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "swarmvault",
      "command": "npx",
      "args": [
        "-y",
        "swarmvault"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.
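
The difference between the two shapes is mechanical, so a map-style config can be converted rather than retyped. A minimal Python sketch, assuming only the two shapes shown on this page; map_to_array is a hypothetical helper name.

```python
def map_to_array(servers: dict) -> list[dict]:
    """Convert {"name": {...}} into [{"name": "name", ...}], keeping insertion order."""
    return [{"name": name, **entry} for name, entry in servers.items()]


claude_style = {"swarmvault": {"command": "npx", "args": ["-y", "swarmvault"]}}
continue_style = {"mcpServers": map_to_array(claude_style)}
```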

~/.config/zed/settings.json
{
  "context_servers": {
    "swarmvault": {
      "command": {
        "path": "npx",
        "args": [
          "-y",
          "swarmvault"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.

claude mcp add swarmvault -- npx -y swarmvault

One-liner for Claude Code. Verify with claude mcp list. Remove with claude mcp remove swarmvault.

Use Cases

Real-world ways to use swarmvault

Compound research across months into a queryable wiki

👤 Researchers, analysts doing deep work over time ⏱ ~60 min intermediate

When to use: You've been researching a topic for weeks and realize your notes are scattered.

Prerequisites
  • swarmvault CLI — npm install -g @swarmvaultai/cli
  • A vault — swarmvault init --obsidian --profile personal-research
Flow
  1. Dump raw sources
    Drop everything into raw/: PDFs, saved articles, meeting transcripts, code snippets.
    → Immutable raw/ folder populated
  2. Compile the wiki
    Run swarmvault compile to generate wiki/ with typed pages and link frontmatter.
    → Wiki pages created
  3. Query through MCP
    Via the MCP: what's the current consensus in my notes about 'post-quantum TLS timeline'? Include contradictions.
    → Answer with graph-walked citations and flagged contradictions

Outcome: A growing, queryable research base that gets smarter with every addition.

Pitfalls
  • Noisy inputs produce a noisy graph: curate raw/ and delete junk instead of ingesting everything
  • Over-relying on the graph for 'truth': edges tagged 'inferred' or 'ambiguous' are hypotheses, not facts, so respect the tags
Combine with: documentation-server
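
One way to respect the provenance tags in downstream code is to keep extracted edges apart from the rest before treating anything as established. The sketch below assumes a hypothetical Edge record; the real field names and export format of the swarmvault graph are not described here and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    # Hypothetical shape, for illustration only.
    source: str
    target: str
    relation: str
    provenance: str  # "extracted", "inferred", or "ambiguous"


def split_by_provenance(edges: list[Edge]) -> tuple[list[Edge], list[Edge]]:
    """Separate edges stated in a source from edges that still need checking."""
    facts = [e for e in edges if e.provenance == "extracted"]
    hypotheses = [e for e in edges if e.provenance != "extracted"]
    return facts, hypotheses


# Made-up sample edges.
edges = [
    Edge("paper-a", "hybrid-key-exchange", "describes", "extracted"),
    Edge("paper-a", "paper-b", "contradicts", "inferred"),
    Edge("memo-1", "paper-b", "cites", "ambiguous"),
]
facts, hypotheses = split_by_provenance(edges)
```

Anything in hypotheses is a prompt to go back to the cited source, not a conclusion.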

Turn meeting transcripts into structured team knowledge

👤 Teams doing a lot of recorded meetings ⏱ ~45 min intermediate

When to use: You have Fathom/Otter transcripts piling up and want to query across them.

Flow
  1. Drop transcripts in raw/
    Copy all my Otter transcripts from last quarter into raw/meetings/.
    → Files in place
  2. Compile
    swarmvault compile. Confirm entities (people, projects) were extracted.
    → Entity count report
  3. Query
    What did we decide about the pricing overhaul across all Q1 meetings? Cite each source.
    → Consolidated answer with source transcripts cited

Outcome: Institutional memory that survives people leaving.
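
The first step of this flow can also be scripted instead of prompted. A minimal Python sketch, assuming transcripts are exported as files in one folder; the suffix list is a guess at common transcript formats, not a list of what swarmvault accepts, and stage_transcripts is a hypothetical helper.

```python
import shutil
from pathlib import Path


def stage_transcripts(
    source_dir: Path,
    vault_dir: Path,
    suffixes: tuple[str, ...] = (".txt", ".md", ".vtt", ".srt"),  # assumed formats
) -> list[Path]:
    """Copy transcript files into <vault>/raw/meetings/, skipping ones already there."""
    dest = vault_dir / "raw" / "meetings"
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(source_dir.iterdir()):
        if not src.is_file() or src.suffix.lower() not in suffixes:
            continue
        target = dest / src.name
        if target.exists():
            continue  # never overwrite, so files already in raw/ stay unchanged
        shutil.copy2(src, target)
        copied.append(target)
    return copied
```

Because existing files are skipped, the function can be re-run after each export and only new transcripts are added before the next compile.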

Build a book-club wiki with cross-book reasoning

👤 Serious readers ⏱ ~90 min intermediate

When to use: You want the AI to reason across books, not just one at a time.

Flow
  1. Ingest
    Drop PDFs + notes for 5 books into raw/; compile.
    → Wiki with one page per book + concept pages
  2. Cross-reason
    Which concepts show up in 3+ books? Identify contradictions between authors.
    → Concept map with contradictions flagged

Outcome: A reading practice that compounds.
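
The "3+ books" question reduces to counting in how many books each concept appears. A short Python sketch of that counting, assuming you already have a mapping from book to its set of concepts; how to obtain that mapping from the vault is not covered here, and the sample titles and concepts are made up.

```python
from collections import Counter


def shared_concepts(book_concepts: dict[str, set[str]], min_books: int = 3) -> dict[str, list[str]]:
    """Map each concept found in at least `min_books` books to the sorted list of those books."""
    counts = Counter(c for concepts in book_concepts.values() for c in concepts)
    return {
        concept: sorted(book for book, cs in book_concepts.items() if concept in cs)
        for concept, n in sorted(counts.items())
        if n >= min_books
    }


# Made-up sample data.
library = {
    "Book A": {"feedback loops", "incentives"},
    "Book B": {"feedback loops", "path dependence"},
    "Book C": {"feedback loops", "incentives"},
    "Book D": {"incentives", "path dependence"},
}
common = shared_concepts(library)
```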

Combinations

Pair with other MCPs for 10x leverage

swarmvault + documentation-server

swarmvault for structured wiki; documentation-server for quick-ingest search

Ingest new papers into both, then compare retrieval quality and structure.

swarmvault + logseq

Export vault pages into a Logseq graph

Export wiki/ Markdown into my Logseq pages/ dir with frontmatter preserved.

Tools

What this MCP exposes

Tool | Inputs | When to call | Cost
vault_search | query: str, top_k?: int | Main retrieval | free (local)
vault_get_page | slug: str | Full page read | free
vault_graph_neighbors | slug, depth? | Explore concept graph | free
vault_contradictions | topic?: str | Quality check | free
vault_compile | (none) | After adding new raw/ | CPU + optional embedding calls
vault_graph_report | scope?: str | Overview of a topic | free
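
On the wire, an MCP client invokes each of these tools with a JSON-RPC 2.0 tools/call request. The Python sketch below builds such requests using the tool and parameter names from the table above. The exact argument schemas are an assumption until checked against the server's tools/list response, and the slug value is made up.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids, unique per request


def tool_call(name: str, **arguments) -> str:
    """Serialize an MCP `tools/call` JSON-RPC 2.0 request for one tool."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


print(tool_call("vault_search", query="post-quantum TLS timeline", top_k=5))
print(tool_call("vault_graph_neighbors", slug="post-quantum-tls", depth=2))
```

In practice the client handles this framing for you; the sketch is mainly useful for reading logs or debugging a connection by hand.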

Cost & Limits

What this costs to run

API quota: Zero with the heuristic provider; otherwise your LLM provider's limits
Tokens per call: Compile can stream large source content through embeddings; search queries typically use 500-2000 tokens
Monetary: Free; optional LLM providers charge their normal fees
Tip: Use the heuristic provider for a first-pass compile; upgrade to an LLM-backed compile only when quality matters.

Security

Permissions, secrets, blast radius

Credential storage: Optional LLM provider keys in env/config
Data egress: None by default (heuristic); opt-in provider calls only

Troubleshooting

Common errors and fixes

Compile fails on a specific PDF

Some PDFs are scanned/image-only. Pre-OCR with ocrmypdf before dropping in raw/.

Verify: pdftotext file.pdf - | head
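
The check and the fix can be combined so that only image-only PDFs are sent through OCR. A minimal Python sketch, assuming pdftotext and ocrmypdf are installed and on PATH; the 20-character threshold and the function names are arbitrary choices for illustration.

```python
import subprocess


def looks_scanned(extracted_text: str, min_chars: int = 20) -> bool:
    """Heuristic: almost no extractable text suggests an image-only PDF."""
    return len(extracted_text.strip()) < min_chars


def needs_ocr(pdf_path: str) -> bool:
    """Run `pdftotext <file> -` and check whether any real text comes out."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"], capture_output=True, text=True, check=True
    )
    return looks_scanned(result.stdout)


def ocr_if_needed(pdf_path: str, out_path: str) -> str:
    """Return a path to a PDF with a text layer, running ocrmypdf only when required."""
    if not needs_ocr(pdf_path):
        return pdf_path
    subprocess.run(["ocrmypdf", pdf_path, out_path], check=True)
    return out_path
```

Run ocr_if_needed on a file before copying it into raw/, and drop the returned path into the vault instead of the original.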

MCP connection fails after agent install

Re-run swarmvault install --agent <name> and restart the agent; config paths vary per platform.

Verify: Client MCP list shows swarmvault

Hybrid search returns nothing

The vault has not been compiled yet, or embeddings have not been built for offline mode. Run swarmvault compile.

Verify: swarmvault status

Alternatives

swarmvault vs others

Alternative | When to use it instead | Tradeoff
documentation-server | You want quick drag-and-drop RAG without wiki compilation | No structured graph or contradiction detection
Obsidian + Smart Connections | You live in Obsidian and want in-editor AI | Less structured compile pipeline
NotebookLM | You're OK with Google-hosted, easy UI | Data leaves your machine; no graph export

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues
