dspy-skills

by OmidZamani · OmidZamani/dspy-skills

Claude skills pack for DSPy: program language models, optimize prompts, and build RAG pipelines systematically.

dspy-skills teaches Claude the DSPy mental model: signatures, modules, predictors, teleprompters (called optimizers in current DSPy releases), and evaluation loops. Instead of hand-crafting prompts, you describe the task with signatures and let DSPy's optimizers tune the prompts and few-shot demonstrations against your metric. Claude, in turn, writes idiomatic DSPy code rather than falling back on raw prompt templates.

Why use it

Key features

Install

Pick your client

~/Library/Application Support/Claude/claude_desktop_config.json  · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "dspy-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/OmidZamani/dspy-skills",
        "~/.claude/skills/dspy-skills"
      ],
      "_inferred": true
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.

~/.cursor/mcp.json · .cursor/mcp.json
{
  "mcpServers": {
    "dspy-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/OmidZamani/dspy-skills",
        "~/.claude/skills/dspy-skills"
      ],
      "_inferred": true
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config wins over global.

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "dspy-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/OmidZamani/dspy-skills",
        "~/.claude/skills/dspy-skills"
      ],
      "_inferred": true
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "dspy-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/OmidZamani/dspy-skills",
        "~/.claude/skills/dspy-skills"
      ],
      "_inferred": true
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "dspy-skill",
      "command": "git",
      "args": [
        "clone",
        "https://github.com/OmidZamani/dspy-skills",
        "~/.claude/skills/dspy-skills"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.

~/.config/zed/settings.json
{
  "context_servers": {
    "dspy-skill": {
      "command": {
        "path": "git",
        "args": [
          "clone",
          "https://github.com/OmidZamani/dspy-skills",
          "~/.claude/skills/dspy-skills"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.

claude mcp add dspy-skill -- git clone https://github.com/OmidZamani/dspy-skills ~/.claude/skills/dspy-skills

One-liner. Verify with: claude mcp list. Remove with: claude mcp remove dspy-skill.

Use Cases

Real-world ways to use dspy-skills

How to build your first DSPy program and optimize it

👤 ML engineers and applied researchers ⏱ ~90 min advanced

When to use: You have a task where prompt quality matters and want a systematic way to improve it.

Prerequisites
  • Python 3.10+ with dspy-ai installed — pip install dspy-ai
  • Skill cloned — git clone https://github.com/OmidZamani/dspy-skills ~/.claude/skills/dspy-skills
Flow
  1. Define the signature
    Task: classify support tickets into {billing, technical, account}. Give me a DSPy signature and a simple Predict module.
    → Signature + module code
  2. Write an eval
    Add an evaluation set of at least 100 labeled examples and an accuracy metric.
    → Eval harness with metric callable
  3. Optimize
    Run BootstrapFewShot to compile the module against the eval set.
    → Compiled predictor + improved score

Outcome: A DSPy-optimized predictor that beats a hand-written prompt, with reproducible code.

Pitfalls
  • Eval too small, so the optimizer overfits: use at least 100–200 examples and hold out a true test set
  • Metric doesn't capture what you care about: invest in metric design before model choice
Combine with: filesystem
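
One way to act on the first pitfall is to carve off the test set once, with a fixed seed, before any optimizer sees the data. The helper below is a minimal sketch in plain Python. The name split_examples and the 20% default are assumptions made for illustration.

```python
import random


def split_examples(examples, test_fraction=0.2, seed=0):
    """Shuffle once with a fixed seed, then carve off a held-out test set."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


train, test = split_examples(range(200))
print(len(train), len(test))  # 160 40
```

Only the train portion should be handed to the optimizer. The test portion is scored once at the end.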

Build a RAG pipeline with DSPy

👤 Engineers building retrieval-augmented systems ⏱ ~120 min advanced

When to use: You want modular, optimizable RAG rather than a hand-wired chain.

Flow
  1. Define the modules
    Create a DSPy RAG pipeline: RetrieveThenRead with ColBERTv2 or a local retriever.
    → Modular pipeline with separate retrieval and generation
  2. Optimize end-to-end
    Write an eval on our QA set and run MIPRO to improve.
    → Compiled pipeline with score delta

Outcome: A RAG pipeline you can improve by changing the eval, not by rewriting prompts.

Pitfalls
  • Retriever quality caps end-to-end quality — Evaluate retrieval separately (recall@k) before optimizing generation
Combine with: local-rag
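
The recall@k check named in the pitfall can be computed without any LLM calls. This is a minimal sketch in plain Python. The function names and the document IDs are made up for illustration.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of relevant items that appear in the top-k retrieved list."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    hits = relevant.intersection(retrieved_ids[:k])
    return len(hits) / len(relevant)


def mean_recall_at_k(runs, k):
    """Average recall@k over (retrieved_ids, relevant_ids) pairs."""
    return sum(recall_at_k(got, want, k) for got, want in runs) / len(runs)


runs = [
    (["d3", "d7", "d1"], ["d1", "d9"]),  # one of two relevant docs in the top 3
    (["d2", "d4", "d5"], ["d2"]),        # the single relevant doc is found
]
print(mean_recall_at_k(runs, k=3))  # 0.75
```

If this number is low, optimizing the generation step is unlikely to help much, because the answer is often missing from the context.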

Combinations

Pair with other MCPs for 10x leverage

dspy-skill + local-rag

Plug a local retriever into DSPy's RAG modules

Swap the ColBERT retriever for a local-rag MCP as the retrieval source.

dspy-skill + filesystem

Organize DSPy programs, evals, and artifacts in a repo

Lay out a DSPy project with programs/, evals/, and artifacts/ directories.

Tools

What this MCP exposes

Tool                       Inputs             When to call               Cost
signature-design           task spec          Start of any DSPy program  0
module-authoring           signatures + flow  After signatures           0
teleprompter-optimization  module + eval      After eval is ready        LLM tokens during optimization
evaluation-harness         task data          Before optimizing          0

Cost & Limits

What this costs to run

API quota: LLM tokens dominate during optimization runs
Tokens per call: Can be high, since an optimization run may invoke the LLM hundreds of times
Monetary: Depends on provider
Tip: Use cheap models during teleprompter runs, upgrade only for final evaluation
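
To see why the tip matters, a back-of-the-envelope estimate helps. The sketch below is plain arithmetic. The call count, token counts, and per-million-token prices are hypothetical placeholders, not real provider rates, so substitute your own.

```python
def estimate_cost_usd(n_calls, tokens_in, tokens_out, price_in_per_m, price_out_per_m):
    """Rough cost of an optimization run, given per-call token counts and prices."""
    input_cost = n_calls * tokens_in * price_in_per_m / 1_000_000
    output_cost = n_calls * tokens_out * price_out_per_m / 1_000_000
    return input_cost + output_cost


# Hypothetical run: 500 calls, 1,500 prompt tokens and 200 completion tokens each.
cheap = estimate_cost_usd(500, 1500, 200, price_in_per_m=0.20, price_out_per_m=1.00)
strong = estimate_cost_usd(500, 1500, 200, price_in_per_m=5.00, price_out_per_m=15.00)
print(f"cheap: ${cheap:.2f}, strong: ${strong:.2f}")  # cheap: $0.25, strong: $5.25
```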

Security

Permissions, secrets, blast radius

Credential storage: LLM provider keys in env vars
Data egress: LLM provider endpoints
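
Since keys live in environment variables, it is worth failing early with a readable message when one is missing. This is a minimal sketch. The helper name require_env is hypothetical, and the variable name in the usage comment is an assumption that depends on your provider.

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable or fail with a clear message."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Set {name} in your environment before running DSPy code")
    return value


# Usage (variable name is an assumption): api_key = require_env("OPENAI_API_KEY")
```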

Troubleshooting

Common errors and fixes

Teleprompter seems to make things worse

Check metric correctness; use a held-out set; widen the example pool.
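
A quick way to confirm whether compilation helped is to score the uncompiled and compiled programs on the same held-out set and keep the compiled one only if it wins. The sketch below uses plain callables rather than DSPy objects. The function names and the toy data are illustrative assumptions.

```python
def accuracy(program, dataset, metric):
    """Mean metric score of `program` over (inputs, expected) pairs."""
    scores = [metric(expected, program(inputs)) for inputs, expected in dataset]
    return sum(scores) / len(scores)


def keep_if_better(baseline, compiled, heldout, metric, min_gain=0.0):
    """Return (chosen program, baseline score, compiled score)."""
    base = accuracy(baseline, heldout, metric)
    new = accuracy(compiled, heldout, metric)
    return (compiled if new - base > min_gain else baseline), base, new


# Toy stand-ins for a baseline and a compiled program.
heldout = [("refund please", "billing"), ("app crashes", "technical")]
always_billing = lambda text: "billing"
keyword = lambda text: "technical" if "crash" in text else "billing"
exact = lambda expected, predicted: float(expected == predicted)

chosen, base, new = keep_if_better(always_billing, keyword, heldout, exact)
print(base, new)  # 0.5 1.0
```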

Optimization eats your budget

Cap max_bootstrapped_demos and use cheaper models during search.

Alternatives

dspy-skills vs others

Alternative             When to use it instead                                         Tradeoff
prompt-architect-skill  You want prompt-level craft, not DSPy's programmatic approach  Hand-crafted vs optimized

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues
