
mcp-client-for-ollama

by jonigl · jonigl/mcp-client-for-ollama

TUI client that connects local Ollama models to any MCP server — agent mode, multi-server, human-in-the-loop, all from your terminal.

mcp-client-for-ollama (ollmcp) is a terminal UI that bridges Ollama's local LLMs with MCP servers. It supports agent mode (iterative tool execution), multi-server connections (STDIO/SSE/HTTP), human-in-the-loop approval, model switching, thinking mode, and performance metrics. For developers who want MCP tool-use without paying for cloud APIs.
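
The fastest way to try it is directly from a terminal. A minimal quick-start sketch, using the same install command, model, and flags that appear later on this page:

# install the ollmcp CLI (the uvx examples below use the mcp-client-for-ollama package name)
pip install ollmcp
# pull a local model for Ollama to serve (the 3B model used in the examples below)
ollama pull llama3.2:3b
# launch the TUI; --auto-discovery picks up MCP servers from your Claude Desktop config
ollmcp --auto-discovery --model llama3.2:3b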


Install

Pick your client

~/Library/Application Support/Claude/claude_desktop_config.json  · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "client-for-ollama": {
      "command": "uvx",
      "args": [
        "mcp-client-for-ollama"
      ]
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.

~/.cursor/mcp.json · .cursor/mcp.json
{
  "mcpServers": {
    "client-for-ollama": {
      "command": "uvx",
      "args": [
        "mcp-client-for-ollama"
      ]
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config wins over global.

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "client-for-ollama": {
      "command": "uvx",
      "args": [
        "mcp-client-for-ollama"
      ]
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "client-for-ollama": {
      "command": "uvx",
      "args": [
        "mcp-client-for-ollama"
      ]
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "client-for-ollama",
      "command": "uvx",
      "args": [
        "mcp-client-for-ollama"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.

~/.config/zed/settings.json
{
  "context_servers": {
    "client-for-ollama": {
      "command": {
        "path": "uvx",
        "args": [
          "mcp-client-for-ollama"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.

claude mcp add client-for-ollama -- uvx mcp-client-for-ollama

One-liner. Verify with claude mcp list. Remove with claude mcp remove client-for-ollama.
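
A sketch of the full register / verify / remove cycle with the Claude Code CLI (the server name matches the entry added above):

# register ollmcp as an MCP server entry
claude mcp add client-for-ollama -- uvx mcp-client-for-ollama

# confirm the entry is listed
claude mcp list

# remove it when you no longer need it
claude mcp remove client-for-ollama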

Use Cases

Real-world ways to use mcp-client-for-ollama

How to use MCP tools with local LLMs for free

👤 Developers who want MCP functionality without cloud API costs ⏱ ~15 min beginner

When to use: You have MCP servers configured but want to use them with a local model instead of Claude or GPT.

Prerequisites
  • Ollama installed and running — see ollama.com — then pull a model: ollama pull llama3.2:3b
  • ollmcp installed — pip install ollmcp
Flow
  1. Launch with auto-discovery
    ollmcp --auto-discovery --model llama3.2:3b
    → TUI launches, shows discovered MCP servers from Claude config
  2. Test a tool call
    List the files in my current directory.
    → Model calls the filesystem MCP tool and returns results
  3. Enable agent mode for multi-step tasks
    Type /agent to enable agent mode, then: 'Find all TODO comments in this project and summarize them.'
    → Model iterates: searches files, reads matches, produces summary

Outcome: Working MCP tool-use powered entirely by a local model — zero API cost.

Pitfalls
  • Small models (3B) struggle with complex tool-use chains — Use 7B+ models for agent mode; 3B is fine for single tool calls
  • Model calls wrong tool or wrong params — Enable human-in-the-loop (/hil) to catch and correct bad tool calls
Combine with: filesystem

Connect to multiple MCP servers for a cross-tool workflow

👤 Power users running several MCP servers ⏱ ~20 min intermediate

When to use: You want a local LLM to orchestrate across filesystem, GitHub, and other MCP servers in one session.

Prerequisites
  • MCP server config JSON — Create a JSON file with all your server definitions (same format as Claude Desktop config; see the sketch below)
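
A hedged sketch of what ~/mcp-servers.json could look like for this use case, wiring up the reference filesystem and GitHub servers. The npx package names, the token variable, and the project path are examples to adapt to whatever servers you actually run:

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/your/project"]
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "<your-token>"
      }
    }
  }
}
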
Flow
  1. Launch with config
    ollmcp --servers-json ~/mcp-servers.json --model qwen2.5:7b
    → All servers connected, tools listed
  2. Use /tools to manage
    Type /tools to see all available tools across servers. Disable any you don't need.
    → Tool list with enable/disable toggles
  3. Run a cross-server task
    Read my project's README.md, then search GitHub for similar projects and compare their approaches.
    → Model chains filesystem read + GitHub search

Outcome: A local LLM orchestrating tools from multiple MCP servers in one conversation.

Pitfalls
  • Too many tools confuse smaller models — Disable unused tools with /tools — fewer tools means better tool selection
Combine with: github

Use human-in-the-loop to safely test MCP servers

👤 Developers building or testing new MCP servers ⏱ ~15 min beginner

When to use: You're developing an MCP server and want to test tool calls with approval before execution.

Flow
  1. Enable HIL mode
    ollmcp -s ./my-mcp-server.py --model llama3.2:3b then type /hil to enable
    → Human-in-the-loop enabled
  2. Trigger tool calls
    Ask the model to use your server's tools. Each call pauses for your approval.
    → Tool call preview shown, waiting for y/n
  3. Review and iterate
    Reject bad calls, approve good ones. Check /show-metrics for timing.
    → Approved calls execute; rejected ones prompt retry

Outcome: A safe testing loop for MCP server development with full visibility into every tool call.
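
The flow above points ollmcp at a placeholder script, ./my-mcp-server.py. If you need a toy server to exercise the loop, here is a minimal sketch assuming the official mcp Python SDK and its FastMCP helper; the server name and echo tool are made up for illustration, not part of this project:

from mcp.server.fastmcp import FastMCP

# server name shown to the client when it connects
mcp = FastMCP("demo-server")

@mcp.tool()
def echo(text: str) -> str:
    """Echo the provided text back to the model."""
    return text

if __name__ == "__main__":
    # defaults to the STDIO transport used by the -s launch above
    mcp.run()

With /hil enabled, each echo call pauses for your approval before it executes.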

Combinations

Pair with other MCPs for 10x leverage

client-for-ollama + filesystem

Local Ollama model reads and edits files via filesystem MCP — fully offline coding assistant

ollmcp --auto-discovery --model qwen2.5:7b then: 'Read src/main.py and add error handling to the parse function.'
client-for-ollama + github

Use a local model to search GitHub repos and create issues without cloud API costs

Connect GitHub MCP to ollmcp, then: 'Search our org for repos with no CI config and create tracking issues.'

Cost & Limits

What this costs to run

API quota
No external API quota — runs on local Ollama. MCP server limits apply per-server.
Tokens per call
Depends on Ollama model context window (typically 2k-128k)
Monetary
Completely free. Ollama is free. ollmcp is MIT-licensed. You only pay for electricity.
Tip
This is the zero-cost option for MCP tool-use. Use smaller models (3B) for simple tasks, larger (7B+) for agent mode.
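
For instance, keeping one small and one larger model pulled lets you switch per task (the model names are the ones used elsewhere on this page):

# small and fast: fine for single tool calls
ollama pull llama3.2:3b
# stronger tool-use: better suited to agent mode
ollama pull qwen2.5:7b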

Security

Permissions, secrets, blast radius

Credential storage: MCP server credentials are set in the servers JSON file or via environment variables — the same approach as the Claude Desktop config
Data egress: Ollama runs locally. MCP servers egress to their own backends. No data leaves your machine by default unless an MCP server sends it.

Troubleshooting

Common errors and fixes

Connection refused to Ollama

Ensure Ollama is running: ollama serve. Default port is 11434. Use --host to specify a custom host if needed.

Verify: curl http://localhost:11434/api/tags
Model doesn't call tools / ignores MCP

Not all models support tool-use well. Use qwen2.5:7b or llama3.1:8b which have good tool-use training. Avoid tiny models for complex chains.

Verify: Test with a simple single-tool prompt first
MCP server fails to connect

Check your servers JSON config. Ensure the command path is absolute and the server binary/script exists. Run the server command manually to see errors.

Verify: Run the MCP server command directly in a terminal
Auto-discovery finds no servers

ollmcp looks for Claude Desktop config at the standard path. On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

Verify: ls ~/Library/Application\ Support/Claude/claude_desktop_config.json

Alternatives

mcp-client-for-ollama vs others

Alternative · When to use it instead · Tradeoff
Claude Desktop · You want the best MCP experience and are willing to pay for Claude API · Cloud-based, costs money, but much better tool-use than local models
oterm · You want an Ollama TUI without MCP focus · No MCP integration; just chat

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues
