opik-mcp: Install & Live Demo

Why use it

Key features

Official, maintained by Comet
Prompt lifecycle: list, get, version, promote
Workspace / project / trace exploration
Dataset + metric operations for evals
Cloud (comet.com) or self-hosted; HTTP bearer auth supported

Live demo

What it looks like in practice

Installation

Choose your client

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "opik": {
      "command": "npx",
      "args": [
        "-y",
        "opik-mcp"
      ],
      "_inferred": true
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.

Global: ~/.cursor/mcp.json · Project: .cursor/mcp.json

{
  "mcpServers": {
    "opik": {
      "command": "npx",
      "args": [
        "-y",
        "opik-mcp"
      ],
      "_inferred": true
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. The project config takes precedence over the global one.

VS Code → Cline → MCP Servers → Edit

{
  "mcpServers": {
    "opik": {
      "command": "npx",
      "args": [
        "-y",
        "opik-mcp"
      ],
      "_inferred": true
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json

{
  "mcpServers": {
    "opik": {
      "command": "npx",
      "args": [
        "-y",
        "opik-mcp"
      ],
      "_inferred": true
    }
  }
}

Same format as Claude Desktop. Restart Windsurf to apply the change.

~/.continue/config.json

{
  "mcpServers": [
    {
      "name": "opik",
      "command": "npx",
      "args": [
        "-y",
        "opik-mcp"
      ]
    }
  ]
}

Continue uses an array of server objects, not a map.
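The difference between the map form and Continue's array form can be sketched as a small conversion. This is an illustrative helper, not part of opik-mcp or Continue; the function name is hypothetical, and dropping underscore-prefixed keys such as "_inferred" is an assumption made so the output matches the Continue snippet above.

```python
import json


def map_to_continue_array(config: dict) -> dict:
    """Turn {"mcpServers": {name: spec}} into {"mcpServers": [{"name": name, ...spec}]}."""
    servers = config.get("mcpServers", {})
    return {
        "mcpServers": [
            # Assumption: keys starting with "_" are page metadata, not server settings.
            {"name": name, **{k: v for k, v in spec.items() if not k.startswith("_")}}
            for name, spec in servers.items()
        ]
    }


claude_style = {
    "mcpServers": {
        "opik": {"command": "npx", "args": ["-y", "opik-mcp"], "_inferred": True}
    }
}
continue_style = map_to_continue_array(claude_style)
print(json.dumps(continue_style, indent=2))
```

The server name moves from the map key into a "name" field; everything else is carried over unchanged.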

~/.config/zed/settings.json

{
  "context_servers": {
    "opik": {
      "command": {
        "path": "npx",
        "args": [
          "-y",
          "opik-mcp"
        ]
      }
    }
  }
}

Add the entry to context_servers. Zed reloads automatically.

claude mcp add opik -- npx -y opik-mcp

One-line command. Verify: claude mcp list. Remove: claude mcp remove opik.

Use cases

Real-world scenarios: opik-mcp

Pull a production trace into your IDE to debug a bad LLM response

👤 LLM app developers ⏱ ~15 min intermediate

When to use: A user reports a wrong answer; the trace is in Opik; you want to inspect it without leaving Cursor.

Prerequisites

Opik API key — comet.com/site > API Keys (or self-hosted admin)

Flow

Find the trace

Search traces in project 'prod-chatbot' where output contains 'I cannot help with that'. Last 24h.

→ Matching trace IDs + timestamps
Inspect

Open trace ID abc123. Show me the full message chain, tools called, and intermediate reasoning.

→ Full trace object
Form hypothesis

Why might the model have refused? Compare this trace to a successful one on the same prompt template.

→ Diff + hypothesis

Outcome: Faster trace-driven debugging without app-switching.

Pitfalls

PII in traces — Configure Opik's redaction before enabling MCP access broadly

Iterate on a prompt template with version tracking

👤 Prompt engineers ⏱ ~25 min advanced

When to use: You're tuning a system prompt and want each version saved to Opik for rollback.

Flow

Pull current version

Get latest version of prompt 'support-agent-system'.

→ Current prompt body
Edit and commit

Propose a change to handle escalations better. Show diff. Commit as a new version with message 'add escalation path'.

→ Diff + new version ID
Eval against dataset

Run this new version against dataset 'support-eval-v1'. Compare pass rate vs previous version.

→ Metric comparison

Outcome: Data-driven prompt changes, version-controlled.

Pitfalls

No guardrails (a regressive prompt can reach production): use Opik's experiment framework and don't promote a version until its pass rate ≥ baseline
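The promotion rule in the pitfall above (hold a version back until its pass rate is at least the baseline) can be sketched as a small gate. This is a minimal illustration under stated assumptions: eval results are modeled as plain lists of pass/fail booleans, and the function names are hypothetical rather than part of Opik's experiment framework.

```python
def pass_rate(results: list[bool]) -> float:
    """Fraction of eval items that passed; 0.0 for an empty run."""
    return sum(results) / len(results) if results else 0.0


def may_promote(candidate: list[bool], baseline: list[bool]) -> bool:
    """Allow promotion only when the candidate pass rate is at least the baseline's."""
    return pass_rate(candidate) >= pass_rate(baseline)


baseline_run = [True, True, False, True]     # pass rate 0.75
candidate_run = [True, False, False, True]   # pass rate 0.50
print(may_promote(candidate_run, baseline_run))  # False
```

In practice the two lists would come from running both prompt versions against the same dataset, so the comparison stays like for like.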

Generate a weekly LLM app health report

👤 Eng leads, LLM app PMs ⏱ ~30 min intermediate

When to use: You want a Monday-morning digest of cost, latency, error rate, and top failure categories.

Flow

Pull last week's metrics

For project 'prod-chatbot': total traces, total tokens, avg latency p50/p95, error count — over last 7 days.

→ Metrics block
Classify failures

Sample 20 failed traces. Cluster by failure mode. Rank clusters by frequency.

→ Failure taxonomy
Write the digest

Compose a Markdown digest with the metrics and top 3 failure modes, ready for Slack.

→ Shareable report

Outcome: Weekly LLM ops awareness without manual dashboard time.

Pitfalls

Metric drift as your app evolves — Version the report template; compare apples to apples week over week

Combine with: notion
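The digest flow in this scenario (latency percentiles, failure clusters ranked by frequency, a Markdown summary) can be sketched offline as below. It is only an illustration: the inputs are plain Python lists standing in for data pulled from Opik, the failure labels are assumed to have been assigned already, the nearest-rank percentile is one of several common definitions, and all names are hypothetical.

```python
import math
from collections import Counter


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def build_digest(project: str, latencies_ms: list[float],
                 failure_labels: list[str], top_n: int = 3) -> str:
    """Compose a short Markdown digest with metrics and the top failure modes."""
    top = Counter(failure_labels).most_common(top_n)
    lines = [
        f"## Weekly LLM health: {project}",
        f"- Traces: {len(latencies_ms)}",
        f"- Latency p50/p95: {percentile(latencies_ms, 50)} / {percentile(latencies_ms, 95)} ms",
        f"- Failures: {len(failure_labels)}",
        "",
        "Top failure modes:",
    ]
    lines += [f"{i}. {label} ({n})" for i, (label, n) in enumerate(top, start=1)]
    return "\n".join(lines)


latencies = [120, 340, 200, 900, 150]
labels = ["refusal", "timeout", "refusal", "hallucination", "timeout", "refusal"]
digest = build_digest("prod-chatbot", latencies, labels)
print(digest)
```

Keeping the template in one function also addresses the drift pitfall: the function can be versioned so that week-over-week reports are built the same way.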

Combinations

Combine with other MCPs for a 10x effect

opik + github

When a prompt regresses, open a GitHub issue with the failing trace

If pass rate drops >5% on 'support-eval-v1' vs last week, create a GitHub issue with the top 3 failing trace IDs.
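The regression check in this prompt can be sketched as a pure function that decides whether to file an issue and drafts its text. Two assumptions are baked in and labeled in the code: ">5%" is read as a drop of more than 5 percentage points, and the trace IDs and field names are illustrative placeholders rather than anything returned by Opik or expected by GitHub.

```python
from typing import Optional


def regression_issue(dataset: str, current: float, previous: float,
                     failing_trace_ids: list[str],
                     threshold: float = 0.05) -> Optional[dict]:
    """Return an issue draft when the pass rate dropped by more than `threshold`."""
    # Assumption: the threshold is in absolute percentage points (0.05 = 5 points).
    drop = previous - current
    if drop <= threshold:
        return None
    body_lines = [
        f"Pass rate on '{dataset}' fell from {previous:.0%} to {current:.0%}.",
        "Top failing traces:",
        *[f"- {trace_id}" for trace_id in failing_trace_ids[:3]],
    ]
    return {"title": f"Prompt regression on {dataset}", "body": "\n".join(body_lines)}


issue = regression_issue(
    "support-eval-v1", current=0.84, previous=0.92,
    failing_trace_ids=["t-101", "t-102", "t-103", "t-104"],  # hypothetical IDs
)
print(issue["title"])
print(issue["body"])
```

The returned draft would then be handed to whatever creates the GitHub issue; that step is left out here on purpose.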

opik + notion

Publish weekly LLM health digest to Notion

Compose a Monday digest from last week's Opik metrics and create a Notion page in 'LLM Weekly'.

Tools

What this MCP provides

Tool	Inputs	When to call	Cost
list_projects	workspace_id?	Navigate your workspace	1 API call
list_traces	project, filter?, start?, end?, limit?	Find traces by time range or content	1 API call
get_trace	trace_id	Deep-dive a single trace	1 API call
get_prompt	name, version?	Read a prompt for editing or use in code	1 API call
create_prompt_version	name, template, message?	Commit a new prompt iteration	1 API call
create_dataset	name, items[]	Build an eval dataset	1 API call
get_metrics	project, metric, window	Monitor cost / latency / quality	1 API call
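For orientation, an MCP client invokes any of these tools by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below only builds such a message; the tool name and argument come from the table above, the trace ID is the placeholder used earlier on this page, and the helper name is hypothetical. The actual argument schema should be confirmed against the server's own tool listing.

```python
import json
from itertools import count

_request_ids = count(1)  # simple incrementing JSON-RPC request IDs


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call("get_trace", {"trace_id": "abc123"})
print(json.dumps(request, indent=2))
```

Clients such as Claude Desktop or Cursor construct these messages for you; the example is only meant to show what "1 API call" corresponds to on the wire.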

Cost and limits

What it costs

API quota: Opik Cloud has per-plan limits; self-hosted is unlimited
Tokens per call: Trace listings 1k-5k tokens; single traces 500-3,000 tokens
Pricing: Opik has a generous free tier, with paid plans for scale. The MCP server itself is free (Apache 2.0).
Tip: Use list_traces with a time window; never call it without a range on a busy project.
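The tip above can be sketched as a helper that always attaches a bounded window and a limit to list_traces arguments. The argument names (project, start, end, limit) follow the tools table; the ISO 8601 UTC timestamp format is an assumption, as is the helper name, so check the format the server expects before relying on it.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def list_traces_args(project: str, hours: int = 24, limit: int = 50,
                     now: Optional[datetime] = None) -> dict:
    """Build list_traces arguments covering the last `hours` hours."""
    end = now if now is not None else datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    return {
        "project": project,
        # Assumption: timestamps are passed as ISO 8601 strings in UTC.
        "start": start.isoformat(),
        "end": end.isoformat(),
        "limit": limit,
    }


fixed_now = datetime(2024, 1, 2, tzinfo=timezone.utc)
args = list_traces_args("prod-chatbot", hours=24, now=fixed_now)
print(args)
```

Passing `now` explicitly keeps the helper deterministic in tests; in normal use it defaults to the current UTC time.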

Security

Permissions, secrets, blast radius

Minimum scopes: An Opik API key scoped to the workspace you intend to expose

Credential storage: OPIK_API_KEY env var; the HTTP transport uses Authorization: Bearer

Data egress: Traces may contain prompts/responses with PII, so understand your Opik region and redaction setup

Never grant: An admin-scope key to a shared dev machine

Trace data is often PII-heavy (user prompts). Redact upstream before Opik ingests.
For self-hosted, the MCP needs network reach to your Opik backend — include it in your VPN/firewall plan.
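The advice to redact upstream can be sketched as a filter applied to text before it is logged as a trace. This is not Opik's own redaction feature: the two regular expressions are deliberately simple illustrations that will miss many PII formats, and the function name is hypothetical.

```python
import re

# Illustrative patterns only; real PII detection needs broader coverage.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace email addresses and phone-like digit runs with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


sample = "Reach me at jane.doe@example.com or +1 (555) 010-9999."
print(redact(sample))  # Reach me at [EMAIL] or [PHONE].
```

Running the filter in the application, before the tracing call, means neither Opik nor any MCP client connected to it ever sees the raw values.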

Troubleshooting

Common errors and fixes

401 Unauthorized (Bearer)

Check OPIK_API_KEY. For self-hosted, also set --apiUrl http://host:5173/api.

Verify: curl -H "Authorization: Bearer $KEY" "$URL/api/v1/workspaces" (double quotes are needed so the shell expands $KEY)
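The same check can be sketched in Python with only the standard library. The endpoint path and Bearer header mirror the curl command above; the localhost base URL and the placeholder key are assumptions for illustration, and the function names are hypothetical. Only the request object is built at import time; check_auth performs the network call when invoked.

```python
import urllib.error
import urllib.request


def build_workspace_check(base_url: str, api_key: str) -> urllib.request.Request:
    """Build the GET request used to verify the API key."""
    url = base_url.rstrip("/") + "/api/v1/workspaces"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def check_auth(base_url: str, api_key: str, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code (401 means a bad key)."""
    request = build_workspace_check(base_url, api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


# Assumption: a self-hosted instance on localhost; replace with your own URL and key.
request = build_workspace_check("http://localhost:5173/", "example-key")
print(request.full_url)
```

Calling check_auth with the real base URL and the value of OPIK_API_KEY should return 200 when the key is accepted.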

Empty trace list despite traffic

Wrong project / workspace. List projects first and confirm UUID.

Self-hosted MCP can't reach backend

Use container networking (same docker network) or map --apiUrl to an externally-reachable URL.

Alternatives

opik-mcp compared

Alternative	When to use	Trade-off
LangSmith MCP	You use LangSmith for tracing	Different platform; similar capabilities
Langfuse MCP	You use Langfuse (OSS)	Also OSS + self-hostable; different schemas
Arize / Phoenix	You want focus on evals + drift detection	Richer ML-monitoring features; steeper learning curve

More

Resources

📖 Read the official README on GitHub

🐙 Open issues
