opik-mcp

by comet-ml · comet-ml/opik-mcp

Comet's official Opik MCP — manage prompts, projects, traces, and metrics of your LLM apps from Claude or Cursor without switching tabs.

Opik is an LLM observability platform (prompts, traces, evals, datasets). This official MCP gives your IDE/agent access to those primitives: list traces, pull prompts, create datasets, inspect metrics. Works with Opik Cloud or self-hosted.

Installation

Choose your client

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "opik": {
      "command": "npx",
      "args": [
        "-y",
        "opik-mcp"
      ],
      "_inferred": true
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.
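
If the config file already lists other servers, pasting the snippet over it would drop them. The following is a minimal Python sketch of a merge that keeps existing entries. The helper name `add_opik_server` is hypothetical, and the entry leaves out the `_inferred` marker on the assumption that it is catalog metadata rather than a client setting.

```python
import json
from pathlib import Path


def add_opik_server(config_path: Path) -> dict:
    """Add the opik entry to an mcpServers-style config, keeping other keys."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        config = json.loads(text) if text.strip() else {}
    # Reuse the existing server map if there is one, otherwise create it.
    servers = config.setdefault("mcpServers", {})
    # Same command and args as the snippet above.
    servers["opik"] = {"command": "npx", "args": ["-y", "opik-mcp"]}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

On macOS this could be called as `add_opik_server(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json")`; back up the file first.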

~/.cursor/mcp.json (global) · .cursor/mcp.json (project)
{
  "mcpServers": {
    "opik": {
      "command": "npx",
      "args": [
        "-y",
        "opik-mcp"
      ],
      "_inferred": true
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. The project config takes precedence over the global one.

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "opik": {
      "command": "npx",
      "args": [
        "-y",
        "opik-mcp"
      ],
      "_inferred": true
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "opik": {
      "command": "npx",
      "args": [
        "-y",
        "opik-mcp"
      ],
      "_inferred": true
    }
  }
}

Same format as Claude Desktop. Restart Windsurf to apply.

~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "opik",
      "command": "npx",
      "args": [
        "-y",
        "opik-mcp"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.

~/.config/zed/settings.json
{
  "context_servers": {
    "opik": {
      "command": {
        "path": "npx",
        "args": [
          "-y",
          "opik-mcp"
        ]
      }
    }
  }
}

Add it under context_servers. Zed reloads automatically.
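
The Continue and Zed snippets carry the same command and args as the mcpServers map used by the other clients; only the shape differs. As an illustration, here is a small Python sketch that derives both shapes from the map form. The function names `to_continue` and `to_zed` are hypothetical, and the output mirrors only the fields shown in the snippets above.

```python
def to_continue(mcp_servers: dict) -> dict:
    """Map form -> Continue's array of server objects, each carrying a name."""
    return {
        "mcpServers": [
            {"name": name, "command": spec["command"], "args": list(spec.get("args", []))}
            for name, spec in mcp_servers.items()
        ]
    }


def to_zed(mcp_servers: dict) -> dict:
    """Map form -> Zed's context_servers, where command is an object with path and args."""
    return {
        "context_servers": {
            name: {"command": {"path": spec["command"], "args": list(spec.get("args", []))}}
            for name, spec in mcp_servers.items()
        }
    }


# The map form used by Claude Desktop, Cursor, Cline, and Windsurf above.
servers = {"opik": {"command": "npx", "args": ["-y", "opik-mcp"]}}
continue_config = to_continue(servers)
zed_config = to_zed(servers)
```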

claude mcp add opik -- npx -y opik-mcp

One-line command. Verify: claude mcp list. Remove: claude mcp remove opik.

Use cases

Real-world scenarios: opik-mcp

Pull a production trace into your IDE to debug a bad LLM response

👤 LLM app developers ⏱ ~15 min intermediate

When to use: A user reports a wrong answer; the trace is in Opik; you want to inspect it without leaving Cursor.

Prerequisites
  • Opik API key: comet.com/site > API Keys (or self-hosted admin)
Flow
  1. Find the trace
    Search traces in project 'prod-chatbot' where output contains 'I cannot help with that'. Last 24h.
    → Matching trace IDs + timestamps
  2. Inspect
    Open trace ID abc123. Show me the full message chain, tools called, and intermediate reasoning.
    → Full trace object
  3. Form hypothesis
    Why might the model have refused? Compare this trace to a successful one on the same prompt template.
    → Diff + hypothesis

Outcome: Faster trace-driven debugging without app-switching.

Pitfalls
  • PII in traces: configure Opik's redaction before enabling MCP access broadly
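
Step 1 of the flow above amounts to a time-window filter plus a substring match. A rough Python sketch of that logic follows, run over already-fetched trace records. The dict fields `id`, `timestamp`, and `output` are assumptions for illustration, not Opik's actual trace schema.

```python
from datetime import datetime, timedelta, timezone


def find_matching_traces(traces, needle, now, window=timedelta(hours=24)):
    """Return (id, timestamp) for traces inside the window whose output contains needle."""
    cutoff = now - window
    return [
        (t["id"], t["timestamp"])
        for t in traces
        if cutoff <= t["timestamp"] <= now and needle in t.get("output", "")
    ]


# Hypothetical records standing in for a list_traces result.
now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
sample = [
    {"id": "t1", "timestamp": now - timedelta(hours=2), "output": "I cannot help with that."},
    {"id": "t2", "timestamp": now - timedelta(hours=30), "output": "I cannot help with that."},
    {"id": "t3", "timestamp": now - timedelta(hours=1), "output": "Sure, here you go."},
]
matches = find_matching_traces(sample, "I cannot help with that", now)
```

Only `t1` survives here: `t2` falls outside the 24-hour window and `t3` does not contain the phrase.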

Iterate on a prompt template with version tracking

👤 Prompt engineers ⏱ ~25 min advanced

When to use: You're tuning a system prompt and want each version saved to Opik for rollback.

Flow
  1. Pull current version
    Get latest version of prompt 'support-agent-system'.
    → Current prompt body
  2. Edit and commit
    Propose a change to handle escalations better. Show diff. Commit as a new version with message 'add escalation path'.
    → Diff + new version ID
  3. Eval against dataset
    Run this new version against dataset 'support-eval-v1'. Compare pass rate vs previous version.
    → Metric comparison

Outcome: Data-driven prompt changes, version-controlled.

Pitfalls
  • No guardrails, so a regressive prompt can reach prod: use Opik's experiment framework and don't promote until pass rate ≥ baseline
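
The promotion rule in the pitfall above (pass rate ≥ baseline) can be written as a small gate. This is a minimal Python sketch that assumes each eval result has already been reduced to a pass/fail boolean. It does not use Opik's experiment framework, and the function names are hypothetical.

```python
def pass_rate(results):
    """Fraction of passing results; results is a non-empty list of booleans."""
    if not results:
        raise ValueError("no eval results")
    return sum(results) / len(results)


def should_promote(candidate_results, baseline_results):
    """Promote only if the candidate's pass rate is at least the baseline's."""
    return pass_rate(candidate_results) >= pass_rate(baseline_results)
```

Raising on an empty result list is a deliberate choice: an eval run that produced nothing should block promotion rather than count as a pass.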

Generate a weekly LLM app health report

👤 Eng leads, LLM app PMs ⏱ ~30 min intermediate

When to use: You want a Monday-morning digest of cost, latency, error rate, and top failure categories.

Flow
  1. Pull last week's metrics
    For project 'prod-chatbot': total traces, total tokens, latency p50/p95, error count, over the last 7 days.
    → Metrics block
  2. Classify failures
    Sample 20 failed traces. Cluster by failure mode. Rank clusters by frequency.
    → Failure taxonomy
  3. Write the digest
    Compose a Markdown digest with the metrics and top 3 failure modes, ready for Slack.
    → Shareable report

Outcome: Weekly LLM ops awareness without manual dashboard time.

Pitfalls
  • Metric drift as your app evolves: version the report template and compare like with like week over week
Combine with: notion
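
To make the digest in this scenario concrete, here is a hedged Python sketch that computes the metrics and ranks failure modes from trace records. The fields `latency_ms`, `tokens`, and `failure_mode` are hypothetical, failure modes are assumed to be labelled already, and the percentile uses the nearest-rank method, which may differ from how Opik computes p50/p95.

```python
import math
from collections import Counter


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def weekly_digest(project, traces, top_n=3):
    """Build a Markdown digest from a non-empty list of trace dicts."""
    latencies = [t["latency_ms"] for t in traces]
    # Traces without a failure_mode are treated as successful.
    failures = Counter(t["failure_mode"] for t in traces if t.get("failure_mode"))
    lines = [
        f"# Weekly LLM health: {project}",
        f"- Total traces: {len(traces)}",
        f"- Total tokens: {sum(t['tokens'] for t in traces)}",
        f"- Latency p50/p95: {percentile(latencies, 50)} / {percentile(latencies, 95)} ms",
        f"- Errors: {sum(failures.values())}",
        "## Top failure modes",
    ]
    ranked = failures.most_common(top_n)
    lines += [f"{i}. {mode} ({count})" for i, (mode, count) in enumerate(ranked, 1)]
    return "\n".join(lines)
```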

Combinations

Combine with other MCPs for a 10x effect

opik + github

When a prompt regresses, open a GitHub issue with the failing trace

If pass rate drops >5% on 'support-eval-v1' vs last week, create a GitHub issue with the top 3 failing trace IDs.

opik + notion

Publish weekly LLM health digest to Notion

Compose a Monday digest from last week's Opik metrics and create a Notion page in 'LLM Weekly'.
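
The opik + github rule above can be sketched as a threshold check that produces issue text. This Python sketch reads ">5%" as a drop of more than 5 percentage points, which is an assumption, since the prompt does not say whether the drop is absolute or relative. It only builds a title and body and does not call the GitHub API. The function names are hypothetical.

```python
def pass_rate_drop_points(previous_rate, current_rate):
    """Drop in percentage points (positive when the current rate is lower)."""
    return (previous_rate - current_rate) * 100


def regression_issue(dataset, previous_rate, current_rate, failing_trace_ids, threshold_points=5.0):
    """Return an issue title/body dict if the drop exceeds the threshold, else None."""
    drop = pass_rate_drop_points(previous_rate, current_rate)
    if drop <= threshold_points:
        return None
    # Keep only the first three IDs; the caller is assumed to pass them ranked.
    top = failing_trace_ids[:3]
    body_lines = [
        f"Pass rate on '{dataset}' fell from {previous_rate:.0%} to {current_rate:.0%}.",
        "Top failing traces:",
    ] + [f"- {trace_id}" for trace_id in top]
    return {
        "title": f"Prompt regression on {dataset} ({drop:.1f} points)",
        "body": "\n".join(body_lines),
    }
```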

Tools

What this MCP provides

| Tool | Inputs | When to call | Cost |
|---|---|---|---|
| list_projects | workspace_id? | Navigate your workspace | 1 API call |
| list_traces | project, filter?, start?, end?, limit? | Find traces by time range or content | 1 API call |
| get_trace | trace_id | Deep-dive a single trace | 1 API call |
| get_prompt | name, version? | Read a prompt for editing or use in code | 1 API call |
| create_prompt_version | name, template, message? | Commit a new prompt iteration | 1 API call |
| create_dataset | name, items[] | Build an eval dataset | 1 API call |
| get_metrics | project, metric, window | Monitor cost / latency / quality | 1 API call |
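
An MCP client invokes these tools with a JSON-RPC 2.0 `tools/call` request. Your IDE normally builds this message for you, so the Python sketch below only illustrates what a `list_traces` call might look like on the wire. The argument names come from the table above; the ISO 8601 timestamp format and the sample values are assumptions.

```python
import json


def tool_call_request(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 'tools/call' message as defined by MCP."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


request = tool_call_request(
    1,
    "list_traces",
    {
        "project": "prod-chatbot",
        "start": "2025-01-01T00:00:00Z",  # assumed format; the table does not specify one
        "end": "2025-01-02T00:00:00Z",
        "limit": 20,
    },
)
print(json.dumps(request, indent=2))
```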

Cost and limits

What it costs

API quota: Opik Cloud has per-plan limits; self-hosted is unlimited.
Tokens per call: Trace listings 1k-5k tokens; single traces 500-3000.
Pricing: Opik has a generous free tier; paid plans for scale. MCP itself is free (Apache 2.0).
Tip: Use list_traces with a time window; never call without a range on a busy project.
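
The per-call token ranges above allow a quick budget estimate before a session. This is a back-of-the-envelope Python sketch using only those quoted ranges; the function name is hypothetical and real usage depends on trace size.

```python
# Per-call token ranges quoted above, as (low, high).
LISTING_TOKENS = (1_000, 5_000)
SINGLE_TRACE_TOKENS = (500, 3_000)


def estimate_tokens(listings, single_traces):
    """Return the (low, high) token range for a given number of tool calls."""
    low = listings * LISTING_TOKENS[0] + single_traces * SINGLE_TRACE_TOKENS[0]
    high = listings * LISTING_TOKENS[1] + single_traces * SINGLE_TRACE_TOKENS[1]
    return low, high


# One listing plus 20 individual traces, as in the weekly-report scenario.
low, high = estimate_tokens(listings=1, single_traces=20)
```

For that example the estimate comes to between 11,000 and 65,000 tokens, which is why sampling traces rather than pulling all of them matters.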

Security

Permissions, secrets, blast radius

Minimum scopes: An Opik API key scoped to the workspace you intend to expose
Credential storage: OPIK_API_KEY env var; HTTP transport uses Authorization: Bearer
Outbound traffic: Traces may contain prompts/responses with PII, so understand your Opik region and redaction setup
Never grant: An admin-scope key to a shared dev machine
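
For scripts that talk to the same backend, the credential guidance above suggests reading the key from the environment rather than hard-coding it. Here is a minimal Python sketch of that pattern. The helper name is hypothetical, and the header format follows the "Authorization: Bearer" note above.

```python
import os


def opik_auth_headers(env=os.environ):
    """Build the Authorization header from OPIK_API_KEY, failing loudly if unset."""
    key = env.get("OPIK_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPIK_API_KEY is not set")
    return {"Authorization": f"Bearer {key}"}
```

Passing the environment as a parameter keeps the function testable without touching the real process environment.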

Troubleshooting

Common errors and fixes

401 Unauthorized (Bearer)

Check OPIK_API_KEY. For self-hosted, also set --apiUrl http://host:5173/api.

Verify: curl -H "Authorization: Bearer $KEY" "$URL/api/v1/workspaces"

Empty trace list despite traffic

Wrong project / workspace. List projects first and confirm UUID.

Self-hosted MCP can't reach backend

Use container networking (same docker network) or point --apiUrl at an externally reachable URL.
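
Before changing the network setup, it can help to confirm whether the backend is reachable at all from where the MCP server runs. The Python sketch below is a rough TCP-level check only: it does not validate the API key or the URL path, and the function name is hypothetical.

```python
import socket
from urllib.parse import urlparse


def backend_reachable(api_url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(api_url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run it from inside the MCP container with the same value you pass to --apiUrl, for example `backend_reachable("http://host:5173/api")`. A False result points at networking rather than credentials.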

Alternatives

opik-mcp compared

| Alternative | When to use | Trade-off |
|---|---|---|
| LangSmith MCP | You use LangSmith for tracing | Different platform; similar capabilities |
| Langfuse MCP | You use Langfuse (OSS) | Also OSS + self-hostable; different schemas |
| Arize / Phoenix | You want focus on evals + drift detection | Richer ML-monitoring features; steeper learning curve |

More

Resources

📖 Read the official README on GitHub

🐙 Open issues
