prometheus-mcp-server

by pab1it0 · pab1it0/prometheus-mcp-server

Query Prometheus in natural language — PromQL instant + range queries, target inspection, metric metadata, for AI-assisted SRE.

prometheus-mcp-server (pab1it0) exposes 6 tools over the Prometheus HTTP API. Supports PromQL queries, range queries, metrics discovery, and target health. Works with basic auth, bearer tokens, mTLS, and custom headers.

Install

Choose your client

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "prometheus": {
      "command": "uvx",
      "args": [
        "prometheus-mcp-server"
      ]
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.

Global: ~/.cursor/mcp.json · Project: .cursor/mcp.json
{
  "mcpServers": {
    "prometheus": {
      "command": "uvx",
      "args": [
        "prometheus-mcp-server"
      ]
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config takes precedence over the global one.

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "prometheus": {
      "command": "uvx",
      "args": [
        "prometheus-mcp-server"
      ]
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "prometheus": {
      "command": "uvx",
      "args": [
        "prometheus-mcp-server"
      ]
    }
  }
}

Same format as Claude Desktop. Restart Windsurf to apply.

~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "prometheus",
      "command": "uvx",
      "args": [
        "prometheus-mcp-server"
      ]
    }
  ]
}

Continue uses an array of server objects instead of a map.

~/.config/zed/settings.json
{
  "context_servers": {
    "prometheus": {
      "command": {
        "path": "uvx",
        "args": [
          "prometheus-mcp-server"
        ]
      }
    }
  }
}

Add it under context_servers. Zed reloads automatically on save.

claude mcp add prometheus -- uvx prometheus-mcp-server

A single line. Verify with claude mcp list. Remove with claude mcp remove.
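
None of the configs above set PROMETHEUS_URL, which the use cases below list as a prerequisite. A minimal sketch of merging a prometheus entry with that variable into an mcpServers-style config, assuming the client accepts an env map per server entry and that http://localhost:9090 is where your Prometheus listens. add_prometheus_server is a hypothetical helper, not part of the server.

```python
import json


def add_prometheus_server(config: dict, prometheus_url: str) -> dict:
    """Return a copy of an mcpServers-style config with a prometheus entry added."""
    merged = json.loads(json.dumps(config))  # deep copy via a JSON round-trip
    servers = merged.setdefault("mcpServers", {})
    servers["prometheus"] = {
        "command": "uvx",
        "args": ["prometheus-mcp-server"],
        # Assumption: the client passes this env map to the server process.
        "env": {"PROMETHEUS_URL": prometheus_url},
    }
    return merged


if __name__ == "__main__":
    # Hypothetical existing config with one unrelated server already present.
    existing = {"mcpServers": {"other": {"command": "npx", "args": ["other-server"]}}}
    print(json.dumps(add_prometheus_server(existing, "http://localhost:9090"), indent=2))
```

The helper leaves other server entries untouched, so it can be run against a config that already lists other MCPs.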

Use cases

Real-world uses: prometheus-mcp-server

How to diagnose a latency spike with Prometheus + Claude

👤 On-call SREs ⏱ ~10 min intermediate

When to use: A service p99 alert fires and you need context without memorizing PromQL.

Prerequisites
  • Prometheus URL reachable: set PROMETHEUS_URL in the MCP config; add auth if protected
Flow
  1. Scope the spike
    Query HTTP request p99 latency for service X in the last hour, 30-second resolution. Compare to the last 7 days baseline.
    → Range query result showing the spike
  2. Find correlated metrics
    For the spike window, what other metrics for service X moved >2 sigma? CPU, memory, GC, queue depth?
    → Candidate culprit metrics
  3. Narrow by label
    Break down the spike by pod/host labels. Is it one pod or fleet-wide?
    → Per-label decomposition

Result: A hypothesis tied to specific metrics in under 5 minutes.

Pitfalls
  • Query returns no data: check label names with list_metrics; label casing and delimiters vary between exporters
Combine with: kubectl
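
Steps 1 and 3 of the flow above boil down to a PromQL range query. A minimal sketch of the parameters such a query might carry, assuming latency is exported as a histogram named http_request_duration_seconds with a service label (both are assumptions; your exporters may differ) and that step is passed as a duration string. p99_range_query is a hypothetical helper, not part of the server.

```python
import time
from typing import Optional, Sequence


def p99_range_query(service: str, window_s: int = 3600, step_s: int = 30,
                    by: Sequence[str] = (), now: Optional[float] = None) -> dict:
    """Build range-query parameters for the p99 latency of one service."""
    end = time.time() if now is None else now
    # `le` must survive the aggregation so histogram_quantile can see the buckets.
    group = ", ".join(["le", *by])
    # Assumption: metric and label names below are conventional, not confirmed.
    promql = (
        f"histogram_quantile(0.99, sum by ({group}) ("
        f'rate(http_request_duration_seconds_bucket{{service="{service}"}}[5m])))'
    )
    return {"query": promql, "start": end - window_s, "end": end, "step": f"{step_s}s"}


if __name__ == "__main__":
    print(p99_range_query("checkout")["query"])              # step 1: whole service
    print(p99_range_query("checkout", by=["pod"])["query"])  # step 3: per pod
```

Passing by=["pod"] keeps the pod label through the aggregation, which is what turns the fleet-wide curve into one series per pod.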

Generate a weekly SLO compliance report from Prometheus

👤 SRE leads ⏱ ~25 min intermediate

When to use: Friday SLO review, when you want numbers, not vibes.

Flow
  1. Define the SLIs
    For service X, compute this week's availability (success/total ratio) and latency (requests under threshold / total) as numbers.
    → Two ratios with burn rate
  2. Compare to SLO
    Availability SLO = 99.9%, latency SLO = 95%. Am I above or below? Project error budget exhaustion.
    → Verdict + days of budget remaining

Result: A defensible SLO report with numbers instead of 'mostly fine'.

Combine with: google-sheets
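
The burn-rate and budget projection in step 2 is plain arithmetic once the success and total counts are known. A minimal sketch, assuming a 30-day SLO window (an assumption; the page does not state one) and a constant burn rate going forward. error_budget_report is a hypothetical helper.

```python
def error_budget_report(good: float, total: float, slo: float,
                        elapsed_days: float, window_days: float = 30.0) -> dict:
    """Compare an observed success ratio to an SLO and project budget exhaustion."""
    if total <= 0:
        raise ValueError("total must be positive")
    availability = good / total
    budget = 1.0 - slo                    # allowed failure ratio, e.g. 0.001 for 99.9%
    error_rate = 1.0 - availability
    burn_rate = error_rate / budget       # 1.0 means spending the budget exactly on pace
    # Fraction of the whole window's budget already spent (assumed 30-day window).
    consumed = burn_rate * elapsed_days / window_days
    remaining = max(0.0, 1.0 - consumed)
    # Days until the remaining budget is gone if the burn rate stays constant.
    days_left = remaining * window_days / burn_rate if burn_rate > 0 else float("inf")
    return {
        "availability": availability,
        "meets_slo": availability >= slo,
        "burn_rate": burn_rate,
        "budget_consumed": consumed,
        "days_until_exhausted": days_left,
    }


if __name__ == "__main__":
    # Hypothetical week: 999,500 successful requests out of 1,000,000.
    print(error_budget_report(good=999_500, total=1_000_000, slo=0.999, elapsed_days=7))
```

The same function works for the latency SLI by passing requests under the threshold as good and slo=0.95.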

Audit Prometheus scrape target health with Claude

👤 Platform engineers ⏱ ~15 min intermediate

When to use: You suspect half your targets are down but haven't checked.

Flow
  1. Get targets
    Call get_targets. Group by job; which have any DOWN instances?
    → Table of job → up/down counts
  2. Investigate
    For the worst offender, show the lastError for the DOWN instances. Likely cause?
    → Actionable cause per target

Result: Broken scrapes identified and fixed in minutes.
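
The grouping in step 1 can be sketched as follows, assuming get_targets returns the same shape as the Prometheus /api/v1/targets endpoint (activeTargets entries with labels, health, and lastError). The sample payload is made up for illustration, and summarize_targets is a hypothetical helper.

```python
from collections import defaultdict


def summarize_targets(targets_response: dict) -> dict:
    """Count up/down/unknown targets per job and collect errors of DOWN instances."""
    summary = defaultdict(lambda: {"up": 0, "down": 0, "unknown": 0, "errors": []})
    for target in targets_response["data"]["activeTargets"]:
        labels = target.get("labels", {})
        job = labels.get("job", "<no job>")
        health = target.get("health", "unknown")
        bucket = summary[job]
        bucket[health if health in ("up", "down") else "unknown"] += 1
        if health == "down" and target.get("lastError"):
            bucket["errors"].append((labels.get("instance", "?"), target["lastError"]))
    return dict(summary)


if __name__ == "__main__":
    # Hypothetical payload in the /api/v1/targets response shape.
    sample = {"status": "success", "data": {"activeTargets": [
        {"labels": {"job": "api", "instance": "10.0.0.1:8080"}, "health": "up", "lastError": ""},
        {"labels": {"job": "api", "instance": "10.0.0.2:8080"}, "health": "down",
         "lastError": "connection refused"},
        {"labels": {"job": "node", "instance": "10.0.0.3:9100"}, "health": "up", "lastError": ""},
    ]}}
    for job, counts in summarize_targets(sample).items():
        print(job, counts)
```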

Combinations

Combine with other MCPs for 10× leverage

prometheus + kubectl

Pair metric anomalies with pod state

For the service with the latency spike, correlate Prometheus data with kubectl describe on its pods.

prometheus + sentry

Metric spike + error spike correlation

Sentry shows errors doubled at 14:00. Which Prometheus metrics moved at the same time?

Tools

What this MCP exposes

Tool | Inputs | When to call | Cost
health_check | (none) | Verify connectivity | 1 API call
execute_query | query: promql, time? | Instant snapshot | 1 query
execute_range_query | query, start, end, step | Time-series analysis | 1 query (may be expensive)
list_metrics | match?: str | Discovery when you don't know the metric name | 1 API call
get_metric_metadata | metric: str | Understand units before computing | 1 API call
get_targets | (none) | Scrape health | 1 API call
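
For debugging outside the MCP client, it can help to know which Prometheus HTTP API endpoint each tool most plausibly wraps. The endpoints below are real Prometheus API paths, but the tool-to-endpoint mapping is an assumption inferred from the tool names, not confirmed by the server's source; health_check is left out because its target is not clear from this page. build_url is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed mapping from tool name to Prometheus HTTP API path (not confirmed).
ASSUMED_ENDPOINTS = {
    "execute_query": "/api/v1/query",
    "execute_range_query": "/api/v1/query_range",
    "list_metrics": "/api/v1/label/__name__/values",
    "get_metric_metadata": "/api/v1/metadata",
    "get_targets": "/api/v1/targets",
}


def build_url(base_url: str, tool: str, **params: str) -> str:
    """Build the URL a tool call would roughly correspond to, for manual checks."""
    path = ASSUMED_ENDPOINTS[tool]
    query = urlencode(params)  # percent-encodes PromQL characters such as ( [ "
    return f"{base_url.rstrip('/')}{path}" + (f"?{query}" if query else "")


if __name__ == "__main__":
    base = "http://localhost:9090"  # assumption: local Prometheus
    print(build_url(base, "execute_query", query="up"))
    print(build_url(base, "get_metric_metadata", metric="up"))
    print(build_url(base, "get_targets"))
```

Pasting one of these URLs into curl is a quick way to tell a Prometheus-side problem from an MCP-side one.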

Cost and limits

What it costs to run

API quota
No external quota; capacity depends on your Prometheus server, and expensive queries can stress it
Tokens per call
Range queries with many series can hit 10k+ tokens
Monetary
Free
Tip
Choose step deliberately on range queries; 10s resolution over 24h is about 8,640 samples per series
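
The arithmetic behind the tip, as a small sketch. Prometheus evaluates a range query at start, start + step, and so on up to end, so the exact count is one more than range / step. It also rejects range queries whose range / step exceeds 11,000 points per series, which gives a lower bound on step for a given range. Both helper names are hypothetical.

```python
import math

MAX_POINTS_PER_SERIES = 11_000  # Prometheus rejects range queries above this resolution


def samples_per_series(range_s: int, step_s: int) -> int:
    """Number of evaluation points per series, counting both start and end."""
    return range_s // step_s + 1


def min_step_s(range_s: int) -> int:
    """Smallest whole-second step that stays within the per-series point limit."""
    return max(1, math.ceil(range_s / MAX_POINTS_PER_SERIES))


if __name__ == "__main__":
    day = 24 * 3600
    print(samples_per_series(day, 10))  # 24h at 10s resolution
    print(min_step_s(day))              # finest step allowed over 24h
    print(min_step_s(7 * day))          # finest step allowed over 7 days
```

Multiply the per-series count by the number of series the query returns to estimate how much data, and how many tokens, a single call will produce.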

Security

Permissions, secrets, reach

Minimum scopes: read-only access to Prometheus API
Credential storage: Bearer token or basic auth in env; mTLS cert paths if used
Data egress: Your Prometheus URL only
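
A minimal sketch of how credentials kept in env could turn into request headers. The variable names PROMETHEUS_TOKEN, PROMETHEUS_USERNAME, and PROMETHEUS_PASSWORD and the token-over-basic precedence are assumptions for illustration; check the official README for the names the server actually reads. build_auth_headers is a hypothetical helper.

```python
import base64
import os
from typing import Mapping


def build_auth_headers(env: Mapping[str, str]) -> dict:
    """Derive an Authorization header from env-style credentials, if any are set."""
    token = env.get("PROMETHEUS_TOKEN")  # assumed variable name
    if token:
        return {"Authorization": f"Bearer {token}"}
    user = env.get("PROMETHEUS_USERNAME")      # assumed variable name
    password = env.get("PROMETHEUS_PASSWORD")  # assumed variable name
    if user and password:
        encoded = base64.b64encode(f"{user}:{password}".encode()).decode()
        return {"Authorization": f"Basic {encoded}"}
    return {}  # unauthenticated Prometheus


if __name__ == "__main__":
    headers = build_auth_headers(os.environ)
    # Print only the scheme so the secret never reaches logs or the terminal.
    print(headers.get("Authorization", "none").split(" ")[0])
```

Keeping secrets in env rather than in args also keeps them out of process listings.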

Troubleshooting

Common errors and fixes

Query returns empty with no error

Metric/label name doesn't exist. Use list_metrics with a match prefix to verify
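
Prometheus answers a query for a nonexistent metric with status "success" and an empty result list, which is why no error appears. A minimal sketch that tells the three outcomes apart, assuming you have the raw JSON body of a /api/v1/query response at hand. classify_result is a hypothetical helper.

```python
def classify_result(response: dict) -> str:
    """Distinguish an API error, an empty success, and a result with data."""
    if response.get("status") != "success":
        return f"error: {response.get('errorType', 'unknown')}: {response.get('error', '')}"
    result = response.get("data", {}).get("result", [])
    if not result:
        # Valid PromQL that matched nothing: suspect the metric or label name.
        return "empty"
    return f"{len(result)} series"


if __name__ == "__main__":
    # Hypothetical response bodies in the Prometheus API format.
    print(classify_result({"status": "success", "data": {"resultType": "vector", "result": []}}))
    print(classify_result({"status": "error", "errorType": "bad_data", "error": "parse error"}))
```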

Range query times out

Reduce time range or increase step. Prometheus query engine has per-query resource limits

401 with bearer token

The token lacks read permission on /api/v1; if Prometheus sits behind a reverse proxy, check the proxy's auth rules too

Verify: curl -H "Authorization: Bearer $T" "$PROMETHEUS_URL/api/v1/status/config"

Alternatives

prometheus-mcp-server vs. others

Alternative | When to use | Trade-off
Grafana MCP | You already visualize in Grafana and want dashboard/alert ops | Heavier; more features than you may need
Datadog MCP | Datadog is your metrics store | Paid; different query language
