
prometheus-mcp-server

by pab1it0 · pab1it0/prometheus-mcp-server

Query Prometheus in natural language — PromQL instant + range queries, target inspection, metric metadata, for AI-assisted SRE.

prometheus-mcp-server (pab1it0) exposes 6 tools over the Prometheus HTTP API. Supports PromQL queries, range queries, metrics discovery, and target health. Works with basic auth, bearer tokens, mTLS, and custom headers.


Install

Pick your client

~/Library/Application Support/Claude/claude_desktop_config.json  · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "prometheus": {
      "command": "uvx",
      "args": [
        "prometheus-mcp-server"
      ]
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.
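The use-case prerequisites below mention setting PROMETHEUS_URL in the MCP config. A minimal sketch of what that looks like (the localhost URL is a placeholder; auth-related variables, if you need them, are documented in the README):

```json
{
  "mcpServers": {
    "prometheus": {
      "command": "uvx",
      "args": ["prometheus-mcp-server"],
      "env": {
        "PROMETHEUS_URL": "http://localhost:9090"
      }
    }
  }
}
```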

~/.cursor/mcp.json · .cursor/mcp.json
{
  "mcpServers": {
    "prometheus": {
      "command": "uvx",
      "args": [
        "prometheus-mcp-server"
      ]
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config wins over global.

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "prometheus": {
      "command": "uvx",
      "args": [
        "prometheus-mcp-server"
      ]
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "prometheus": {
      "command": "uvx",
      "args": [
        "prometheus-mcp-server"
      ]
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "prometheus",
      "command": "uvx",
      "args": [
        "prometheus-mcp-server"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.

~/.config/zed/settings.json
{
  "context_servers": {
    "prometheus": {
      "command": {
        "path": "uvx",
        "args": [
          "prometheus-mcp-server"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.

claude mcp add prometheus -- uvx prometheus-mcp-server

One-liner. Verify with claude mcp list. Remove with claude mcp remove prometheus.

Use Cases

Real-world ways to use prometheus-mcp-server

How to diagnose a latency spike with Prometheus + Claude

👤 On-call SREs · ⏱ ~10 min · intermediate

When to use: A service p99 alert fires — you need context without memorizing PromQL.

Prerequisites
  • Prometheus URL reachable — Set PROMETHEUS_URL in the MCP config; add auth if protected
Flow
  1. Scope the spike
    Query http request p99 latency for service X in the last hour, 30-second resolution. Compare to the last 7 days baseline.
    → Range query result showing the spike
  2. Find correlated metrics
    For the spike window, what other metrics for service X moved >2 sigma? CPU, memory, GC, queue depth?
    → Candidate culprit metrics
  3. Narrow by label
    Break down the spike by pod/host labels. Is it one pod or fleet-wide?
    → Per-label decomposition

Outcome: A hypothesis tied to specific metrics in under 5 minutes.

Pitfalls
  • Query returns no data — Check label names with list_metrics — label casing and delimiters vary between exporters
Combine with: kubectl
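Behind the step-1 prompt, the assistant issues roughly one execute_range_query call. A sketch of the arguments it would build (the metric name follows the standard Prometheus histogram convention and the 5m rate window is an assumption; substitute whatever your exporters emit):

```python
import time

def p99_range_params(service: str, window_s: int = 3600, step_s: int = 30) -> dict:
    """Build execute_range_query arguments for a p99 latency check.

    http_request_duration_seconds_bucket is the conventional histogram
    metric name, not something this server guarantees.
    """
    query = (
        'histogram_quantile(0.99, sum by (le) ('
        f'rate(http_request_duration_seconds_bucket{{service="{service}"}}[5m])))'
    )
    end = int(time.time())
    return {"query": query, "start": end - window_s, "end": end, "step": f"{step_s}s"}

params = p99_range_params("checkout")
# One hour at 30s resolution is 120 intervals -> 121 points per series.
points = 3600 // 30 + 1
```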

Generate a weekly SLO compliance report from Prometheus

👤 SRE leads · ⏱ ~25 min · intermediate

When to use: Friday SLO review — you want numbers, not vibes.

Flow
  1. Define the SLIs
    For service X, compute this week's availability (success/total ratio) and latency (requests under threshold / total) as numbers.
    → Two ratios with burn rate
  2. Compare to SLO
    Availability SLO = 99.9%, latency SLO = 95%. Am I above or below? Project error budget exhaustion.
    → Verdict + days of budget remaining

Outcome: A defensible SLO report with numbers, not 'mostly fine'.

Combine with: google-sheets
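The step-2 projection is plain arithmetic once the two ratios come back from Prometheus. A sketch of the error-budget math under a linear-burn assumption (the 28-day window and the metric source are illustrative, not part of this server):

```python
def slo_report(good: float, total: float, slo: float,
               days_elapsed: float, period_days: float = 28.0) -> dict:
    """Availability SLI vs. SLO with a naive linear error-budget projection.

    good/total would come from two instant queries (e.g. sums over
    http_requests_total with and without 5xx); names are illustrative.
    """
    sli = good / total
    budget = 1.0 - slo                    # allowed failure fraction
    burn_rate = (1.0 - sli) / budget      # 1.0 = burning exactly on budget
    consumed = burn_rate * days_elapsed / period_days
    days_to_exhaustion = (
        float("inf") if burn_rate == 0
        else period_days * (1.0 - consumed) / burn_rate
    )
    return {"sli": sli, "burn_rate": burn_rate,
            "budget_consumed": consumed,
            "days_to_exhaustion": days_to_exhaustion}

# 999,500 good requests out of 1,000,000 against a 99.9% SLO,
# one week into a 28-day window: half-speed burn, ~49 days of budget left.
report = slo_report(good=999_500, total=1_000_000, slo=0.999, days_elapsed=7)
```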

Audit Prometheus scrape target health with Claude

👤 Platform engineers · ⏱ ~15 min · intermediate

When to use: You suspect half your targets are down but haven't checked.

Flow
  1. Get targets
    Call get_targets. Group by job; which have any DOWN instances?
    → Table of job → up/down counts
  2. Investigate
    For the worst offender, show the lastError for the DOWN instances. Likely cause?
    → Actionable cause per target

Outcome: Rescued scrapes in minutes.
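The step-1 grouping is easy to reproduce locally if you dump the get_targets output. A sketch against the shape of Prometheus's /api/v1/targets activeTargets entries, where each target carries a labels.job and a health field ("up"/"down"/"unknown"):

```python
from collections import Counter

def audit_targets(targets: list[dict]) -> dict[str, Counter]:
    """Group targets into per-job health counts (up/down/unknown)."""
    by_job: dict[str, Counter] = {}
    for t in targets:
        job = t["labels"]["job"]
        by_job.setdefault(job, Counter())[t["health"]] += 1
    return by_job

# Minimal hypothetical dump: one flapping node exporter, one healthy API.
sample = [
    {"labels": {"job": "node"}, "health": "up"},
    {"labels": {"job": "node"}, "health": "down"},
    {"labels": {"job": "api"}, "health": "up"},
]
counts = audit_targets(sample)
```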

Combinations

Pair with other MCPs for 10× leverage

prometheus + kubectl

Pair metric anomalies with pod state

For the service with the latency spike, correlate Prometheus data with kubectl describe on its pods.
prometheus + sentry

Metric spike + error spike correlation

Sentry shows errors doubled at 14:00 — what Prometheus metrics moved at the same time?

Tools

What this MCP exposes

Tool                 | Inputs                   | When to call                                  | Cost
health_check         | —                        | Verify connectivity                           | 1 API call
execute_query        | query: promql, time?     | Instant snapshot                              | 1 query
execute_range_query  | query, start, end, step  | Time-series analysis                          | 1 query (may be expensive)
list_metrics         | match?: str              | Discovery when you don't know the metric name | 1 API call
get_metric_metadata  | metric: str              | Understand units before computing             | 1 API call
get_targets          | —                        | Scrape health                                 | 1 API call
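Since the server wraps the Prometheus HTTP API, each tool corresponds to a standard endpoint. The mapping below is an educated guess at which endpoint backs which tool (the server's actual internals may differ):

```python
# Likely tool -> Prometheus HTTP API endpoint mapping (assumed, not confirmed).
PROM_API = {
    "health_check": "/-/healthy",
    "execute_query": "/api/v1/query",
    "execute_range_query": "/api/v1/query_range",
    "list_metrics": "/api/v1/label/__name__/values",
    "get_metric_metadata": "/api/v1/metadata",
    "get_targets": "/api/v1/targets",
}

def endpoint(base_url: str, tool: str) -> str:
    """Join a Prometheus base URL with the endpoint behind a tool."""
    return base_url.rstrip("/") + PROM_API[tool]
```

Useful when translating an MCP transcript back into raw curl calls for debugging.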

Cost & Limits

What this costs to run

API quota: none of its own — capacity is whatever your Prometheus server can handle, and expensive queries can stress it
Tokens per call: range queries with many series can hit 10k+ tokens
Monetary: free
Tip: use step wisely on range queries; 10s resolution over 24h is 8,640 samples per series
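The tip's arithmetic, plus a coarser step for comparison (the 5m alternative is just an illustration of how much a larger step shrinks the payload):

```python
# Samples per series over a range query is roughly window / step.
window_s = 24 * 3600          # 24h range
fine = window_s // 10         # 10s step -> 8640 samples per series
coarse = window_s // 300      # 5m step  ->  288 samples per series
```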

Security

Permissions, secrets, blast radius

Minimum scopes: read-only access to Prometheus API
Credential storage: Bearer token or basic auth in env; mTLS cert paths if used
Data egress: Your Prometheus URL only

Troubleshooting

Common errors and fixes

Query returns empty with no error

Metric/label name doesn't exist. Use list_metrics with a match prefix to verify

Range query times out

Reduce time range or increase step. Prometheus query engine has per-query resource limits

401 with bearer token

Token lacks read permission on /api/v1; check reverse proxy if Prometheus is behind one

Verify: curl -H "Authorization: Bearer $T" "$PROMETHEUS_URL/api/v1/status/config"

Alternatives

prometheus-mcp-server vs others

Alternative  | When to use it instead                                        | Tradeoff
Grafana MCP  | You already visualize in Grafana and want dashboard/alert ops | Heavier; more features than you may need
Datadog MCP  | Datadog is your metrics store                                 | Paid; different query language

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues
