arXiv MCP — Install & Live Demo

Why use it

Key features

Search by query, category (cs.AI, cs.CL, stat.ML, etc.), date range, author
Download PDF to a local cache directory
Extract text from a downloaded paper for summarization or QA
List locally cached papers so you don't re-download
No API key — arXiv's query API is public
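
Because the query API is public, the kind of search described above can be sketched directly against it. This is a minimal illustration, not the server's own implementation: build_query_url and its parameters are hypothetical names, and it assumes the documented arXiv API conventions (the all: and cat: field prefixes, the submittedDate range filter, and the sortBy / sortOrder parameters on export.arxiv.org).

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(terms, categories=(), days_back=None, max_results=20,
                    sort_by="relevance", now=None):
    """Return an arXiv API URL for a keyword search (hypothetical helper)."""
    # Quote multi-word input so arXiv treats it as an exact phrase.
    parts = [f'all:"{terms}"' if " " in terms else f"all:{terms}"]
    if categories:
        # Match any of the listed categories, e.g. cs.CL or cs.LG.
        parts.append("(" + " OR ".join(f"cat:{c}" for c in categories) + ")")
    if days_back is not None:
        # submittedDate takes a [start TO end] range in YYYYMMDDHHMM (GMT).
        end = now or datetime.now(timezone.utc)
        start = end - timedelta(days=days_back)
        fmt = "%Y%m%d%H%M"
        parts.append(
            f"submittedDate:[{start.strftime(fmt)} TO {end.strftime(fmt)}]"
        )
    params = {
        "search_query": " AND ".join(parts),
        "start": 0,
        "max_results": max_results,
        "sortBy": sort_by,  # relevance, lastUpdatedDate, or submittedDate
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"
```

Calling build_query_url("speculative decoding", ["cs.CL", "cs.LG"], days_back=365) returns a URL you can open in a browser to inspect the raw Atom feed that a search is built on.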

Install

Pick your client

Claude Desktop · macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "arxiv": {
      "command": "uvx",
      "args": [
        "arxiv-mcp-server"
      ]
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.

Cursor · global: ~/.cursor/mcp.json · project: .cursor/mcp.json

{
  "mcpServers": {
    "arxiv": {
      "command": "uvx",
      "args": [
        "arxiv-mcp-server"
      ]
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config wins over global.

VS Code → Cline → MCP Servers → Edit

{
  "mcpServers": {
    "arxiv": {
      "command": "uvx",
      "args": [
        "arxiv-mcp-server"
      ]
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json

{
  "mcpServers": {
    "arxiv": {
      "command": "uvx",
      "args": [
        "arxiv-mcp-server"
      ]
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

~/.continue/config.json

{
  "mcpServers": [
    {
      "name": "arxiv",
      "command": "uvx",
      "args": [
        "arxiv-mcp-server"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.

~/.config/zed/settings.json

{
  "context_servers": {
    "arxiv": {
      "command": {
        "path": "uvx",
        "args": [
          "arxiv-mcp-server"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.

claude mcp add arxiv -- uvx arxiv-mcp-server

One-liner for the Claude Code CLI. Verify with claude mcp list. Remove with claude mcp remove arxiv.
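
For the clients that use the mcpServers map (Claude Desktop, Cursor, Cline, Windsurf), hand-editing JSON can clobber servers you already have configured. The sketch below merges the arxiv entry into an existing file instead. The function name add_arxiv_server is hypothetical, and it does not cover the array shape used by Continue or the context_servers shape used by Zed.

```python
import json
from pathlib import Path

# Same entry as in the config snippets above.
ARXIV_ENTRY = {"command": "uvx", "args": ["arxiv-mcp-server"]}


def add_arxiv_server(config_path):
    """Add the arxiv entry to an mcpServers map, keeping other servers intact."""
    path = Path(config_path).expanduser()
    text = path.read_text().strip() if path.exists() else ""
    config = json.loads(text) if text else {}
    # Create the map if the file had none, then add or overwrite only "arxiv".
    servers = config.setdefault("mcpServers", {})
    servers["arxiv"] = dict(ARXIV_ENTRY)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Run it as add_arxiv_server("~/.cursor/mcp.json"), then restart the client as described for each one above.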

Use Cases

Real-world ways to use arXiv

Build a mini literature survey on a niche topic

👤 Researchers, grad students, curious engineers ⏱ ~25 min intermediate

When to use: You're starting work on a topic (e.g. 'speculative decoding') and want the 10 most relevant recent papers with summaries.

Prerequisites

Local cache dir writable — Default under user home; override via ARXIV_STORAGE_PATH
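
A quick way to check this prerequisite before starting is sketched below. ARXIV_STORAGE_PATH comes from the line above. The default path in the code is an assumption for illustration only, so check the server's README for the real location. storage_dir and is_writable are hypothetical helper names.

```python
import os
import tempfile
from pathlib import Path

# Assumed default location; the real default may differ.
DEFAULT_STORAGE = Path.home() / ".arxiv-mcp-server" / "papers"


def storage_dir():
    """Return the cache directory, honoring ARXIV_STORAGE_PATH if it is set."""
    override = os.environ.get("ARXIV_STORAGE_PATH")
    return Path(override).expanduser() if override else DEFAULT_STORAGE


def is_writable(directory):
    """Return True if a file can be created inside the directory."""
    try:
        directory.mkdir(parents=True, exist_ok=True)
        # The temporary file is deleted as soon as the block exits.
        with tempfile.TemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        return False
```

If is_writable(storage_dir()) returns False, point ARXIV_STORAGE_PATH at a directory you own before running the flow.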

Flow

Search strategically

Search arXiv for 'speculative decoding' in cs.CL or cs.LG, last 12 months, sort by relevance. Top 20.

→ List of arxiv ids + titles + abstracts

Download the top candidates

Download the top 10 papers locally.

→ Papers cached; return local paths

Summarize each in one paragraph

For each downloaded paper, extract text and write a 4-line summary: problem, method, result, limitations. Preserve the arxiv id.

→ Structured summaries with citations

Outcome: A 10-paper survey table ready for a related-work section or blog post.

Pitfalls

arXiv's relevance sort is weak, so important papers can rank low and be missed. Also search sorted by submittedDate descending, and triangulate via Semantic Scholar or Google Scholar for citation counts.
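
Running the search twice, once by relevance and once by submittedDate, produces overlapping lists. A minimal sketch of merging them is shown here. It assumes each result is a dict with an "id" key holding the arXiv id, which is an illustrative shape and not the server's documented output. merge_results and base_id are hypothetical names.

```python
import re
from itertools import zip_longest


def base_id(arxiv_id):
    """Strip a trailing version suffix, e.g. '2401.00002v2' -> '2401.00002'."""
    return re.sub(r"v\d+$", "", arxiv_id)


def merge_results(by_relevance, by_date):
    """Interleave two ranked lists, keeping the first copy of each paper."""
    merged, seen = [], set()
    for pair in zip_longest(by_relevance, by_date):
        for paper in pair:
            if paper is None:  # one list ran out before the other
                continue
            key = base_id(paper["id"])
            if key not in seen:
                seen.add(key)
                merged.append(paper)
    return merged
```

Interleaving keeps the top hits from both orderings near the top of the merged list, so neither sort dominates the shortlist you send to download.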

Combine with: filesystem · qdrant

Deep-read a single paper with Q&A

👤 Anyone reading a dense paper ⏱ ~20 min beginner

When to use: You have one specific paper (say, the FlashAttention-3 paper) and want to understand it without reading the full PDF alone.

Flow

Download the paper

Download arxiv paper [arxiv id]. Report number of pages and total word count.

→ File cached + stats

Summarize by section

Read the paper. Give me a section-by-section summary. For each section: goal, key points, any equations worth understanding (in plain English).

→ Structured walkthrough

Ask targeted questions

Specific Q: [your question]. Answer only from the paper; cite the section and any equation numbers.

→ Grounded answer with cites

Outcome: Paper-level understanding in 20 minutes instead of 2 hours.

Pitfalls

PDF extraction mangles equations and tables — For heavy-math papers, ask Claude to note 'equation extraction may be unreliable' and cross-check critical formulas against the PDF

Weekly digest of new papers in your field

👤 Academics, ML engineers tracking a subfield ⏱ ~15 min beginner

When to use: Monday morning: 'what's new in cs.CL submitted in the last 7 days that's worth reading?'

Flow

Pull recent submissions

Search arXiv cs.CL submissions in the last 7 days. Return up to 50, ranked by relevance if possible; otherwise any order is fine.

→ Recent papers list

Filter by keywords you care about

Keep only papers whose title or abstract mentions [your keywords]. Dedupe.

→ Narrowed shortlist

Abstract digest

For each kept paper, generate 2-line 'why it might matter' from the abstract. Mark 3 as must-reads.

→ Weekly digest

Outcome: A curated weekly reading list without doomscrolling arxiv-sanity.
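
The keyword filter and dedupe step in this flow is simple enough to run outside the model when the list is long. The sketch below assumes each paper is a dict with "id", "title", and "abstract" keys. That shape and the name filter_papers are illustrative, not part of the server's API.

```python
def filter_papers(papers, keywords):
    """Keep papers whose title or abstract mentions any keyword, once each."""
    wanted = [k.lower() for k in keywords]
    kept, seen = [], set()
    for paper in papers:
        if paper["id"] in seen:
            continue  # drop repeats of a paper already kept
        text = f"{paper['title']} {paper['abstract']}".lower()
        # Case-insensitive substring match; no stemming or synonyms.
        if any(k in text for k in wanted):
            seen.add(paper["id"])
            kept.append(paper)
    return kept
```

Substring matching is deliberately crude: "quantiz" catches both "quantization" and "quantized", but short keywords such as "rl" will also match unrelated words.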

Pitfalls

Abstracts oversell; 'must-read' tag can be wrong — Treat the tag as a prompt to read the abstract yourself, not as endorsement

Combine with: notion

Combinations

Pair with other MCPs for 10x leverage

arxiv + qdrant

Build a searchable library of papers for semantic recall

Download the top 30 papers on 'mixture of experts'. Index each chunk into Qdrant collection papers_moe. Later answer: 'what tricks do MoE papers use for load balancing?'
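
"Index each chunk" presumes the extracted text has been split first. A minimal sketch of overlapping word-window chunking follows. The function name and the window sizes are assumptions, and the indexing call itself is left to the Qdrant MCP.

```python
def chunk_text(text, chunk_words=200, overlap=40):
    """Split text into word windows that share `overlap` words with the next."""
    if not 0 <= overlap < chunk_words:
        raise ValueError("overlap must be >= 0 and smaller than chunk_words")
    words = text.split()
    step = chunk_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary retrievable from either side, at the cost of indexing some words twice.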

arxiv + filesystem

Write a markdown survey file with inline citations

Download 10 papers on topic X, save summaries to /research/survey-X.md with [arxiv:id] links.

arxiv + notion

Post a weekly paper digest to a Notion research DB

Run the weekly digest for cs.CL, create a Notion page with the 5 must-reads as rows.

Tools

What this MCP exposes

Tool	Inputs	When to call	Cost
search_papers	query: str, categories?, max_results?, date_from?, date_to?	Discover relevant papers by query/category/date	free
download_paper	paper_id	Cache a PDF locally for extraction	free
read_paper	paper_id	Extract text from a cached paper for reading/QA	free
list_papers	(none)	See what's already downloaded to avoid re-fetching	free
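
Your client sends these tool calls for you, but it can help when debugging to see what one looks like on the wire. The sketch below builds the JSON-RPC message that the Model Context Protocol defines for tools/call. The helper name tool_call is hypothetical, and the argument names are taken from the table above.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def tool_call(name, arguments=None):
    """Build an MCP tools/call request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


def to_wire(message):
    """Serialize a request as one line of JSON, as used over stdio."""
    return json.dumps(message)
```

For example, to_wire(tool_call("search_papers", {"query": "speculative decoding", "max_results": 20})) gives the request a client would send for the first step of the survey flow.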

Cost & Limits

What this costs to run

API quota: arXiv asks API clients to make roughly 1 request every 3 seconds at most; higher rates may be throttled
Tokens per call: Search: 500–2000 tokens. Paper text: 5k–30k tokens per paper.
Monetary: Free
Tip: Cache aggressively; re-reading a paper's extracted text is free once downloaded.
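
If you script anything against arXiv yourself, the 1 request per 3 seconds guidance is easy to enforce with a small limiter. This is a sketch under that assumption. The RateLimiter class is hypothetical and says nothing about how the server paces its own requests.

```python
import time


class RateLimiter:
    """Block so that consecutive calls are at least min_interval seconds apart."""

    def __init__(self, min_interval=3.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the limiter can be tested
        self._sleep = sleep
        self._last = None      # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Create one limiter and call limiter.wait() immediately before every request. Sharing a single instance is what keeps parallel code paths from exceeding the rate.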

Security

Permissions, secrets, blast radius

Credential storage: None needed

Data egress: Queries to export.arxiv.org; PDF downloads from arxiv.org

Respect arXiv's 1 req / 3s recommendation; don't parallelize aggressively.
Only cache papers with proper arXiv ids; do not mirror the full archive.

Troubleshooting

Common errors and fixes

Empty search results for a clearly existing topic

arXiv treats quoted strings as exact phrases; try broader, unquoted terms and check that you are searching the right category (cs.CL vs cs.AI).

Download failed / PDF unavailable

This is rare, but some withdrawn papers return a 404. Confirm the id at arxiv.org/abs/<id>.
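
A malformed id is a more common cause than a withdrawn paper, so it is worth checking the format before anything else. The sketch below validates the two public arXiv id styles (new style such as 2401.01234, old style such as hep-th/9901001, each with an optional version suffix) and builds the abs URL to open. The regexes are an approximation and abs_url is a hypothetical helper name.

```python
import re

# New style: YYMM.NNNN or YYMM.NNNNN, optional vN suffix.
NEW_STYLE = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")
# Old style: archive (optionally with a subject class) / 7 digits, optional vN.
OLD_STYLE = re.compile(r"^[a-z\-]+(\.[A-Z]{2})?/\d{7}(v\d+)?$")


def abs_url(paper_id):
    """Return the arxiv.org abstract URL, or raise ValueError on a bad id."""
    paper_id = paper_id.strip()
    if not (NEW_STYLE.match(paper_id) or OLD_STYLE.match(paper_id)):
        raise ValueError(f"not a recognizable arXiv id: {paper_id!r}")
    return f"https://arxiv.org/abs/{paper_id}"
```

If abs_url raises, fix the id. If it returns a URL and that page loads but the download still fails, the problem is on the PDF side.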

Extracted text is garbled

Some math-heavy papers have non-standard PDFs; try the source version if available, else note the limitation.
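
To decide when to attach that limitation note, a rough heuristic over the extracted text can help. The sketch below flags passages that are mostly symbols or isolated single characters, which is how broken equation extraction often looks. The thresholds are arbitrary assumptions to tune on your own papers, and looks_garbled is a hypothetical name.

```python
def looks_garbled(text, min_alpha_ratio=0.6, max_short_token_ratio=0.5):
    """Guess whether a passage of extracted text is too damaged to trust."""
    tokens = text.split()
    if not tokens:
        return True  # nothing was extracted at all
    non_space = [c for c in text if not c.isspace()]
    # Share of visible characters that are letters.
    alpha_ratio = sum(c.isalpha() for c in non_space) / len(non_space)
    # Share of tokens that are a single character (scattered symbols).
    short_ratio = sum(len(t) == 1 for t in tokens) / len(tokens)
    return alpha_ratio < min_alpha_ratio or short_ratio > max_short_token_ratio
```

Run it per paragraph and not over the whole paper, since a clean introduction can hide a badly extracted methods section.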

Alternatives

arXiv vs others

Alternative	When to use it instead	Tradeoff
Semantic Scholar MCP	You need citation counts and influence metrics	Not arXiv-specific; coverage varies
Papers with Code MCP	You want papers with code implementations linked	Smaller catalog, ML-focused

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues
