arXiv

by blazickjp · blazickjp/arxiv-mcp-server

Search arXiv, download papers, and let Claude read + summarize them — a lightweight research assistant for the latest preprints.

arxiv-mcp-server lets Claude search arXiv by keyword, category, and date, download PDFs, and extract their text for in-chat reading. No API key is needed because arXiv's API is public. It suits literature surveys, paper summarization, and keeping up with fast-moving ML, physics, and CS subfields.

Install

Pick your client

Claude Desktop · macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "arxiv": {
      "command": "uvx",
      "args": [
        "arxiv-mcp-server"
      ]
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.
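Hand-editing the file risks overwriting servers you already have configured. A minimal Python sketch of merging the entry instead, assuming the config is plain JSON with a top-level mcpServers map as shown above. The helper name add_arxiv_server is hypothetical and not part of Claude Desktop or arxiv-mcp-server.

```python
import json
from pathlib import Path


def add_arxiv_server(config_path: Path) -> dict:
    """Merge the arxiv entry into an MCP client config, keeping other servers.

    Hypothetical helper for illustration. It only assumes the file, if present,
    is JSON with an optional top-level "mcpServers" map.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8").strip()
        if text:
            config = json.loads(text)

    # Add or overwrite only the "arxiv" key; other servers are left untouched.
    servers = config.setdefault("mcpServers", {})
    servers["arxiv"] = {"command": "uvx", "args": ["arxiv-mcp-server"]}

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Point it at the path for your OS, for example Path.home() / "Library/Application Support/Claude/claude_desktop_config.json" on macOS, then restart the client.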

Cursor · global: ~/.cursor/mcp.json · project: .cursor/mcp.json
{
  "mcpServers": {
    "arxiv": {
      "command": "uvx",
      "args": [
        "arxiv-mcp-server"
      ]
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config wins over global.

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "arxiv": {
      "command": "uvx",
      "args": [
        "arxiv-mcp-server"
      ]
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

Windsurf · ~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "arxiv": {
      "command": "uvx",
      "args": [
        "arxiv-mcp-server"
      ]
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

Continue · ~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "arxiv",
      "command": "uvx",
      "args": [
        "arxiv-mcp-server"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.
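If you keep one canonical map-style config and want to derive the array form from it, the conversion is mechanical. A small sketch, assuming the array shape is exactly the one shown above (each object carries its own name key). The function name is made up for illustration.

```python
import json


def servers_map_to_list(servers: dict) -> list:
    """Turn {"arxiv": {...}} into [{"name": "arxiv", ...}], preserving order."""
    return [{"name": name, **spec} for name, spec in servers.items()]


# Map form, as used by the other clients on this page.
map_form = {"arxiv": {"command": "uvx", "args": ["arxiv-mcp-server"]}}

# Array form, matching the Continue snippet above.
print(json.dumps({"mcpServers": servers_map_to_list(map_form)}, indent=2))
```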

Zed · ~/.config/zed/settings.json
{
  "context_servers": {
    "arxiv": {
      "command": {
        "path": "uvx",
        "args": [
          "arxiv-mcp-server"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.

claude mcp add arxiv -- uvx arxiv-mcp-server

One-liner for Claude Code. Verify with claude mcp list. Remove with claude mcp remove arxiv.

Use Cases

Real-world ways to use arXiv

Build a mini literature survey on a niche topic

👤 Researchers, grad students, curious engineers ⏱ ~25 min intermediate

When to use: You're starting work on a topic (e.g. 'speculative decoding') and want the 10 most relevant recent papers with summaries.

Prerequisites
  • Local cache dir writable — Default under user home; override via ARXIV_STORAGE_PATH
Flow
  1. Search strategically
    Search arXiv for 'speculative decoding' in cs.CL or cs.LG, last 12 months, sort by relevance. Top 20.
    → List of arxiv ids + titles + abstracts
  2. Download the top candidates
    Download the top 10 papers locally.
    → Papers cached; return local paths
  3. Summarize each in one paragraph
    For each downloaded paper, extract text and write a 4-line summary: problem, method, result, limitations. Preserve the arxiv id.
    → Structured summaries with citations

Outcome: A 10-paper survey table ready for a related-work section or blog post.

Pitfalls
  • arXiv relevance sort is weak; you'll miss important papers sorted elsewhere — Also search sorted by submittedDate desc; triangulate via Semantic Scholar / Google Scholar for citation counts
Combine with: filesystem · qdrant
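To see what the search in step 1 looks like against arXiv's public query API, and to rerun it sorted by submittedDate as the pitfall suggests, the request URL can be built by hand. This is a sketch of the raw API call, not of the server's internals; whether arxiv-mcp-server builds its query this way is an assumption. The function name and defaults are illustrative.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_search_url(phrase, categories, days_back=365, max_results=20,
                     sort_by="relevance", now=None):
    """Build an arXiv API URL for an exact phrase within categories and a date window."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(days=days_back)

    # cat:cs.CL OR cat:cs.LG
    cats = " OR ".join(f"cat:{c}" for c in categories)
    # submittedDate takes a [YYYYMMDDHHMM TO YYYYMMDDHHMM] range in GMT.
    window = f"submittedDate:[{start:%Y%m%d%H%M} TO {now:%Y%m%d%H%M}]"
    query = f'all:"{phrase}" AND ({cats}) AND {window}'

    params = {
        "search_query": query,
        "start": 0,
        "max_results": max_results,
        "sortBy": sort_by,          # relevance | lastUpdatedDate | submittedDate
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"


by_relevance = build_search_url("speculative decoding", ["cs.CL", "cs.LG"])
by_date = build_search_url("speculative decoding", ["cs.CL", "cs.LG"],
                           sort_by="submittedDate")
```

Fetching both URLs and merging the results by id gives a candidate list that is less dependent on the relevance ranking.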

Deep-read a single paper with Q&A

👤 Anyone reading a dense paper ⏱ ~20 min beginner

When to use: You have one specific paper (say, the FlashAttention-3 paper) and want to understand it without reading the full PDF alone.

Flow
  1. Download the paper
    Download arxiv paper 2405.12345. Report number of pages and total word count.
    → File cached + stats
  2. Summarize by section
    Read the paper. Give me a section-by-section summary. For each section: goal, key points, any equations worth understanding (in plain English).
    → Structured walkthrough
  3. Ask targeted questions
    Specific Q: [your question]. Answer only from the paper; cite the section and any equation numbers.
    → Grounded answer with cites

Outcome: Paper-level understanding in 20 minutes instead of 2 hours.

Pitfalls
  • PDF extraction mangles equations and tables — For heavy-math papers, ask Claude to note 'equation extraction may be unreliable' and cross-check critical formulas against the PDF

Weekly digest of new papers in your field

👤 Academics, ML engineers tracking a subfield ⏱ ~15 min beginner

When to use: Monday morning: 'what's new in cs.CL submitted in the last 7 days that's worth reading?'

Flow
  1. Pull recent submissions
    Search arXiv for cs.CL submissions from the last 7 days. Return up to 50, sorted by relevance or in any order.
    → Recent papers list
  2. Filter by keywords you care about
    Keep only papers whose title or abstract mentions [your keywords]. Dedupe.
    → Narrowed shortlist
  3. Abstract digest
    For each kept paper, generate a 2-line 'why it might matter' note from the abstract. Mark 3 as must-reads.
    → Weekly digest

Outcome: A curated weekly reading list without doomscrolling arxiv-sanity.

Pitfalls
  • Abstracts oversell; 'must-read' tag can be wrong — Treat the tag as a prompt to read the abstract yourself, not as endorsement
Combine with: notion
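Step 2 of the flow (keyword filter plus dedupe) is simple enough to check by hand if you want to verify what Claude kept. A sketch, assuming each search result is a dict with id, title, and abstract keys; the real result shape returned by the server may differ, and the function names are illustrative.

```python
import re


def base_id(arxiv_id: str) -> str:
    """Strip a trailing version suffix, e.g. 'NNNN.NNNNNv2' -> 'NNNN.NNNNN'."""
    return re.sub(r"v\d+$", "", arxiv_id)


def filter_and_dedupe(papers, keywords):
    """Keep papers whose title or abstract mentions any keyword, once per id.

    Matching is a case-insensitive substring test; versions of the same
    paper (v1, v2, ...) count as duplicates and the first one seen wins.
    """
    wanted = [k.lower() for k in keywords]
    seen = set()
    kept = []
    for paper in papers:
        text = f"{paper['title']} {paper['abstract']}".lower()
        key = base_id(paper["id"])
        if key in seen or not any(k in text for k in wanted):
            continue
        seen.add(key)
        kept.append(paper)
    return kept
```

Substring matching is deliberately crude. It will miss synonyms, which is one reason to still skim the unfiltered list now and then.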

Combinations

Pair with other MCPs for 10x leverage

arxiv + qdrant

Build a searchable library of papers for semantic recall

Download the top 30 papers on 'mixture of experts'. Index each chunk into Qdrant collection papers_moe. Later answer: 'what tricks do MoE papers use for load balancing?'
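"Index each chunk" implies the extracted text is split before it is embedded. A minimal sketch of one common approach, overlapping word windows. The sizes are arbitrary defaults, and the embedding and Qdrant upsert steps are left out because they depend on the Qdrant MCP or client you use.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list:
    """Split text into chunks of up to chunk_size words, sharing `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a window has reached the end, to avoid a tail-only duplicate.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Storing the arxiv id and chunk index alongside each chunk lets later answers cite the paper they came from.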
arxiv + filesystem

Write a markdown survey file with inline citations

Download 10 papers on topic X, save summaries to /research/survey-X.md with [arxiv:id] links.
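For reference, this is roughly the file shape that prompt asks for. A sketch, assuming each summary is a dict with id, title, and summary keys; the layout is one reasonable choice, not a format the server prescribes.

```python
def render_survey(topic: str, entries: list) -> str:
    """Render summaries as a markdown list with [arxiv:id] links."""
    lines = [f"# Survey: {topic}", ""]
    for entry in entries:
        pid = entry["id"]
        link = f"[arxiv:{pid}](https://arxiv.org/abs/{pid})"
        lines.append(f"- **{entry['title']}** {link}: {entry['summary']}")
    return "\n".join(lines) + "\n"
```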
arxiv + notion

Post a weekly paper digest to a Notion research DB

Run the weekly digest for cs.CL, create a Notion page with the 5 must-reads as rows.

Tools

What this MCP exposes

| Tool | Inputs | When to call | Cost |
|---|---|---|---|
| search_papers | query: str, category?, max_results?, date_range? | Discover relevant papers by query/category/date | free |
| download_paper | paper_id | Cache a PDF locally for extraction | free |
| read_paper | paper_id | Extract text from a cached paper for reading/QA | free |
| list_papers | (none) | See what's already downloaded to avoid re-fetch | free |
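Under the hood, an MCP client invokes these tools with JSON-RPC tools/call requests. A sketch of what those messages look like, with argument names taken from the table above; the exact input schema is an assumption and should be confirmed from the server's tools/list response. The tool_call helper is hypothetical, and 2405.12345 is the example id used earlier on this page.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(name: str, **arguments) -> dict:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


calls = [
    tool_call("search_papers", query="speculative decoding", max_results=20),
    tool_call("download_paper", paper_id="2405.12345"),
    tool_call("read_paper", paper_id="2405.12345"),
    tool_call("list_papers"),
]
for call in calls:
    print(json.dumps(call))
```

In normal use the client builds these for you; the sketch is mainly useful when debugging a server over stdio.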

Cost & Limits

What this costs to run

API quota
The arXiv query API asks for no more than about 1 request every 3 seconds; faster clients may be throttled
Tokens per call
Search: 500–2000 tokens. Paper text: 5k–30k tokens per paper.
Monetary
Free
Tip
Cache aggressively; re-reading a paper's extracted text is free once downloaded.
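If you script against arXiv yourself alongside the server, the 1 request per 3 seconds guideline is easy to respect with a minimum-interval limiter. A sketch only: the class name is made up, and whether arxiv-mcp-server applies similar pacing internally is not stated here.

```python
import time


class MinIntervalLimiter:
    """Enforce a minimum gap between calls (default 3 seconds)."""

    def __init__(self, min_interval=3.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._last = None        # time of the previous permitted call

    def wait(self) -> float:
        """Block until the interval has elapsed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self._last + self.min_interval - now)
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay
```

Call limiter.wait() immediately before each request. The first call returns at once, and later calls sleep only as long as needed.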

Security

Permissions, secrets, blast radius

Credential storage: None needed
Data egress: Queries to export.arxiv.org; PDF downloads from arxiv.org

Troubleshooting

Common errors and fixes

Empty search results for a clearly existing topic

arXiv matches quoted strings as exact phrases, so an over-specific quote can return nothing. Drop the quotes or use broader terms, and check that the category is the right one (for example cs.CL rather than cs.AI).

Download failed / PDF unavailable

This is rare, but some withdrawn papers return a 404. Confirm the id at arxiv.org/abs/<id>.
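A malformed id is a more common cause than a withdrawn paper, so it helps to sanity-check the id before retrying. A sketch with simplified patterns for new-style (YYMM.NNNNN) and old-style (archive/YYMMNNN) identifiers; the regexes are an approximation of arXiv's identifier scheme, not an official validator, and the function name is illustrative.

```python
import re

# New style: 4 digits, a dot, 4 or 5 digits, optional version (e.g. 2405.12345v2).
_NEW_STYLE = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")
# Old style: archive, optional subject class, slash, 7 digits (e.g. hep-th/9901001).
_OLD_STYLE = re.compile(r"^[a-z-]+(\.[A-Z]{2})?/\d{7}(v\d+)?$")


def abs_url(paper_id: str) -> str:
    """Return the arxiv.org/abs URL for an id, or raise ValueError if malformed."""
    pid = paper_id.strip()
    if pid.lower().startswith("arxiv:"):
        pid = pid[len("arxiv:"):]
    if not (_NEW_STYLE.match(pid) or _OLD_STYLE.match(pid)):
        raise ValueError(f"not a recognizable arXiv id: {paper_id!r}")
    return f"https://arxiv.org/abs/{pid}"
```

Opening the returned URL in a browser shows whether the paper exists and whether it has been withdrawn.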

Extracted text is garbled

Some math-heavy papers have non-standard PDFs; try the source version if available, else note the limitation.

Alternatives

arXiv vs others

| Alternative | When to use it instead | Tradeoff |
|---|---|---|
| Semantic Scholar MCP | You need citation counts and influence metrics | Not arXiv-specific; coverage varies |
| Papers with Code MCP | You want papers with code implementations linked | Smaller catalog, ML-focused |

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues
