arXiv MCP: Installation & Live Demo

Why use it

Key features

Search by query, category (cs.AI, cs.CL, stat.ML, etc.), date range, author
Download PDF to a local cache directory
Extract text from a downloaded paper for summarization or QA
List locally cached papers so you don't re-download
No API key — arXiv's query API is public
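
Because the query API is public, a search is just an HTTP GET against a URL. The sketch below shows how a query and a category filter might be combined into such a URL. The `all:` and `cat:` field prefixes and the `sortBy` / `sortOrder` parameters come from arXiv's public API; the function name `build_query_url` is hypothetical, and this is not necessarily how the MCP server builds its requests.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"

def build_query_url(terms, category=None, max_results=10, sort_by="relevance"):
    """Build an arXiv API query URL from search terms and an optional category."""
    # Quote multi-word terms so they are matched as a phrase.
    clauses = [f'all:"{terms}"' if " " in terms else f"all:{terms}"]
    if category:
        clauses.append(f"cat:{category}")
    params = {
        "search_query": " AND ".join(clauses),
        "start": 0,
        "max_results": max_results,
        "sortBy": sort_by,  # relevance | lastUpdatedDate | submittedDate
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"

if __name__ == "__main__":
    print(build_query_url("speculative decoding", category="cs.CL", max_results=20))
```

No request is sent here; the function only assembles the URL, so it can be inspected before anything hits the network.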

Live demo

What it looks like in practice

Installation

Choose your client

Claude Desktop · macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "arxiv": {
      "command": "uvx",
      "args": [
        "arxiv-mcp-server"
      ]
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.
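
If the config file already lists other servers, pasting the snippet above over it would drop them. A minimal sketch of merging the arxiv entry into an existing file instead; the helper names `add_arxiv_server` and `update_config_file` are hypothetical, and the path to pass in is whichever config location applies to your platform.

```python
import json
from pathlib import Path

def add_arxiv_server(config: dict) -> dict:
    """Return a copy of config with the arxiv entry added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["arxiv"] = {"command": "uvx", "args": ["arxiv-mcp-server"]}
    merged["mcpServers"] = servers
    return merged

def update_config_file(path: Path) -> None:
    """Read the JSON config at path (if present), add arxiv, and write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_arxiv_server(config), indent=2) + "\n")
```

Existing keys and servers are preserved; only the `arxiv` entry is added or overwritten.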

Cursor · global: ~/.cursor/mcp.json · project: .cursor/mcp.json

{
  "mcpServers": {
    "arxiv": {
      "command": "uvx",
      "args": [
        "arxiv-mcp-server"
      ]
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. The project config takes precedence over the global one.

VS Code → Cline → MCP Servers → Edit

{
  "mcpServers": {
    "arxiv": {
      "command": "uvx",
      "args": [
        "arxiv-mcp-server"
      ]
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json

{
  "mcpServers": {
    "arxiv": {
      "command": "uvx",
      "args": [
        "arxiv-mcp-server"
      ]
    }
  }
}

Same format as Claude Desktop. Restart Windsurf to apply.

~/.continue/config.json

{
  "mcpServers": [
    {
      "name": "arxiv",
      "command": "uvx",
      "args": [
        "arxiv-mcp-server"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.

~/.config/zed/settings.json

{
  "context_servers": {
    "arxiv": {
      "command": {
        "path": "uvx",
        "args": [
          "arxiv-mcp-server"
        ]
      }
    }
  }
}

Add it under context_servers. Zed reloads automatically.
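
The map form, the Continue array form, and the Zed form shown above carry the same information in different shapes. As an illustration, the sketch below converts the map form into the other two; the function names `to_continue` and `to_zed` are hypothetical, and the output shapes simply mirror the JSON snippets in this section.

```python
def to_continue(servers: dict) -> dict:
    """Map form -> Continue's array form, moving each key into a "name" field."""
    return {"mcpServers": [{"name": name, **spec} for name, spec in servers.items()]}

def to_zed(servers: dict) -> dict:
    """Map form -> Zed's context_servers form, nesting command and args."""
    return {
        "context_servers": {
            name: {"command": {"path": spec["command"], "args": spec.get("args", [])}}
            for name, spec in servers.items()
        }
    }

if __name__ == "__main__":
    import json
    servers = {"arxiv": {"command": "uvx", "args": ["arxiv-mcp-server"]}}
    print(json.dumps(to_continue(servers), indent=2))
    print(json.dumps(to_zed(servers), indent=2))
```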

claude mcp add arxiv -- uvx arxiv-mcp-server

A one-line command. Verify: claude mcp list. Remove: claude mcp remove arxiv.

Use cases

Real-world scenarios: arXiv

Build a mini literature survey on a niche topic

👤 Researchers, grad students, curious engineers ⏱ ~25 min intermediate

When to use: You're starting work on a topic (e.g. 'speculative decoding') and want the 10 most relevant recent papers with summaries.

Prerequisites

Local cache dir writable — Default under user home; override via ARXIV_STORAGE_PATH
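
A quick way to check this prerequisite before starting a session is sketched below. `ARXIV_STORAGE_PATH` comes from the text above; the fallback directory is an assumption (the server's real default may differ), and the helper names are hypothetical.

```python
import os
import tempfile
from pathlib import Path

# Assumed fallback for illustration; check the server's README for the real default.
ASSUMED_DEFAULT = Path.home() / ".arxiv-mcp-server" / "papers"

def resolve_storage_path() -> Path:
    """Use ARXIV_STORAGE_PATH if set, otherwise the assumed default."""
    override = os.environ.get("ARXIV_STORAGE_PATH")
    return Path(override).expanduser() if override else ASSUMED_DEFAULT

def is_writable(directory: Path) -> bool:
    """Create the directory if needed and try writing a temporary file in it."""
    try:
        directory.mkdir(parents=True, exist_ok=True)
        with tempfile.TemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        return False
```

Calling `is_writable(resolve_storage_path())` returns `False` rather than raising, so it can gate a download step cleanly.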

Flow

Search strategically

Search arXiv for 'speculative decoding' in cs.CL or cs.LG, last 12 months, sort by relevance. Top 20.

→ List of arxiv ids + titles + abstracts

Download the top candidates

Download the top 10 papers locally.

→ Papers cached; return local paths

Summarize each in one paragraph

For each downloaded paper, extract text and write a 4-line summary: problem, method, result, limitations. Preserve the arxiv id.

→ Structured summaries with citations

Outcome: A 10-paper survey table ready for a related-work section or blog post.

Pitfalls

arXiv's relevance sort is weak, so important papers can rank low or be missed: also search sorted by submittedDate descending, and triangulate via Semantic Scholar / Google Scholar for citation counts

Combine with: filesystem · qdrant
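
Running the search under two sort orders, as the pitfall above suggests, yields two overlapping lists. One possible way to combine them is to interleave by rank and drop duplicates, treating different versions of a paper (v1, v2) as the same entry. The helper names and the `{"id": ...}` record shape are assumptions for illustration.

```python
import re

def base_id(arxiv_id: str) -> str:
    """Strip a trailing version suffix such as 'v2' from an arXiv id."""
    return re.sub(r"v\d+$", "", arxiv_id)

def merge_results(*result_lists):
    """Interleave ranked result lists, keeping the first hit for each paper."""
    seen, merged = set(), []
    longest = max((len(results) for results in result_lists), default=0)
    for rank in range(longest):
        for results in result_lists:
            if rank < len(results):
                key = base_id(results[rank]["id"])
                if key not in seen:
                    seen.add(key)
                    merged.append(results[rank])
    return merged
```

Interleaving keeps the top hit from each sort order near the top of the merged list instead of letting one list dominate.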

Deep-read a single paper with Q&A

👤 Anyone reading a dense paper ⏱ ~20 min beginner

When to use: You have one specific paper (say, the FlashAttention-3 paper) and want to understand it without reading the full PDF alone.

Flow

Download the paper

Download arxiv paper 2405.12345. Report number of pages and total word count.

→ File cached + stats

Summarize by section

Read the paper. Give me a section-by-section summary. For each section: goal, key points, any equations worth understanding (in plain English).

→ Structured walkthrough

Ask targeted questions

Specific Q: [your question]. Answer only from the paper; cite the section and any equation numbers.

→ Grounded answer with cites

Outcome: Paper-level understanding in 20 minutes instead of 2 hours.

Pitfalls

PDF extraction mangles equations and tables — For heavy-math papers, ask Claude to note 'equation extraction may be unreliable' and cross-check critical formulas against the PDF
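
To decide which passages deserve a cross-check against the PDF, a rough heuristic is to flag extracted lines that consist mostly of symbols. The sketch below is only that: a heuristic with an arbitrary threshold, not something the server provides. The function names are hypothetical.

```python
def symbol_ratio(line: str) -> float:
    """Fraction of non-space characters that are neither letters nor digits."""
    chars = [c for c in line if not c.isspace()]
    if not chars:
        return 0.0
    return sum(not c.isalnum() for c in chars) / len(chars)

def flag_suspect_lines(text: str, threshold: float = 0.4):
    """Return (line_number, line) pairs whose symbol ratio exceeds the threshold."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if symbol_ratio(line) > threshold
    ]
```

Ordinary prose scores close to zero, while a line of stray operators and brackets scores near one, so the flagged lines are a reasonable starting list for manual checking.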

Weekly digest of new papers in your field

👤 Academics, ML engineers tracking a subfield ⏱ ~15 min beginner

When to use: Monday morning: 'what's new in cs.CL submitted in the last 7 days that's worth reading?'

Flow

Pull recent submissions

Search arXiv cs.CL submissions in the last 7 days. Return up to 50 results; any sort order is fine.

→ Recent papers list

Filter by keywords you care about

Keep only papers whose title or abstract mentions [your keywords]. Dedupe.

→ Narrowed shortlist

Abstract digest

For each kept paper, generate 2-line 'why it might matter' from the abstract. Mark 3 as must-reads.

→ Weekly digest
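
The filtering step in this flow is simple enough to run deterministically instead of leaving it to the model. A minimal sketch, assuming each paper is a dict with `id`, `title`, and `abstract` keys (a hypothetical record shape, not the server's documented output):

```python
def filter_papers(papers, keywords):
    """Keep papers whose title or abstract mentions any keyword, one per id."""
    wanted = [keyword.lower() for keyword in keywords]
    seen, kept = set(), []
    for paper in papers:
        haystack = f"{paper['title']} {paper['abstract']}".lower()
        if paper["id"] in seen or not any(k in haystack for k in wanted):
            continue
        seen.add(paper["id"])
        kept.append(paper)
    return kept
```

Matching is case-insensitive substring matching, so a keyword like "decod" also catches "decoding" and "decoder".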

Outcome: A curated weekly reading list without doomscrolling arxiv-sanity.

Pitfalls

Abstracts oversell, so the 'must-read' tag can be wrong: treat the tag as a prompt to read the abstract yourself, not as an endorsement

Combine with: notion

Combinations

Combine with other MCPs for a 10x effect

arxiv + qdrant

Build a searchable library of papers for semantic recall

Download the top 30 papers on 'mixture of experts'. Index each chunk into Qdrant collection papers_moe. Later answer: 'what tricks do MoE papers use for load balancing?'
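
"Index each chunk" presupposes that the extracted text has been split into chunks first. A minimal sketch of one common approach, fixed-size word windows with overlap, follows. The sizes are arbitrary defaults, `chunk_words` is a hypothetical helper, and the Qdrant side is left out entirely.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40):
    """Split text into chunks of up to `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.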

arxiv + filesystem

Write a markdown survey file with inline citations

Download 10 papers on topic X, save summaries to /research/survey-X.md with [arxiv:id] links.
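
For reference, the kind of file this prompt asks for could look like the output of the sketch below. The layout (one bullet per paper, with the `[arxiv:id]` label linking to the abstract page) is one reasonable reading of the prompt, not a format the server defines; `render_survey` is a hypothetical helper.

```python
def render_survey(topic: str, summaries: dict) -> str:
    """Render {arxiv_id: summary} as markdown with [arxiv:id] links."""
    lines = [f"# Survey: {topic}", ""]
    for arxiv_id, summary in summaries.items():
        link = f"[arxiv:{arxiv_id}](https://arxiv.org/abs/{arxiv_id})"
        lines.append(f"- {summary} {link}")
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    print(render_survey("X", {"2401.00001": "Faster decoding via drafts."}))
```

The returned string can be written to disk with `pathlib.Path(...).write_text(...)` or handed to the filesystem MCP.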

arxiv + notion

Post a weekly paper digest to a Notion research DB

Run the weekly digest for cs.CL, create a Notion page with the 5 must-reads as rows.

Tools

What this MCP provides

Tool	Inputs	When to call	Cost
search_papers	query: str, category?, max_results?, date_range?	Discover relevant papers by query/category/date	free
download_paper	paper_id	Cache a PDF locally for extraction	free
read_paper	paper_id	Extract text from a cached paper for reading/QA	free
list_papers		See what's already downloaded to avoid re-fetch	free
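
Clients normally issue these calls for you, but it can help to see what goes over the wire. Under the Model Context Protocol, a tool invocation is a JSON-RPC `tools/call` request. The sketch below builds such a message using the tool and input names from the table above; `tool_call` is a hypothetical helper, and the exact argument schema should be checked against the server's own tool listing.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need unique ids

def tool_call(name: str, **arguments) -> str:
    """Serialize an MCP tools/call request for the given tool and arguments."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })

if __name__ == "__main__":
    print(tool_call("search_papers", query="speculative decoding", max_results=20))
    print(tool_call("list_papers"))
```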

Cost & limits

What it costs

API quota: arXiv's query API recommends at most ~1 request every 3 seconds; higher rates may be throttled
Tokens per call: Search: 500–2000 tokens. Paper text: 5k–30k tokens per paper.
Price: Free
Tip: Cache aggressively; re-reading a paper's extracted text is free once downloaded.
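
The per-call token figures above make it easy to estimate what a session will consume before starting it. A small worked calculation, using only those ranges (the function name `token_budget` is hypothetical):

```python
SEARCH_TOKENS = (500, 2_000)     # per search call, from the figures above
PAPER_TOKENS = (5_000, 30_000)   # per extracted paper, from the figures above

def token_budget(searches: int, papers: int):
    """Return (low, high) token estimates for a session."""
    low = searches * SEARCH_TOKENS[0] + papers * PAPER_TOKENS[0]
    high = searches * SEARCH_TOKENS[1] + papers * PAPER_TOKENS[1]
    return low, high

if __name__ == "__main__":
    # A 10-paper survey with two searches: 51,000 to 304,000 tokens.
    print(token_budget(searches=2, papers=10))
```

The upper bound for a 10-paper survey exceeds many context windows, which is a good argument for summarizing papers one at a time.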

Security

Permissions, secrets, blast radius

Credential storage: None needed

Outbound traffic: Queries to export.arxiv.org; PDF downloads from arxiv.org

Respect arXiv's 1 req / 3s recommendation; don't parallelize aggressively.
Only cache papers with proper arXiv ids; do not mirror the full archive.
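
If you script against arXiv directly alongside the MCP, the 1 request / 3 s recommendation can be enforced with a small throttle. This is a minimal single-threaded sketch; the `Throttle` class is hypothetical, and the clock and sleep functions are injectable only so the behavior can be tested without waiting.

```python
import time

class Throttle:
    """Block so that consecutive calls are at least `interval` seconds apart."""

    def __init__(self, interval=3.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Call `throttle.wait()` immediately before each request; the first call returns at once and later calls sleep only for the time still owed.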

Troubleshooting

Common errors and fixes

Empty search results for a clearly existing topic

arXiv search is keyword-exact for quoted strings; try broader terms and the correct category prefix (cs.CL vs cs.AI).

Download failed / PDF unavailable

Very rare; some withdrawn papers 404. Confirm the id on arxiv.org/abs/<id>.
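
Before blaming the server, it is worth ruling out a malformed id. The sketch below checks an id against the two arXiv identifier styles (the current `YYMM.NNNNN` form and the older `archive/YYMMNNN` form, each with an optional version suffix) and builds the abs URL to open in a browser. The patterns are a reasonable approximation rather than an official validator, and `abs_url` is a hypothetical helper.

```python
import re

NEW_STYLE = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")                  # e.g. 2405.12345v2
OLD_STYLE = re.compile(r"^[a-z-]+(\.[A-Z]{2})?/\d{7}(v\d+)?$")      # e.g. hep-th/9901001

def abs_url(paper_id: str) -> str:
    """Return the arxiv.org/abs URL for a well-formed id, else raise ValueError."""
    if not (NEW_STYLE.match(paper_id) or OLD_STYLE.match(paper_id)):
        raise ValueError(f"not a well-formed arXiv id: {paper_id!r}")
    return f"https://arxiv.org/abs/{paper_id}"
```

A well-formed id can still 404 if the paper does not exist or was withdrawn; this check only catches typos and stray prefixes.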

Extracted text is garbled

Some math-heavy papers have non-standard PDFs; try the source version if available, else note the limitation.

Alternatives

How arXiv compares

Alternative	When to use	Trade-off
Semantic Scholar MCP	You need citation counts and influence metrics	Not arXiv-specific; coverage varies
Papers with Code MCP	You want papers with code implementations linked	Smaller catalog, ML-focused

More

Resources

📖 Read the official README on GitHub

🐙 Open issues
