Jina AI MCP: Install & Live Demo

Why use it

Key features

First-party: the official Jina AI MCP
read_url returns clean markdown and handles JS-rendered sites
Search the web, arXiv, SSRN, images, and BibTeX through one interface
Processing tools: rerank, classify, deduplicate (text and images), extract_pdf

Install

Choose your client

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "jina": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://mcp.jina.ai/sse"
      ]
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. After saving, restart the app.
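
If the config file already lists other servers, pasting the whole snippet would overwrite them. The sketch below merges only the "jina" entry into an existing file. It is a minimal illustration: the helper name add_jina_server is hypothetical, and the caller supplies the path to the config file.

```python
import json
from pathlib import Path

# The server entry from the snippet above, expressed as a Python dict.
JINA_ENTRY = {
    "command": "npx",
    "args": ["-y", "mcp-remote", "https://mcp.jina.ai/sse"],
}


def add_jina_server(config_path: Path) -> dict:
    """Add the "jina" entry under mcpServers, keeping any existing servers."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():  # tolerate an empty file
            config = json.loads(text)
    config.setdefault("mcpServers", {})["jina"] = JINA_ENTRY
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

The same helper should work for any client whose file uses the map-style mcpServers key.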

~/.cursor/mcp.json · .cursor/mcp.json

{
  "mcpServers": {
    "jina": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://mcp.jina.ai/sse"
      ]
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project-level config takes precedence over the global config.

VS Code → Cline → MCP Servers → Edit

{
  "mcpServers": {
    "jina": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://mcp.jina.ai/sse"
      ]
    }
  }
}

Click the MCP Servers icon in the Cline sidebar and select "Edit Configuration".

~/.codeium/windsurf/mcp_config.json

{
  "mcpServers": {
    "jina": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://mcp.jina.ai/sse"
      ]
    }
  }
}

Same format as Claude Desktop. Restart Windsurf to apply the change.

~/.continue/config.json

{
  "mcpServers": [
    {
      "name": "jina",
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://mcp.jina.ai/sse"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.
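
To make the difference concrete, here is a small sketch that converts the map-style mcpServers object used by the other clients into the array shape shown above. The function name map_to_array is hypothetical.

```python
def map_to_array(mcp_servers: dict) -> list:
    """Turn {"jina": {...}} into [{"name": "jina", ...}]."""
    return [{"name": name, **spec} for name, spec in mcp_servers.items()]


if __name__ == "__main__":
    servers = {
        "jina": {
            "command": "npx",
            "args": ["-y", "mcp-remote", "https://mcp.jina.ai/sse"],
        }
    }
    print(map_to_array(servers))
```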

~/.config/zed/settings.json

{
  "context_servers": {
    "jina": {
      "command": {
        "path": "npx",
        "args": [
          "-y",
          "mcp-remote",
          "https://mcp.jina.ai/sse"
        ]
      }
    }
  }
}

Add the entry under context_servers. Zed hot-reloads on save.
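
Zed nests the executable and its arguments under a "command" object. The following sketch reshapes a map-style mcpServers entry into the context_servers layout shown above. It only mirrors the snippet on this page; the function name is hypothetical, and any other keys Zed may accept are not covered.

```python
def to_zed_context_servers(mcp_servers: dict) -> dict:
    """Reshape {"name": {"command", "args"}} into Zed's context_servers layout."""
    return {
        "context_servers": {
            name: {
                "command": {
                    "path": spec["command"],
                    "args": list(spec.get("args", [])),
                }
            }
            for name, spec in mcp_servers.items()
        }
    }
```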

claude mcp add jina -- npx -y mcp-remote https://mcp.jina.ai/sse

One-liner. Verify with claude mcp list; remove with claude mcp remove.

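Use cases

Practical ways to use Jina AI

Digest recent arXiv papers on a topic

👤 Researchers and ML engineers staying current ⏱ ~20 min intermediate

When to use: You want to know what is new on arXiv for your topic without reading 50 abstracts.

Prerequisites

Optional Jina API key: jina.ai → Dashboard → API Keys (the free tier is fine for light use)

Flow

Search arXiv

Use search_arxiv to find papers from the last 30 days on 'speculative decoding for LLM inference'. Return the top 20.

→ Paper list with titles, authors, and abstracts

Rerank by relevance

Use sort_by_relevance to rerank against this query: 'practical speedups in production inference, not pure research'. Keep the top 8.

→ Reranked list

Summarize each

For each of the top 8, run extract_pdf on the paper and summarize it in 3 bullets: contribution, method, reported speedup. Output as a markdown table.

→ Digest-ready summary table

Outcome: A weekly research digest on your topic in 10 minutes.

Pitfalls

Running extract_pdf on every result is expensive and the credits add up. Rerank first to narrow the candidates, then extract only the top N.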
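
The ordering matters because the expensive step runs once per surviving candidate. The sketch below shows the shape of that pipeline with the two steps injected as plain callables: score stands in for a sort_by_relevance call and extract stands in for extract_pdf. Both are hypothetical placeholders, not the MCP tools themselves.

```python
from typing import Callable, Sequence


def digest_candidates(
    papers: Sequence[dict],
    score: Callable[[dict], float],
    extract: Callable[[dict], str],
    top_n: int = 8,
) -> list[tuple[dict, str]]:
    """Rank every paper with the cheap scorer, then extract only the top_n."""
    ranked = sorted(papers, key=score, reverse=True)
    # extract() is called top_n times at most, not len(papers) times.
    return [(paper, extract(paper)) for paper in ranked[:top_n]]
```

With 20 search results and top_n=8, the extraction step runs 8 times instead of 20.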

Combine with: notion

Convert a batch of URLs to clean markdown for RAG

👤 AI engineers building retrieval systems ⏱ ~15 min intermediate

When to use: You have a list of URLs to ingest and need clean markdown, not raw HTML or a parsing pipeline.

Flow

Read URLs in parallel

For this list [URLs], use parallel_read_url. Return the markdown for each, keyed by the original URL.

→ Markdown per URL

Deduplicate near-duplicates

Use deduplicate_strings with a similarity threshold of 0.9 to drop near-duplicate pages (common with mirrored docs).

→ Deduplicated set, with the IDs of the dropped pages

Save to disk

Save each page to ./knowledge/<slug>.md, with the slug derived from the URL path.

→ Markdown files ready for embedding pipeline
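
For intuition about what a 0.9 similarity threshold does in the deduplication step, here is a local stand-in built on Python's difflib. It compares raw characters, which is almost certainly not how deduplicate_strings measures similarity, so treat it as an illustration of thresholded near-duplicate removal and not as a reimplementation of the tool.

```python
from difflib import SequenceMatcher


def drop_near_duplicates(
    pages: dict[str, str], threshold: float = 0.9
) -> tuple[dict[str, str], list[str]]:
    """Return (kept, dropped): kept maps URL -> markdown, dropped lists URLs."""
    kept: dict[str, str] = {}
    dropped: list[str] = []
    for url, text in pages.items():
        # A page is a near-duplicate if it is close to any page already kept.
        is_duplicate = any(
            SequenceMatcher(None, text, other).ratio() >= threshold
            for other in kept.values()
        )
        if is_duplicate:
            dropped.append(url)
        else:
            kept[url] = text
    return kept, dropped
```

The first page seen wins, so the input order decides which mirror survives. The pairwise comparison is quadratic and only suitable for small batches.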

Outcome: A clean corpus for your embedding/indexing step, with no scraping code to write.
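
The save step derives each file name from the URL path. One possible slug rule is sketched below; the exact rule is an assumption (lowercase, with runs of non-alphanumeric characters collapsed to hyphens), and both function names are hypothetical.

```python
import re
from pathlib import Path
from urllib.parse import urlparse


def slug_from_url(url: str) -> str:
    """Derive a filesystem-safe slug from the path component of a URL."""
    path = urlparse(url).path.strip("/")
    slug = re.sub(r"[^a-z0-9]+", "-", path.lower()).strip("-")
    return slug or "index"  # the site root has an empty path


def target_path(url: str, root: Path = Path("knowledge")) -> Path:
    """Where the markdown for this URL would be written."""
    return root / f"{slug_from_url(url)}.md"
```

Two URLs with the same path on different hosts map to the same slug, so a mixed-host batch may need the hostname added to the slug.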

Pitfalls

Paywalled or JS-auth-walled pages return blank or garbage output. Spot-check a few URLs; if the content is thin, fall back to playwright for auth flows.

Combine with: filesystem · firecrawl

Classify a batch of text with custom labels

👤 Data analysts, growth teams ⏱ ~15 min beginner

When to use: You have N free-text items (tickets, reviews, survey responses) and want them bucketed into your taxonomy.

Flow

Define labels

My labels: ['bug', 'feature_request', 'question', 'praise', 'other']. Sample the first 10 items and sanity-check the labels fit.

→ Labels validated against samples

Batch classify

Use classify_text on all items with those labels. Return {id, text, label, confidence}.

→ Labelled dataset

Review low-confidence

Flag items where confidence < 0.6 for manual review. Summarize: distribution, outliers, likely missing labels.

→ Review queue + taxonomy feedback
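
The review step can also be done locally once the labelled rows are in hand. This sketch assumes each row has the {id, text, label, confidence} shape requested in the prompt above; the function name review_queue is hypothetical.

```python
from collections import Counter


def review_queue(rows: list[dict], min_confidence: float = 0.6):
    """Split out low-confidence rows and count how often each label was used."""
    flagged = [row for row in rows if row["confidence"] < min_confidence]
    distribution = Counter(row["label"] for row in rows)
    return flagged, distribution
```

A label that dominates the flagged rows is a hint that the taxonomy is missing a category or that two labels overlap.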

Outcome: A labeled dataset without fine-tuning a classifier or writing prompts per item.

Pitfalls

When labels are ambiguous, the classifier flip-flops on near-ties. Make labels mutually exclusive; if items span categories, allow multi-label output.

Combine with: filesystem

Combinations

Pair with other MCPs for 10x leverage

jina + notion

Weekly research digest posted to Notion

Search arXiv for new 'agentic RAG' papers this week. Summarize each and create a Notion page in the Research Digest database.

jina + firecrawl

Jina for single pages, Firecrawl for full crawls; both produce the same clean-markdown output

For the list of URLs, use parallel_read_url (Jina). For the 3 full docs sites, use Firecrawl crawl. Merge into one knowledge dir.

jina + filesystem

Build a local markdown knowledge base from a reading list

Read each URL in urls.txt, dedupe, save to ./knowledge/<hash>.md. Overwrite only if content changed.
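
The "overwrite only if content changed" rule is easy to get wrong, so here is one way it could look. The prompt does not say what <hash> is a hash of; this sketch assumes a truncated SHA-256 of the URL, and the function name save_if_changed is hypothetical.

```python
import hashlib
from pathlib import Path


def save_if_changed(url: str, markdown: str, root: Path) -> bool:
    """Write markdown to root/<hash>.md; return True only if a write happened."""
    # Assumption: the file name is a short hash of the URL, so it stays stable
    # across runs even when the page content changes.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    target = root / f"{name}.md"
    data = markdown.encode("utf-8")
    if target.exists() and target.read_bytes() == data:
        return False  # identical content, leave the file and its mtime alone
    root.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return True
```

Skipping identical writes keeps file modification times meaningful, which helps a downstream embedding job re-index only what changed.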

Tools

What this MCP exposes

Tool	Inputs	When to call	Cost
search_web	query, num_results?	General web search	credits per call
search_arxiv / search_ssrn / search_bibtex / search_images / search_jina_blog	query	Targeted searches	credits per call
parallel_search_web / parallel_search_arxiv / parallel_search_ssrn	query[]	Multi-query research in one call	credits × N queries
read_url	url	Clean content extraction from any URL	credits per page
parallel_read_url	url[]	Batch webpage ingestion	credits × N pages
capture_screenshot_url	url	Visual snapshot of a page	credits
sort_by_relevance	documents, query	Rerank after search for quality	credits
classify_text	texts, labels	Zero-shot classification	credits per text
deduplicate_strings / deduplicate_images	items, threshold	Remove near-duplicates from a corpus	credits
extract_pdf	url or file	Get structured content from PDFs	credits per PDF
expand_query / primer / guess_datetime_url	utility	Helpers around search tuning	credits (minor)
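
MCP clients invoke these tools with JSON-RPC 2.0 "tools/call" requests. The sketch below builds such a request for read_url. The argument name "url" is taken from the Inputs column above and is an assumption about the real schema; the authoritative input schema is whatever the server returns from "tools/list".

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


if __name__ == "__main__":
    print(tool_call_request(1, "read_url", {"url": "https://example.com"}))
```

In normal use the client (Claude Desktop, Cursor, and so on) builds these messages itself; the sketch is only meant to show what travels over the wire.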

Cost & limits

What it costs to run

API quota: Free tier available with rate limits; paid tiers scale
Tokens per call: Output is the bigger cost; PDFs and dedupes can return 10k+ tokens
Pricing: Jina API credits, typically measured per request. See jina.ai/pricing.
Tip: Rerank before extracting, since extract_pdf is expensive. Cache read_url outputs locally; most pages don't change daily.
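
A local cache for read_url output can be as small as the sketch below. The fetch callable is a hypothetical stand-in for whatever actually retrieves the markdown, and the one-day default lifetime is an assumption based on the "most pages don't change daily" remark.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable


def cached_read(
    url: str,
    fetch: Callable[[str], str],
    cache_dir: Path,
    max_age_seconds: float = 24 * 3600,
) -> str:
    """Return cached markdown for url if it is fresh enough, else fetch and store it."""
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.md"
    if path.exists() and time.time() - path.stat().st_mtime < max_age_seconds:
        return path.read_text(encoding="utf-8")  # cache hit: no credits spent
    markdown = fetch(url)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(markdown, encoding="utf-8")
    return markdown
```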

Security

Permissions, secrets, blast radius

Credential storage: JINA_API_KEY env var (optional for many tools, required for heavy use)

Data egress: All calls go to api.jina.ai / r.jina.ai / s.jina.ai, so queries and URLs are visible to Jina

Troubleshooting

Common errors and fixes

429 Too Many Requests

Free tier has low rate limits. Add a JINA_API_KEY env var and upgrade at jina.ai for burst capacity.
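
If you drive the API from your own scripts, retrying with exponential backoff is a common way to ride out short rate-limit windows. The sketch below is generic: RateLimited is a hypothetical exception that your own calling code would raise when it sees a 429, and nothing here is specific to Jina's client libraries.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical: raised by the caller's code when a request returns 429."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry call() on RateLimited, doubling the wait each time and adding jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
    raise AssertionError("unreachable")  # the loop always returns or raises
```

The sleep parameter is injectable so the function can be tested without real delays.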

read_url returns empty markdown

The page may be auth-walled or bot-blocked. Try a different User-Agent via tool options, or fall back to playwright/firecrawl.

classify_text assigns everything to 'other'

Your labels may be too narrow or too similar. Add label descriptions ('bug: user reports something broken') for better zero-shot accuracy.
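
One way to keep described labels manageable is to store the descriptions separately and map results back to the short names. In the sketch below only the "bug" description comes from the text above; the other descriptions, and both helper names, are hypothetical examples.

```python
# Only the "bug" description is from the text above; the rest are
# hypothetical examples for illustration.
LABEL_DESCRIPTIONS = {
    "bug": "user reports something broken",
    "feature_request": "user asks for new functionality",
    "question": "user asks how to do something",
    "praise": "user expresses satisfaction",
    "other": "none of the above",
}


def described_labels(descriptions: dict[str, str]) -> list[str]:
    """Build 'name: description' strings to pass as labels."""
    return [f"{name}: {text}" for name, text in descriptions.items()]


def short_label(described: str) -> str:
    """Map a 'name: description' label back to its short name."""
    return described.split(":", 1)[0].strip()
```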

search_arxiv misses recent papers

arXiv index may lag; cross-check with a direct arxiv.org search. Use expand_query to broaden terms.

Alternatives

Jina AI vs. others

Alternative	When to use it instead	Tradeoff
Firecrawl	You need full-site crawls or JSON-schema extraction	Crawl-focused; Jina's superpower is the breadth of processing tools beyond just reading
Exa Search MCP	You want semantic/neural web search as a primary interface	Stronger on semantic retrieval; narrower than Jina's toolbox
Brave Search MCP	You want independent search index + privacy	Search only, no reader/rerank/classify

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues