How to make a new agent productive in a large repo fast
When to use: The agent wastes ~30% of its context window reading and re-reading files.
Prerequisites
- Node plus bun or npm — brew install bun, or use npm
- An embeddings provider (local Ollama, OpenAI, Gemini, or Groq) — run ollama pull nomic-embed-text for offline use
Flow
- Build the context tree — run get_context_tree on the repo root, then summarize the top-level layers. → AST tree with file headers
- Skeleton-only reads — use get_file_skeleton on src/auth/ to see just the signatures; don't read function bodies yet. → Function signatures without bodies
- Ask a semantic question — semantic_identifier_search: 'where is JWT verification implemented and called?' → Ranked implementations + call sites
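The internals of get_file_skeleton aren't shown here, but the idea behind skeleton-only reads is easy to see: strip every function body and keep only the signatures. A minimal sketch using Python's standard ast module (the file_skeleton helper and the sample source are illustrative, not part of the tool):

```python
import ast

def file_skeleton(source: str) -> list[str]:
    """Return signature-only lines -- the idea behind skeleton reads."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            lines.append(f"def {node.name}({args}): ...")
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}: ...")
    return lines

# Toy module standing in for a file like src/auth/tokens.py
src = """
class TokenService:
    def verify_jwt(self, token):
        return decode(token)

def login(user, password):
    return TokenService().verify_jwt(password)
"""
for line in file_skeleton(src):
    print(line)
```

The agent gets the shape of the module (what exists, what it's called, what it takes) for a fraction of the tokens a full read would cost.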
Result: The agent operates with a mental model of the repo using roughly 5x less context.
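The semantic step above boils down to ranking indexed identifiers by embedding similarity to the query. A sketch of that ranking, assuming vectors already came back from the embeddings provider (the identifiers and all vector values below are made up for illustration):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical index: identifier -> embedding. Real vectors would come
# from the provider (e.g. nomic-embed-text); these are toy 3-d values.
index = {
    "verify_jwt":  [0.9, 0.1, 0.0],
    "check_token": [0.7, 0.3, 0.2],
    "render_page": [0.0, 0.2, 0.9],
}

# Toy embedding of "where is JWT verification implemented and called?"
query_vec = [0.85, 0.2, 0.05]

ranked = sorted(index, key=lambda k: cosine(index[k], query_vec), reverse=True)
print(ranked)  # JWT-related identifiers rank above unrelated ones
```

Because ranking depends entirely on vector geometry, the index and the query must come from the same embedding model — which is exactly the pitfall noted below.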
Pitfalls
- First-run indexing is slow — run the initial scan once; incremental updates afterwards are fast
- Embedding model mismatch between index and query — stick with one embedding model, and re-index if you change it