Find the skill pulling your agent's performance down
Когда использовать: You feel the agent has gotten worse, not better, as you added skills.
Предварительные требования
- Node 20+ — nvm install 20
- Skill cloned and installed — git clone https://github.com/Evol-ai/SkillCompass ~/.claude/skills/SkillCompass; npm i
Поток
-
Run the evaluatorScore all skills in ~/.claude/skills/ — show me the weakest link.✓ Скопировано→ Ranked skill list with per-dimension scores
-
Diagnose the loserFor the weakest skill, what specifically is wrong?✓ Скопировано→ Concrete critique (vague description, conflicting with other skill, etc.)
-
Propose a fixSuggest a minimal edit to SKILL.md to fix it.✓ Скопировано→ Small, reviewable diff
-
Re-evaluateRe-run the eval and show before/after.✓ Скопировано→ Metrics improved, with evidence
Итог: A measurably better skill bundle, with a reproducible eval process.
Подводные камни
- Gaming the eval metric instead of helping real tasks — Include task-level downstream metrics (actual agent outcomes), not just text-level