Find the skill pulling your agent's performance down
When to use: Your agent seems to have gotten worse, not better, as you added skills.
Prerequisites
- Node 20+: nvm install 20
- Skill cloned and installed: git clone https://github.com/Evol-ai/SkillCompass ~/.claude/skills/SkillCompass && cd ~/.claude/skills/SkillCompass && npm i
Flow
- Run the evaluator: "Score all skills in ~/.claude/skills/ and show me the weakest link." → Ranked skill list with per-dimension scores
- Diagnose the loser: "For the weakest skill, what specifically is wrong?" → Concrete critique (vague description, conflict with another skill, etc.)
- Propose a fix: "Suggest a minimal edit to SKILL.md to fix it." → Small, reviewable diff
- Re-evaluate: "Re-run the eval and show before/after." → Metrics improved, with evidence
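The first and last steps come down to ranking skills by score and comparing two eval runs. A minimal sketch of that bookkeeping follows. The skill names, dimension names, 0-10 scale, and the helpers `rank_skills` and `score_deltas` are all hypothetical; SkillCompass's actual output format is not shown here and may differ.

```python
from statistics import mean

# Hypothetical per-dimension scores (0-10). Skill and dimension names are
# placeholders, not SkillCompass's real output format.
before = {
    "pdf-tools": {"clarity": 8.0, "trigger": 7.0, "overlap": 9.0},
    "git-helper": {"clarity": 4.0, "trigger": 5.0, "overlap": 3.0},
}
after = {
    "pdf-tools": {"clarity": 8.0, "trigger": 7.0, "overlap": 9.0},
    "git-helper": {"clarity": 7.0, "trigger": 6.0, "overlap": 6.0},
}


def rank_skills(scores):
    """Return (skill, mean score) pairs sorted weakest first."""
    pairs = ((name, mean(dims.values())) for name, dims in scores.items())
    return sorted(pairs, key=lambda pair: pair[1])


def score_deltas(before, after, skill):
    """Per-dimension change for one skill between two eval runs."""
    return {dim: after[skill][dim] - before[skill][dim] for dim in before[skill]}


weakest, _ = rank_skills(before)[0]
deltas = score_deltas(before, after, weakest)
print(weakest, deltas)
```

Keeping the per-dimension deltas, rather than one aggregate number, makes it easier to see which part of the SKILL.md edit actually moved the score.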
Result: A measurably better skill bundle, with a reproducible eval process.
Pitfalls
- Gaming the eval metric instead of helping real tasks: include task-level downstream metrics (actual agent outcomes), not just text-level scores.
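One way to guard against this pitfall is to accept a SKILL.md edit only when the text-level score rises and the task-level pass rate does not drop. The sketch below assumes each downstream task yields a simple pass/fail outcome; `pass_rate`, `accept_fix`, and the `tolerance` parameter are illustrative names, not part of SkillCompass.

```python
def pass_rate(outcomes):
    """Fraction of downstream tasks the agent completed successfully."""
    if not outcomes:
        raise ValueError("need at least one task outcome")
    return sum(outcomes) / len(outcomes)


def accept_fix(text_before, text_after, tasks_before, tasks_after, tolerance=0.0):
    """Accept an edit only if the text-level score rose and the
    task-level pass rate did not fall by more than `tolerance`."""
    text_improved = text_after > text_before
    tasks_held = pass_rate(tasks_after) >= pass_rate(tasks_before) - tolerance
    return text_improved and tasks_held


# Hypothetical outcomes for the same four tasks, before and after an edit.
baseline_tasks = [True, True, False, True]

# Text score doubled but real tasks got worse: likely metric gaming.
gamed = accept_fix(4.0, 8.0, baseline_tasks, [True, False, False, True])

# Text score improved and tasks improved too: a genuine fix.
genuine = accept_fix(4.0, 7.0, baseline_tasks, [True, True, True, True])

print(gamed, genuine)
```

A small nonzero `tolerance` can absorb run-to-run noise in agent outcomes, at the cost of letting slight regressions through.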