data-engineering-skills

By: AltimateAI · AltimateAI/data-engineering-skills

9 Claude Code skills for analytics engineering: 7 dbt workflows + 2 Snowflake query optimizers. 53% pass on real dbt tasks, 84% on Snowflake tuning.

Skills for the daily grind of analytics engineering. The dbt skills cover creating, debugging, testing, documenting, migrating, and refactoring models, plus developing incremental models. The Snowflake skills find expensive queries and optimize them either by query text or by query_id. Philosophy: 'Read before you write. Build after you write. Verify your output.'

Install

Choose a client

Note: every entry below is marked "_inferred": true and wraps a one-time git clone. git clone exits once the repository is copied rather than running as a server, and a literal ~ in an argument list is not expanded unless a shell does it, so running the clone once from a terminal is likely the more direct install.

~/Library/Application Support/Claude/claude_desktop_config.json  · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "data-engineering-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/AltimateAI/data-engineering-skills",
        "~/.claude/skills/data-engineering-skills"
      ],
      "_inferred": true
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart the app after saving.

~/.cursor/mcp.json · .cursor/mcp.json
{
  "mcpServers": {
    "data-engineering-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/AltimateAI/data-engineering-skills",
        "~/.claude/skills/data-engineering-skills"
      ],
      "_inferred": true
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config takes precedence over global config.

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "data-engineering-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/AltimateAI/data-engineering-skills",
        "~/.claude/skills/data-engineering-skills"
      ],
      "_inferred": true
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then select "Edit Configuration".

~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "data-engineering-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/AltimateAI/data-engineering-skills",
        "~/.claude/skills/data-engineering-skills"
      ],
      "_inferred": true
    }
  }
}

Same format as Claude Desktop. Restart Windsurf to apply.

~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "data-engineering-skill",
      "command": "git",
      "args": [
        "clone",
        "https://github.com/AltimateAI/data-engineering-skills",
        "~/.claude/skills/data-engineering-skills"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.

~/.config/zed/settings.json
{
  "context_servers": {
    "data-engineering-skill": {
      "command": {
        "path": "git",
        "args": [
          "clone",
          "https://github.com/AltimateAI/data-engineering-skills",
          "~/.claude/skills/data-engineering-skills"
        ]
      }
    }
  }
}

Add the entry under context_servers. Zed hot-reloads on save.

claude mcp add data-engineering-skill -- git clone https://github.com/AltimateAI/data-engineering-skills ~/.claude/skills/data-engineering-skills

One-line command. Verify with claude mcp list; remove with claude mcp remove.
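
Since every install path above ends in the same git clone, the clone can also be scripted directly. The following is a minimal sketch, assuming git is on PATH; the helper names clone_command and clone_skills are hypothetical and not part of the skill pack. It expands ~ explicitly, because a process argument list does not go through shell expansion.

```python
import os
import subprocess

REPO_URL = "https://github.com/AltimateAI/data-engineering-skills"
DEST = "~/.claude/skills/data-engineering-skills"


def clone_command(dest: str = DEST) -> list[str]:
    """Build the git clone argv with ~ expanded to an absolute path."""
    return ["git", "clone", REPO_URL, os.path.expanduser(dest)]


def clone_skills(dest: str = DEST) -> None:
    """Clone the repo once; do nothing if the destination already exists."""
    target = os.path.expanduser(dest)
    if os.path.isdir(target):
        print(f"already installed at {target}")
        return
    # check=True raises CalledProcessError if git exits non-zero.
    subprocess.run(clone_command(dest), check=True)
```

Calling clone_skills() once performs the install; calling it again is a no-op because the destination directory already exists.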

Use cases

Practical uses: data-engineering-skills

Debug a failing dbt model without thrashing

👤 Analytics engineers facing a red CI run ⏱ ~20 min intermediate

When to use: dbt run just failed with a cryptic error and you don't know if it's schema, lineage, or SQL.

Prerequisites
  • dbt project accessible: cd into your dbt repo so Claude can see models/
  • Skill installed: git clone https://github.com/AltimateAI/data-engineering-skills ~/.claude/skills/data-engineering-skills
Flow
  1. Feed Claude the error + model
    Use debugging-dbt-errors. Here's the stderr and models/marts/fct_orders.sql. Diagnose the root cause; don't guess.
    → Claude reads upstream refs, diagnoses in order: schema → lineage → SQL
  2. Apply the fix and verify
    Apply the fix and run dbt build --select fct_orders+. Show me the before/after row counts.
    → Clean run + row count verification

Outcome: Green CI plus a note of the root cause so it doesn't recur.

Pitfalls
  • Fixing a symptom downstream when the bug is upstream: the skill enforces an upstream-first diagnosis, so don't skip the lineage step
Pairs with: bigquery-server · github
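
The upstream-first step starts from the models a failing model depends on. As an illustration only (this is not the skill's implementation), the sketch below pulls single-argument ref() calls out of a model's SQL with a regular expression. The sample SQL and the function name upstream_refs are hypothetical.

```python
import re

# Matches {{ ref('model') }} and {{ ref("model") }}. The two-argument
# form ref('package', 'model') is deliberately not handled in this sketch.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def upstream_refs(model_sql: str) -> list[str]:
    """Return the distinct models referenced via ref(), in first-seen order."""
    seen: dict[str, None] = {}
    for name in REF_PATTERN.findall(model_sql):
        seen.setdefault(name, None)
    return list(seen)


# Hypothetical contents of models/marts/fct_orders.sql
sample_sql = """
select o.order_id, c.customer_name
from {{ ref('stg_orders') }} as o
join {{ ref("stg_customers") }} as c using (customer_id)
left join {{ ref('stg_orders') }} as prev using (order_id)
"""

print(upstream_refs(sample_sql))  # ['stg_orders', 'stg_customers']
```

Each returned name is a candidate to inspect before touching the failing model itself.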

Find and fix your top expensive Snowflake queries

👤 Analytics leads with a climbing Snowflake bill ⏱ ~60 min intermediate

When to use: Finance flagged the Snowflake bill and you need to cut it without breaking dashboards.

Prerequisites
  • Snowflake role with ACCOUNT_USAGE access: typically ACCOUNTADMIN, or a dedicated cost role
Flow
  1. Identify worst offenders
    Use finding-expensive-queries to list the top 20 queries in the past 30 days by credit cost. Group by app/user.
    → Ranked table with credits, runtime, warehouse
  2. Optimize each top one
    For the top offender, use optimizing-query-by-id <query_id>. Propose rewrites with estimated savings.
    → Rewritten SQL + before/after explain plan
  3. Validate and deploy
    Run the rewrite in a test warehouse and confirm the same row count and shape before we swap.
    → Safe swap candidate

Outcome: A prioritized list of fixes with measurable $ savings.

Pitfalls
  • Rewrites change row count silently: always diff before deploying (the skill enforces this)
Pairs with: bigquery-server
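
The ranking in step 1 comes down to sorting by credits and summing per user. Here is a minimal sketch of that logic over in-memory rows. The rows and field names are made up for illustration and do not mirror an actual Snowflake view.

```python
from collections import defaultdict

# Hypothetical rows shaped like an exported query-cost report.
queries = [
    {"query_id": "q1", "user": "etl_svc", "credits": 12.5},
    {"query_id": "q2", "user": "bi_tool", "credits": 40.0},
    {"query_id": "q3", "user": "etl_svc", "credits": 7.5},
    {"query_id": "q4", "user": "analyst", "credits": 3.0},
]


def top_queries(rows, n=20):
    """Return the n most expensive queries, highest credits first."""
    return sorted(rows, key=lambda r: r["credits"], reverse=True)[:n]


def credits_by_user(rows):
    """Sum credits per user and rank users by total, highest first."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["user"]] += row["credits"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


print([q["query_id"] for q in top_queries(queries, n=2)])  # ['q2', 'q1']
print(credits_by_user(queries))
```

Looking at both views helps separate a single runaway query from a user or app that is expensive in aggregate.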

Migrate a pile of stored procs into dbt models

👤 Teams moving off legacy SQL to dbt ⏱ ~90 min advanced

When to use: You've inherited a warehouse of nested CTEs and want them as documented, tested dbt models.

Flow
  1. Point the skill at the source SQL
    Use migrating-sql-to-dbt. Here's proc_monthly_revenue.sql. Convert it to dbt models with refs, documentation, and at least 2 tests per model.
    → One or more .sql files, schema.yml with docs and tests
  2. Build and verify
    dbt build the new models and compare row counts to the legacy output.
    → Row counts match within tolerance

Outcome: Legacy logic lives as testable dbt models.

Pitfalls
  • Hidden side effects in the proc (UPDATEs): the skill flags side effects, so separate them out rather than converting blindly
Pairs with: github
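
"Row counts match within tolerance" can be made concrete as a relative-difference check. The sketch below is one way to express it. The source does not state a tolerance, so the default of 0 (exact match) and the 1% used in the example are assumptions, and counts_match is a hypothetical helper.

```python
def counts_match(legacy: int, migrated: int, tolerance: float = 0.0) -> bool:
    """True if migrated is within a relative tolerance of the legacy count."""
    if legacy < 0 or migrated < 0:
        raise ValueError("row counts must be non-negative")
    if legacy == 0:
        # No baseline to scale against: only an empty result matches.
        return migrated == 0
    return abs(migrated - legacy) / legacy <= tolerance


print(counts_match(1000, 995, tolerance=0.01))  # True: off by 0.5%
print(counts_match(1000, 980, tolerance=0.01))  # False: off by 2%
```

A non-zero tolerance is mainly useful when the legacy output and the new model are built at slightly different times. For a like-for-like snapshot, an exact match is the safer bar.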

Convert a slow full-refresh model to incremental

👤 Analytics engineers with long-running dbt runs ⏱ ~45 min advanced

When to use: A daily model has grown too big for full refresh.

Flow
  1. Analyze the model
    Use developing-incremental-models on models/events.sql. Pick a strategy (merge / insert_overwrite / delete+insert) and justify.
    → Strategy + unique_key + partition / cluster keys recommended
  2. Implement and back-fill
    Apply the incremental config; outline a safe back-fill plan.
    → Model + back-fill steps

Outcome: Daily runs that finish in minutes, not hours.

Pitfalls
  • Late-arriving data creates duplicate unique_key values: use the merge strategy and test the key for uniqueness
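
To see why late data causes duplicates, compare appending a batch with merging it on the key. The following is a simplified in-memory model of the two behaviors, not dbt's actual implementation. The table contents and function names are hypothetical.

```python
def append(table, batch):
    """Append strategy: every incoming row is inserted as-is."""
    return table + batch


def merge(table, batch, unique_key):
    """Merge strategy: update rows whose key exists, insert the rest."""
    by_key = {row[unique_key]: row for row in table}
    for row in batch:
        by_key[row[unique_key]] = row  # update if present, insert if not
    return list(by_key.values())


def is_unique(rows, key):
    """Same idea as a uniqueness test on the key column."""
    values = [row[key] for row in rows]
    return len(values) == len(set(values))


# Hypothetical events table plus a late-arriving correction for event 2.
existing = [
    {"event_id": 1, "status": "done"},
    {"event_id": 2, "status": "pending"},
]
late_batch = [
    {"event_id": 2, "status": "done"},
    {"event_id": 3, "status": "pending"},
]

appended = append(existing, late_batch)
merged = merge(existing, late_batch, "event_id")

print(len(appended), is_unique(appended, "event_id"))  # 4 False
print(len(merged), is_unique(merged, "event_id"))      # 3 True
```

The appended table holds event 2 twice, while the merged table keeps only the corrected row, which is the behavior a uniqueness test on the key would verify.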

Combinations

Combine with other MCPs for 10x efficiency

data-engineering-skill + bigquery-server

Apply the same optimize-by-id pattern to BigQuery expensive queries

Adapt finding-expensive-queries for BigQuery INFORMATION_SCHEMA.JOBS and list top 20.

data-engineering-skill + github

Open a PR per migrated model so each is reviewable

For every migrated model, open a GitHub PR with dbt test output attached.

Tools

What this MCP exposes

Tool                           Input               When to call              Cost
creating-dbt-models            model spec          New model                 0
debugging-dbt-errors           error log, model    CI or local run failed    0
testing-dbt-models             model               Untested model            0
documenting-dbt-models         model               Undocumented model        0
migrating-sql-to-dbt           legacy SQL          Legacy migration          0
refactoring-dbt-models         model               Hard-to-read model        0
developing-incremental-models  full-refresh model  Runtime too long          0
finding-expensive-queries      lookback window     Cost hunt                 ACCOUNT_USAGE query
optimizing-query-text          SQL text            Know the SQL, not the id  0
optimizing-query-by-id         query_id            Have the id from the UI   1 explain

Cost and limits

Operating costs

API quota: Snowflake queries cost credits like any other; ACCOUNT_USAGE reads are cheap
Tokens per call: 5–15k per dbt skill invocation
Price: Free skill
Tip: Run finding-expensive-queries once weekly, not on every session

Security

Permissions, secrets, blast radius

Minimum scope: dbt: read + write to your project; Snowflake: ACCOUNT_USAGE for cost skills
Credential storage: dbt profiles.yml / Snowflake key-pair in env; the skill doesn't store secrets
Outbound data: None from the skill directly
Never grant: SYSADMIN to the Claude session unless absolutely needed

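Troubleshooting

Common errors and fixes

dbt compile succeeds, run fails with column not found

Stale lineage: run dbt clean, then dbt deps, then dbt build --select model+. Run clean first, because with the default clean-targets it also removes installed packages.

finding-expensive-queries returns nothing

ACCOUNT_USAGE views lag by roughly 45 minutes; also confirm the role can read SNOWFLAKE.ACCOUNT_USAGE

Check: SHOW GRANTS TO ROLE <role>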
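
The delay explains why a short lookback can come back empty. As a rough illustration, the sketch below computes the part of a lookback window that is likely to be populated, treating the ~45 minute figure above as an approximation. visible_window is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# ~45 minutes is the delay quoted above; treat it as an approximation.
ACCOUNT_USAGE_LAG = timedelta(minutes=45)


def visible_window(now: datetime, lookback: timedelta):
    """Return (start, end) of the period likely populated, or None."""
    end = now - ACCOUNT_USAGE_LAG
    start = now - lookback
    if start >= end:
        return None  # lookback is no longer than the lag: expect no rows
    return start, end


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(visible_window(now, timedelta(minutes=30)))  # None
print(visible_window(now, timedelta(days=30)))
```

With a 30-minute lookback nothing is visible yet, whereas a 30-day lookback only loses the most recent 45 minutes.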

Alternatives

data-engineering-skills compared with others

Alternative                              When to use                          Trade-off
dbt Cloud IDE                            You prefer managed UI over terminal  No Claude in the loop
SQL query optimizers (Select.dev, etc.)  You want visual query plans          Separate tool, separate context

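More

Resources

📖 Read the official README on GitHub: https://github.com/AltimateAI/data-engineering-skills

🐙 View open issues: https://github.com/AltimateAI/data-engineering-skills/issues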