Dynamic Workflows in Claude Code: How 1,000 Parallel Subagents Ported Bun From Zig to Rust in 11 Days
Anthropic shipped Dynamic Workflows in research preview on 2026-05-28 alongside Opus 4.8. Bun's Jarred Sumner used it to translate ~1 million lines of Zig into ~750,000 lines of Rust in 11 days, with 99.8% of the test suite still green. This is an honest technical walkthrough of what Dynamic Workflows actually does, why the Bun port is the right benchmark and the wrong one, the costs you'll see on your bill, and where the limits bite in practice.
Published 2026-06-01
TL;DR
- What: Dynamic Workflows is a new Claude Code primitive (research preview, Max/Team/Enterprise) where Claude writes a JavaScript orchestration script that spawns subagents in parallel, runs them in a background runtime, and only surfaces the converged result to your session.
- Scale: Up to 16 concurrent subagents and 1,000 total per run. Runs can extend across hours or days, with checkpointing so an interrupted job resumes.
- Killer demo: Bun ported from Zig to Rust — ~1M LOC source, ~750k LOC Rust output, 99.8% test suite green, 11 days end-to-end. Production codebase, not a toy.
- Where it wins: Codebase-scale migrations, fan-out research, anything where the bottleneck is human serialization of independent subtasks.
- Where it doesn't: Tasks with hard sequential dependencies, work where verification is fuzzy, or projects without a strong test suite as the safety net.
- Cost reality: Your MCP server bills, CI minutes, and API spend all multiply by the fan-out factor. Budget *before* enabling it on a real task.
What Dynamic Workflows actually is
Strip the marketing: a Dynamic Workflow is a JavaScript program that Claude writes for your task. The program contains the loop, the branching, the partial results, and the verification logic. Claude executes the program in a runtime that lives outside your interactive session, spawning subagents as the script calls for them.
The architectural insight is what's new. In previous agentic patterns, the orchestrator *was the agent* — Claude held the plan, the intermediate state, and the verification all in its own context. That model breaks down somewhere around 100k–200k tokens of accumulated state. Dynamic Workflows pushes that state into a script's variables and a checkpoint store; Claude's context only ever holds the current sub-task and the final answer.
The execution loop, end-to-end
1. Plan. You describe the goal. Claude analyzes your repo + the prompt and writes the workflow script. This is a one-shot Claude call, with no subagents yet.
2. Fan out. The script starts. It spawns the first batch of subagents. Each gets its own scoped instructions, its own tool list, its own memory.
3. Adversarial check. This is the part nobody mentions. Other agents try to refute what the first batch found: they argue the opposite and hunt for failure modes. The run keeps iterating until the agents converge instead of disagreeing.
4. Verify. Outputs hit the script's verification gates, typically your existing test suite, but you can wire any callable check.
5. Checkpoint. State is saved continuously. If the run dies (network drop, manual interrupt, runtime restart), it resumes from the last checkpoint, not from zero.
6. Report. Only after verification passes does the converged answer reach your session.
The adversarial-checking step (#3) is the differentiator from older parallel-agent designs. Naive parallel runs give you N independent answers and force you to pick. Dynamic Workflows runs arguments between the agents until either they converge or the script detects unresolvable disagreement and surfaces it.
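The loop above can be sketched in a few lines. This is a minimal illustration in Python, not Anthropic's runtime or its API: the article says real workflow scripts are JavaScript, and `run_subagent`, `review`, `solve`, and `run_workflow` are hypothetical stand-ins invented here. The sketch shows only the shape: a semaphore caps the fan-out width, a reviewer tries to refute each draft, and anything that never converges is surfaced instead of silently accepted.

```python
import asyncio

MAX_CONCURRENT = 16  # concurrency cap quoted in the article


async def run_subagent(task: str) -> str:
    """Hypothetical stand-in for a subagent call; a real one would invoke a model."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"draft for {task}"


async def review(task: str, draft: str) -> bool:
    """Hypothetical adversarial reviewer: True means it failed to refute the draft."""
    await asyncio.sleep(0)
    return draft.endswith(task)


async def solve(task: str, gate: asyncio.Semaphore, max_rounds: int = 3) -> dict:
    """Draft, let a reviewer attack the draft, and retry until accepted or out of rounds."""
    for round_no in range(1, max_rounds + 1):
        async with gate:  # never more than MAX_CONCURRENT tasks in flight at once
            draft = await run_subagent(task)
            accepted = await review(task, draft)
        if accepted:
            return {"task": task, "result": draft, "rounds": round_no, "converged": True}
    # Unresolved disagreement is reported, not hidden.
    return {"task": task, "result": None, "rounds": max_rounds, "converged": False}


async def fan_out(tasks: list[str]) -> list[dict]:
    gate = asyncio.Semaphore(MAX_CONCURRENT)
    return await asyncio.gather(*(solve(t, gate) for t in tasks))


def run_workflow(tasks: list[str]) -> list[dict]:
    return asyncio.run(fan_out(tasks))
```

Calling `run_workflow(["a.zig", "b.zig"])` returns one record per task, in input order. A real script would add the verification gate and checkpointing on top of this skeleton.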
The hard limits: 16 concurrent, 1,000 total
| Limit | Value | Why it matters |
|---|---|---|
| Concurrent subagents | 16 | This is your fan-out width per step. Most work is bottlenecked here, not by the 1000 cap. |
| Total subagents per run | 1000 | Hard ceiling on the entire job. The Bun port reportedly used ~600 over 11 days. |
| Wall-clock | Effectively unbounded — hours to days | Workflows are designed to be long-running and resumable. |
| Checkpointing | Continuous | Interrupted runs pick up where they left off. |
| Plan tier | Max / Team / Enterprise (admin-enabled) | Not available on Pro or free tiers in research preview. |
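A quick calculation shows why the concurrency cap tends to bind before the total cap does. The helper below is an illustrative sketch (the name `plan_waves` is invented here) and assumes every subagent takes about equally long, which real runs will not satisfy; treat the result as a lower bound on sequential "waves".

```python
import math

MAX_CONCURRENT = 16  # per-step fan-out width from the table above
MAX_TOTAL = 1000     # per-run ceiling from the table above


def plan_waves(total_subagents: int, width: int = MAX_CONCURRENT) -> int:
    """Lower bound on sequential waves needed to run `total_subagents` at `width` at a time."""
    if not 1 <= total_subagents <= MAX_TOTAL:
        raise ValueError(f"total_subagents must be between 1 and {MAX_TOTAL}")
    if not 1 <= width <= MAX_CONCURRENT:
        raise ValueError(f"width must be between 1 and {MAX_CONCURRENT}")
    return math.ceil(total_subagents / width)


print(plan_waves(600))      # 38 waves at full width
print(plan_waves(600, 5))   # 120 waves with a conservative 5-subagent cap
```

At the ~600 subagents reported for the Bun port, that is at least 38 back-to-back waves even at full width, so wall-clock time is governed by the width of 16 long before the 1000 ceiling comes into play.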
The Bun port: what actually happened
Bun's Jarred Sumner ran the largest publicly-disclosed Dynamic Workflows job to date. The goal was a clean translation of Bun's Zig codebase (~1M LOC) into Rust, structured as a single PR (#30412), without rewriting features. The pipeline he composed:
- Workflow #1: Map the right Rust lifetime for every struct field in the Zig codebase. (This is the kind of analysis that requires reading thousands of files and reasoning about ownership transitively. Trivially parallel; nightmare for a single agent's context.)
- Workflow #2: Write every `.rs` file as a behavior-identical port of its `.zig` counterpart. Hundreds of agents in parallel, two reviewers on each file.
- Workflow #3: A fix loop that drove the build and test suite until both ran clean.
- Workflow #4 (overnight): Addressed unnecessary data copies and opened a PR for each, with a final human review gate.
Outcome: ~750k LOC of Rust, 99.8% of the existing test suite passing, 11 days from first commit to merge. The Register confirmed the PR merge; Sumner has been open about the workflow shape on X.
Where Dynamic Workflows actually wins
- Codebase-scale migrations with a strong test suite as the verification gate: framework upgrades (React 18→19, Next.js 14→16), language ports, deprecated-API replacements, type-system widenings.
- Fan-out research: 'For each of these 200 customers, summarize their last 90 days of tickets and flag risk signals.' Embarrassingly parallel + lots of independent intermediate state.
- Catalog/asset generation: Writing 500 lint rules, 200 prompt variants, 100 i18n translations — anything that benefits from independent attempts plus an adversarial reviewer.
- Long-horizon refactors that don't fit one session: the checkpoint+resume property changes the math on jobs that previously had to be split into PRs.
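The checkpoint-and-resume property is easy to picture with a small sketch. The code below is an assumption-laden illustration, not how the Dynamic Workflows runtime stores state: it keeps finished results in a JSON file (a hypothetical format chosen here), writes that file atomically, and skips already-finished tasks on restart.

```python
import json
import os
import tempfile
from typing import Callable


def load_checkpoint(path: str) -> dict:
    """Return previously completed results, or an empty dict on a fresh run."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    return {}


def save_checkpoint(path: str, done: dict) -> None:
    """Write to a temp file, then swap it in, so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(done, fh)
    os.replace(tmp, path)  # atomic rename on the same filesystem


def run_resumable(tasks: list[str], work: Callable[[str], str], path: str) -> dict:
    """Run `work` on each task, persisting after every success and skipping finished ones."""
    done = load_checkpoint(path)
    for task in tasks:
        if task in done:
            continue  # finished before the interruption
        done[task] = work(task)
        save_checkpoint(path, done)
    return done
```

If `work` raises halfway through, rerunning `run_resumable` with the same `path` repeats only the unfinished tasks. That is the property that lets a multi-day job survive a restart.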
Where it visibly doesn't win (yet)
- Sequentially dependent work with little parallel surface — e.g., writing a single coherent design doc, single-file deep debugging, anything where step N strictly requires step N-1's answer.
- Tasks without a verification function. If your check at step #4 is 'looks good to me', Dynamic Workflows degrades to a very expensive single-agent run. The script needs a real gate.
- Repos with poor test coverage. The test suite *is* the safety net. Without one, the fix loop has nothing to converge against and the workflow will happily ship plausibly-broken code.
- Bills you haven't budgeted for. A 600-agent run with 16-way concurrency is not a $5 query. See the next section.
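A "real gate" can be as plain as a command whose exit code decides pass or fail. The sketch below is one assumed way to wire that up in Python with the standard library; `passes_gate` is a name invented for this example, and the article does not describe how the actual runtime invokes checks.

```python
import subprocess
import sys


def passes_gate(command: list[str], timeout_s: float = 600.0) -> bool:
    """Return True only if the check command exits with status 0 within the timeout."""
    try:
        completed = subprocess.run(command, capture_output=True, timeout=timeout_s)
    except (subprocess.TimeoutExpired, OSError):
        # A hung or missing checker counts as a failure, never as a pass.
        return False
    return completed.returncode == 0


if __name__ == "__main__":
    # Example: gate on the project's own test suite (assumes pytest is installed).
    ok = passes_gate([sys.executable, "-m", "pytest", "-q"])
    print("gate passed" if ok else "gate failed")
```

The design choice that matters is failing closed: timeouts and missing tools return `False`, so a broken checker cannot wave code through.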
What this actually costs (and what spikes your bill)
Pricing remains the standard Opus 4.8 rate per subagent. The cost surprise isn't the per-call cost; it's the multiplier. A naive 1-hour session uses one Claude context's worth of tokens. A 600-agent dynamic workflow uses ~600 of them, often with the same context replicated across subagents.
- Token spend scales roughly with `(# subagents) × (average context per subagent)`. For migration-shaped work that's typically 10-50× a normal interactive session.
- MCP server bills multiply by the fan-out factor too. If you have a paid MCP (Stripe MCP, Sentry MCP, Brave Search MCP), a 100-agent run = 100× the calls. Audit your rate limits before enabling.
- CI minutes go up — every fix loop iteration runs your test suite. A long-running workflow may run CI hundreds of times.
- Fast mode at $10/$50 per 1M tokens (Opus 4.8) is the right default for most subagent work. Reach for standard or `xhigh` effort only on the orchestrator and the verifier.
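To turn the multiplier into a number before a run, a back-of-the-envelope estimator is enough. The sketch below rests on two assumptions: it reads the quoted "$10/$50" as input/output prices per 1M tokens, and the per-subagent token counts in the example are invented for illustration, not measured figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    input_per_mtok: float   # dollars per 1M input tokens
    output_per_mtok: float  # dollars per 1M output tokens


# Assumption: "$10/$50 per 1M tokens" means $10 input and $50 output.
FAST_MODE = Rates(input_per_mtok=10.0, output_per_mtok=50.0)


def estimate_cost(
    subagents: int,
    avg_input_tokens: int,
    avg_output_tokens: int,
    rates: Rates = FAST_MODE,
) -> float:
    """Rough spend in dollars: (# subagents) x (average tokens per subagent) x rate."""
    input_cost = subagents * avg_input_tokens / 1_000_000 * rates.input_per_mtok
    output_cost = subagents * avg_output_tokens / 1_000_000 * rates.output_per_mtok
    return round(input_cost + output_cost, 2)


# Hypothetical workload: 200k input and 20k output tokens per subagent.
print(estimate_cost(1, 200_000, 20_000))    # 3.0
print(estimate_cost(600, 200_000, 20_000))  # 1800.0
```

The estimate ignores caching, retries, and adversarial-review rounds, so treat it as a floor and put the ceiling in your budget alert.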
What this means for the MCP ecosystem
Three concrete shifts:
- Rate-limiting in MCP servers stops being optional. Before Dynamic Workflows, a single Claude Code session called your server ~1 request at a time. Now 16 concurrent subagents can hammer it for an hour. Server authors: add backoff, retries, and per-tenant quotas. Yesterday.
- Idempotency matters. Subagents will sometimes retry. If your MCP's 'create issue' tool isn't idempotent, you can end up with as many as 16 duplicate tickets per fix-loop iteration.
- Errors propagate, but only with Opus 4.8. Opus 4.8's honesty improvements mean failed MCP calls surface as real failures instead of being papered over. Pair Dynamic Workflows with anything older and you'll see plausibly-wrong outputs that took 600× the budget to produce.
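For server authors, the first two points fit in a short sketch. Everything named below (`TokenBucket`, `IssueTool`, `create_issue`, the idempotency-key format) is hypothetical and not tied to any real MCP SDK. It shows a per-tenant token bucket in front of a 'create issue' tool that dedupes on a caller-supplied key.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class IssueTool:
    """Hypothetical 'create issue' tool with per-tenant quotas and idempotency keys."""

    def __init__(self, bucket_factory: Callable[[], TokenBucket]):
        self._bucket_factory = bucket_factory
        self._buckets: dict[str, TokenBucket] = {}
        self._issues: dict[tuple[str, str], int] = {}
        self._next_id = 1

    def create_issue(self, tenant: str, idempotency_key: str, title: str) -> dict:
        bucket = self._buckets.get(tenant)
        if bucket is None:
            bucket = self._buckets[tenant] = self._bucket_factory()
        if not bucket.allow():
            return {"status": "rate_limited"}  # caller should back off and retry
        key = (tenant, idempotency_key)
        if key in self._issues:
            # A retry of the same logical request returns the original issue.
            return {"status": "ok", "issue_id": self._issues[key], "created": False}
        issue_id, self._next_id = self._next_id, self._next_id + 1
        self._issues[key] = issue_id
        return {"status": "ok", "issue_id": issue_id, "created": True}
```

On the client side, a subagent would derive the key from the thing being reported, for example the failing test's name, so that sixteen agents hitting the same failure collapse into one ticket.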
Compared to existing patterns
| Approach | Concurrency | State management | Best for |
|---|---|---|---|
| Single Claude Code session | 1 | All in agent context | Most interactive coding, debugging, design work. |
| Claude Code Tasks (sub-tasks) | Sequential | Agent context + task tree | Multi-step tasks within one session. |
| Manual fan-out (you script it) | Whatever you write | You manage it | When you already know the parallel shape and want full control. |
| Dynamic Workflows | 16 concurrent / 1000 total | Externalized script + checkpointer | Codebase-scale, long-running, parallel-shaped work with a verification gate. |
| LangGraph / CrewAI style frameworks | Whatever you wire | Manual + framework primitives | Custom multi-agent topologies, non-Claude clients, research. |
What to actually do with it this week
- Read Anthropic's docs at code.claude.com/docs/en/workflows. The doc has the runtime semantics that the blog post elides.
- Pick a real but bounded task with a test suite. Good candidates: a deprecated-API migration, a config-format upgrade, a docs-rewrite across 50+ files.
- Start with a 5-subagent cap. Trust nothing until you see your own numbers.
- Pin your MCP versions. Dynamic Workflows + auto-updating MCPs is the worst-case scenario for the supply chain risks discussed in 'Nx Console MCP Token Theft'.
- Watch your test suite's flakiness carefully — Dynamic Workflows surfaces every flake by running tests dozens of times. This is a feature; act on what it reveals.
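Acting on flakiness starts with separating flaky tests from real failures. Here is a minimal sketch, assuming you can export each test run as a mapping from test name to pass/fail and that all runs were against the same code. The function name and input format are invented for this example.

```python
from collections import defaultdict


def classify_tests(runs: list[dict[str, bool]]) -> dict[str, list[str]]:
    """Group tests by behavior across repeated runs of the same code.

    Each run maps test name -> passed. A test that both passed and failed
    without any code change is flaky, not broken.
    """
    outcomes: dict[str, set[bool]] = defaultdict(set)
    for run in runs:
        for name, passed in run.items():
            outcomes[name].add(passed)

    report: dict[str, list[str]] = {"stable_pass": [], "stable_fail": [], "flaky": []}
    for name in sorted(outcomes):
        seen = outcomes[name]
        if seen == {True}:
            report["stable_pass"].append(name)
        elif seen == {False}:
            report["stable_fail"].append(name)
        else:
            report["flaky"].append(name)
    return report


runs = [
    {"test_parse": True, "test_net": True, "test_fmt": False},
    {"test_parse": True, "test_net": False, "test_fmt": False},
    {"test_parse": True, "test_net": True, "test_fmt": False},
]
print(classify_tests(runs))
# {'stable_pass': ['test_parse'], 'stable_fail': ['test_fmt'], 'flaky': ['test_net']}
```

Quarantining the `flaky` bucket before a long run keeps the fix loop from chasing failures that no code change can resolve.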
Bottom line
Dynamic Workflows is the first agentic primitive that maps cleanly onto a class of engineering problems teams genuinely have — codebase-scale migrations and fan-out research. The Bun port is a real proof point; it is also a best-case shape (parallel + clean tests + Sumner driving). The defensive moves are pragmatic: pin versions, scope tokens, budget the fan-out before you commit. The strategic move is to start identifying the work in your stack that *was* too big and is now sized for one engineer plus a workflow.