Once your team gets past the first month of working with Claude, two things start happening at the same time. Useful work gets bigger. And the main Claude session that's doing the work starts to feel like one engineer trying to hold an entire project in their head. Refactors that touch fifteen files. Reviews that should run in parallel. Research that wants three competing hypotheses tested at once. A single context window starts to be the bottleneck, the same way a single engineer would be.
Anthropic's answer is delegation. Inside Claude Code, the platform has two distinct ways to spin up additional Claude instances and put them to work: sub-agents, which have been generally available since 2025, and Agent Teams, which launched in research preview in February 2026. Both let a Claude session offload work to other Claude instances. They are not the same thing. They use different coordination patterns, fit different kinds of work, and come with substantially different cost and complexity profiles.
This guide covers what sub-agents and Agent Teams actually are, the architectural difference that matters most (it's about topology, not file format), when to reach for which, and the governance patterns that keep delegation safe rather than chaotic. We also draw a line from these primitives to the AI-native team pattern HatchWorks has been writing about for two years, because Agent Teams is the first time the platform has given that pattern a name.
In this guide
- What sub-agents and Agent Teams actually are
- Sub-agents: hierarchical delegation
- Agent Teams: peer collaboration
- Sub-agents vs Agent Teams: the question that matters
- When to use which (and when to use neither)
- The patterns that keep delegation governed
- Why this looks a lot like a GenDD Pod
- HatchWorks AI
What sub-agents and Agent Teams actually are
A sub-agent is a separate Claude instance that the main session spawns, hands a focused task to, and gets a summary back from. The sub-agent runs in its own isolated context window, with its own system prompt and its own restricted set of tools, and never sees or talks to other sub-agents. It does the work, returns a result, and disappears.
An Agent Team is a group of Claude instances running in parallel, sharing a task list on disk, claiming work from that list as they finish previous tasks, and writing their outputs back to the shared state so other teammates can read them. There is no central orchestrator routing each task in real time. The shared file is the coordination layer.
Both are configured the same way at the file level. Each lives in .claude/agents/ as a markdown file: YAML frontmatter defines a name, a description, and a permitted tool list, and the body below it is the system prompt. The distinction between a sub-agent and a teammate is behavioral, not syntactic. The same code-reviewer.md can be invoked as a one-off sub-agent in one session and as a teammate inside an Agent Team in another. The difference is the topology you run it under.
A minimal agent file looks like this:
---
name: code-reviewer
description: Reviews recent code changes against our
  conventions and security checklist. Use after
  the user runs tests or finishes a feature.
tools: [Read, Grep, Glob, Bash(git diff:*)]
---
You are a senior code reviewer focused on this codebase.
When invoked, run git diff to see the recent changes,
then review them against the rules in references/style.md
and references/security.md.
Report findings in the following format:
- Critical (blocks merge)
- Important (should fix)
- Suggestion (optional)
Each finding includes the file, the line, and a one-sentence rationale.
The same file can be the top-level agent invoked by name ("use code-reviewer to audit the recent PR"), or it can be a sub-agent that an orchestrator spawns as one step in a larger pipeline, or it can be a teammate inside an Agent Team coordinating with other teammates on a sprint. The architecture decision sits one layer up from the file.
The architectural difference that matters
The single most important thing to understand about these two patterns is the topology. Sub-agents are hierarchical: one orchestrator at the top, sub-agents at the bottom, work flowing down and results flowing up, no lateral communication. Agent Teams are peer-to-peer: no orchestrator, teammates communicating through a shared file, work distributed by self-selection rather than assignment.
That topology decision drives everything else: when each pattern fits, how much it costs in tokens, how it fails when it fails, and how you keep it governed. The comparison below summarizes the structural difference.
Topology of delegation: hierarchical vs peer
- Hierarchical (sub-agents): orchestrator-led, no lateral communication.
- Peer (Agent Teams): self-distributed, file-based coordination.
Sub-agents: hierarchical delegation
Sub-agents are the simpler primitive and the one Anthropic recommends starting with. They became generally available in 2025 and have a clean mental model: the main Claude session is the orchestrator. When the orchestrator decides a piece of work is worth delegating, it spawns a sub-agent with a defined system prompt, a specific tool list, and the input the sub-agent needs to do its job. The sub-agent runs in its own context, does the work, and returns a summary. The orchestrator never sees the sub-agent's intermediate steps, only the result.
Two reasons to use a sub-agent
Both of the legitimate reasons to delegate to a sub-agent come back to context management. The first is context isolation: there's a piece of work whose intermediate steps would clutter the main conversation, even though the final result is small. A code review that examines fifty files but returns three findings is the canonical case. The orchestrator doesn't need to see all fifty files; it only needs the three findings.
The second is scoped permission. Each sub-agent declares the tools it's allowed to use, and Anthropic's documentation explicitly recommends the minimum-permission pattern: a sub-agent that needs to read files but not write them gets only read tools; a sub-agent that needs to run a test suite gets only the test command, not arbitrary bash. This limits the blast radius if the sub-agent goes off-script. In practice, it also makes sub-agents more reliable, because narrower tool access produces narrower behavior.
How to invoke them
In principle, Claude is supposed to recognize when a task matches an available sub-agent's description and route to it automatically. In practice, auto-selection is unreliable. Claude frequently handles tasks in the main session even when a sub-agent's description matches the work cleanly. Anthropic has acknowledged this as a known gap. The only reliable way to invoke a sub-agent is explicit: "use code-reviewer to audit the recent PR."
This affects how you design sub-agent workflows. If you're counting on Claude to automatically reach for the right sub-agent at the right time, you'll be disappointed often enough to lose trust in the system. The practical pattern is to use sub-agents as named tools that the main session explicitly calls, rather than as autonomous specialists that wake up on their own. That makes sub-agents more like a library of well-defined functions than like a team of automatic delegates.
The over-delegation problem
The opposite failure mode is worth flagging too. Claude Opus 4.6 has a known tendency to over-delegate, spawning sub-agents for work where a direct approach in the main session would be faster and cheaper. Anthropic's own prompt-engineering guidance flags this. Sub-agents have meaningful overhead: each one is a full Claude session with its own startup cost and context window. Spawning one to answer a question the main session could have answered directly is pure waste.
The signal that you're over-delegating is usually obvious in retrospect: the sub-agent returns a result that the orchestrator has to do non-trivial work to use. If the orchestrator could have produced the same result in fewer turns by just doing the work, the sub-agent was the wrong call. Save sub-agents for work that's genuinely off-thread.
Agent Teams: peer collaboration
Agent Teams launched in research preview in February 2026. The mental model is genuinely different from sub-agents, and the differences are where most of the confusion lives. Where sub-agents are a hierarchy with one Claude at the top, Agent Teams is a flat group of Claude instances coordinating through shared state. There is no orchestrator routing each task in real time. The teammates self-distribute.
The shared task list is the coordination layer
The mechanism that makes Agent Teams work is simpler than it sounds. When a team starts, Claude generates an initial task list, typically as a file on disk. Each teammate reads the list, finds work that isn't yet claimed, marks it as in-progress, and starts. When the work is done, the teammate writes the result back to the file (or to an output specified by the task) and looks for the next available task. Dependencies are encoded in the task list itself: a task that depends on another won't be claimed until its prerequisite is marked complete.
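The claim-and-complete loop described above can be sketched in a few lines. This is an illustration only: the Task fields, the status strings, and the claim_next and complete helpers are hypothetical names, not the format Claude actually writes to disk. The list also lives in memory here, so it sidesteps the locking a real shared file would need.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    id: str
    status: str = "todo"  # hypothetical states: "todo", "in_progress", "done"
    owner: Optional[str] = None
    depends_on: List[str] = field(default_factory=list)


def claim_next(tasks: List[Task], teammate: str) -> Optional[Task]:
    """Claim the first unclaimed task whose prerequisites are all done."""
    done = {t.id for t in tasks if t.status == "done"}
    for task in tasks:
        if task.status == "todo" and all(dep in done for dep in task.depends_on):
            task.status = "in_progress"
            task.owner = teammate
            return task
    return None  # nothing is claimable right now


def complete(task: Task) -> None:
    """Mark a task done so that tasks depending on it become claimable."""
    task.status = "done"


tasks = [Task("schema"), Task("api", depends_on=["schema"]), Task("docs")]

first = claim_next(tasks, "teammate-a")   # claims "schema"
second = claim_next(tasks, "teammate-b")  # "api" is blocked, so claims "docs"
complete(first)
third = claim_next(tasks, "teammate-b")   # "api" is unblocked now
```

The dependency check is the only coordination rule: a teammate never needs to ask anyone whether "api" is ready, because the list itself answers that.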
The closest human analogy is a Kanban board with no project manager. The board is visible to everyone. Everyone self-assigns. The rules are simple enough that the system doesn't need active coordination: just write the rules clearly, and the work distributes itself.
This is the architectural opposite of sub-agents. In the sub-agent model, the orchestrator has full visibility and decides what each sub-agent works on next. In Agent Teams, no single agent has full visibility, but the file system does, and the file system is what holds the team together.
What it's actually good for
Anthropic's docs name four use cases as the strongest fits:
- Research and review. Several teammates investigate different aspects of a problem in parallel, then read each other's findings and challenge or build on them.
- New modules or features with independent scopes. Each teammate owns a piece (frontend, backend, tests, docs) without stepping on the others.
- Debugging with competing hypotheses. Two or three teammates test different theories in parallel, then converge on which one was right.
- Cross-layer coordination. Changes that span frontend, backend, and tests, each owned by a different teammate but visible to all of them through the shared state.
The thread running through all four is that teammates benefit from seeing each other's work. If they don't, you don't need an Agent Team; you need sub-agents, or you need a single session.
The cost is steep
Agent Teams uses roughly 15x the tokens of a single-agent session, per Anthropic's documentation, versus ~4-7x for sub-agents. The coordination overhead is real: teammates re-read parts of the shared state, communicate findings, and occasionally redo work that another teammate already touched. The 15x multiplier is the price of peer coordination, and it's worth paying only when the work genuinely benefits from it. For sequential tasks, same-file edits, or work that's mostly independent and doesn't need cross-teammate visibility, a single session or sub-agents will be faster and cheaper.
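To make the multipliers concrete, here is a small back-of-the-envelope calculation. The multipliers are the ones quoted above. The 200,000-token baseline is an invented figure for illustration, and estimate_tokens is a hypothetical helper, not part of any Anthropic tooling.

```python
from typing import Dict, Tuple

# Multipliers quoted in the text, relative to a single-agent session.
SUB_AGENT_MULTIPLIER = (4, 7)
AGENT_TEAM_MULTIPLIER = 15


def estimate_tokens(baseline_tokens: int) -> Dict[str, Tuple[int, int]]:
    """Return a (low, high) token estimate for each delegation pattern."""
    low, high = SUB_AGENT_MULTIPLIER
    team = baseline_tokens * AGENT_TEAM_MULTIPLIER
    return {
        "single_session": (baseline_tokens, baseline_tokens),
        "sub_agents": (baseline_tokens * low, baseline_tokens * high),
        "agent_team": (team, team),
    }


# Assumed baseline: a task that costs 200,000 tokens in a single session.
estimate = estimate_tokens(200_000)
for pattern, (low, high) in estimate.items():
    print(f"{pattern}: {low:,} to {high:,} tokens")
```

Under that assumed baseline, sub-agents land between 800,000 and 1,400,000 tokens and an Agent Team at about 3,000,000, which is why the pattern has to earn its place.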
How to enable it
Agent Teams is disabled by default. To turn it on, set the environment variable CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in your shell or in your project's .claude/settings.json. Once enabled, you describe the team you want in natural language and Claude creates the structure: "create an agent team to research these three competing migration strategies, with a teammate per strategy, and have them compare findings." Claude spawns the teammates, writes the initial task list, and gets out of the way.
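If you would rather script the settings change than edit the file by hand, a sketch like the following would do it. It assumes that .claude/settings.json accepts a top-level "env" object mapping variable names to string values; check the current Claude Code settings documentation before relying on that. The enable_agent_teams helper is a hypothetical name, and the demo writes to a throwaway directory rather than a real project.

```python
import json
import tempfile
from pathlib import Path

FLAG = "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"


def enable_agent_teams(project_root: Path) -> dict:
    """Set the flag under an assumed "env" key, keeping any other settings."""
    settings_path = project_root / ".claude" / "settings.json"
    settings = json.loads(settings_path.read_text()) if settings_path.exists() else {}
    settings.setdefault("env", {})[FLAG] = "1"
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings


# Demo against a temporary directory, not a real repository.
with tempfile.TemporaryDirectory() as tmp:
    result = enable_agent_teams(Path(tmp))
    print(json.dumps(result, indent=2))
```

Because the function merges into whatever is already in the file, running it twice leaves the settings unchanged.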
Three teammates is a reasonable starting count for about 15 tasks' worth of work, per the Anthropic guidance. Scale up only when there's clear evidence that adding teammates is reducing wall-clock time, not just multiplying token cost.
Sub-agents vs Agent Teams
Two delegation patterns, compared
- Pattern 1, sub-agents (hierarchical, orchestrator-led): A main Claude session delegates tasks to specialized sub-agents, each with its own context and tool list. Sub-agents never talk to each other.
- Pattern 2, Agent Teams (peer-to-peer, self-distributed): Multiple Claude instances share a task list on disk, claim work as they finish previous tasks, and reference each other's findings through file state.
Dimensions compared: coordination, token cost, availability, best fit, common failure.
When to use which (and when to use neither)
The question that comes up over and over with multi-agent setups is "should this be a single session, sub-agents, or an Agent Team?" The right answer is usually clearer than people expect, but only if you separate two questions that often get conflated: does the work parallelize, and do the workers need to see each other.
Start with: does this even need to be multi-agent?
Most work doesn't. A single Claude session has roughly the same effective horsepower as one focused engineer with good context. The cases where multi-agent setups beat a single session are real but specific: the work has parts that genuinely run independently, or the work has steps whose intermediate output would clutter the main thread.
If neither of those applies, stay in a single session. The token cost of multi-agent setups is non-trivial (4-7x for sub-agents, 15x for Agent Teams), and the failure modes are real. Anthropic's own guidance recommends starting with the simpler primitive and upgrading only when there's a specific reason. That's not conservatism; it's good architecture.
If you do need multi-agent, decide on topology
Once you've decided multi-agent is the right call, the question becomes hierarchical or peer. The cleanest discriminator is whether workers need to see each other's output during the work, not just after.
- No, workers are independent. Each worker has a self-contained job. The orchestrator collects their outputs at the end and synthesizes. This is sub-agents.
- Yes, workers benefit from each other's findings. One worker's discovery might change another worker's approach. Workers need to read each other's progress while they're still working. This is Agent Teams.
Examples make this concrete. A code review against fifty files: each file can be reviewed independently. The findings get rolled up at the end. Sub-agents fit. A research project comparing three migration strategies: each researcher might surface a finding that affects how the other researchers frame their analyses. Agent Teams fit.
The hybrid pattern is real
In practice, the most useful multi-agent setups combine both. An orchestrator at the top decomposes the work, then spawns an Agent Team for the phase of the project where peer coordination matters, then takes the team's synthesized output and continues. This is the pattern Anthropic's docs describe for complex workflows: sub-agent control at the top level, Agent Team coordination for the parallel parts inside.
The decision helper below maps the most common shapes of work to the right delegation pattern.
Should you delegate? Three questions, one recommendation
Single session, sub-agents, Agent Team, or hybrid. The answer depends on the shape of the work.
- Question 1: Are there parts of this work that can genuinely run in parallel? Yes: multiple distinct pieces can be worked on at the same time. No: the work is mostly sequential or single-threaded.
- Question 2: Do the workers need to see each other's findings during the work? Yes: one worker's discovery might change another worker's approach. No: each worker is fully independent; the orchestrator can synthesize at the end.
- Question 3: Does the project have distinct phases (e.g., planning, parallel execution, integration)? Yes: some parts benefit from coordination, others can run flat. No: the whole thing has a single shape.
The choice depends on whether the work parallelizes, whether workers benefit from seeing each other, and whether the project has distinct phases.
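The three questions above can be read as a small decision function. The mapping below is inferred from the surrounding text rather than taken from the original interactive helper, so treat the branch order as an assumption. In particular, this sketch sends parallel, independent work to sub-agents regardless of phases. The recommend function and its parameter names are hypothetical.

```python
def recommend(parallel: bool, shared_findings: bool, distinct_phases: bool) -> str:
    """Map the three answers to a delegation pattern (inferred mapping)."""
    if not parallel:
        # Sequential or single-threaded work gains nothing from extra agents.
        return "single session"
    if not shared_findings:
        # Independent workers; the orchestrator synthesizes at the end.
        return "sub-agents"
    if distinct_phases:
        # Only some phases need peer coordination.
        return "hybrid"
    return "Agent Team"


# Reviewing fifty files independently: parallel, no shared findings.
print(recommend(parallel=True, shared_findings=False, distinct_phases=False))
# Comparing migration strategies where findings affect each other.
print(recommend(parallel=True, shared_findings=True, distinct_phases=False))
```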
The patterns that keep delegation governed
Multi-agent setups expand what Claude can do, but they also expand the surface area where things can go wrong. The patterns below are what separate delegation that works in production from delegation that creates more problems than it solves. None of these are speculative; they're what experienced teams using sub-agents and Agent Teams have converged on, and they apply to both patterns.
Minimum-permission tools
Every sub-agent and teammate file declares its tool list. Default to the smallest set that lets the work happen. A code reviewer needs read access and the ability to run git diff. It does not need write access, shell, or web fetch. A test runner needs the test command. It does not need the ability to modify source files. The discipline of stating exactly what each role can do, in a file in the repo, is what makes multi-agent setups auditable.
The auxiliary benefit is reliability. A sub-agent with fewer tools makes fewer decisions, and fewer decisions means more predictable output. The instinct to over-grant permissions "just in case" is the same instinct that leaves junior engineers with ambiguous roles. Be specific about the role.
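Because the tool list lives in a file in the repo, it can be checked mechanically. The sketch below parses the bracketed tools: line used in the example agent file earlier and flags anything outside a per-role allowlist. The allowlist, the parse_tools and excess_tools helpers, and the sample file are all hypothetical, and the simple comma split would break on a tool pattern that itself contains a comma.

```python
import re
from typing import List, Set


def parse_tools(agent_file_text: str) -> List[str]:
    """Extract the entries of a `tools: [...]` frontmatter line."""
    match = re.search(r"^tools:\s*\[(.*)\]\s*$", agent_file_text, flags=re.MULTILINE)
    if not match:
        return []
    return [tool.strip() for tool in match.group(1).split(",") if tool.strip()]


def excess_tools(declared: List[str], allowed: Set[str]) -> List[str]:
    """Return declared tools that the role's allowlist does not cover."""
    return [tool for tool in declared if tool not in allowed]


# Hypothetical allowlist for a read-only reviewer role.
REVIEWER_ALLOWED = {"Read", "Grep", "Glob", "Bash(git diff:*)"}

# Sample agent file that has quietly picked up write access.
AGENT_FILE = """---
name: code-reviewer
tools: [Read, Grep, Glob, Bash(git diff:*), Write]
---
You are a senior code reviewer focused on this codebase.
"""

declared = parse_tools(AGENT_FILE)
violations = excess_tools(declared, REVIEWER_ALLOWED)
print(violations)
```

A check like this could run in CI so that a widened tool list shows up in review instead of in production.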
Validation gates that humans (or scripts) run
Delegation works when there's a clear gate at the end where the result gets checked before it's accepted. In sub-agent setups, the orchestrator does the checking by design (it sees the result, decides whether to act on it, can ask for revisions). In Agent Teams, the gate has to be explicit, because no single agent sees the whole picture: a final synthesis step, a dedicated validator teammate, or a human review pass before the team's output gets merged.
The pattern that works best is "validator separate from worker." A worker teammate that's also checking its own work has a quality ceiling. A separate validator (a Claude instance, a script, or a human) catches what the worker missed. Skills that include validation scripts (per the Skills article in this series) are exactly the right tool for this; the validator teammate loads the relevant Skill and runs its checks.
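The "validator separate from worker" idea can be sketched as a gate that runs independent checks on a worker's output before it is accepted. Everything here is illustrative: the two checks are deliberately trivial stand-ins, and the gate function is a hypothetical name. A real validator might be a test suite, a Skill's validation script, or a human review.

```python
from typing import Callable, List, Tuple

# A check takes the worker's output and returns a list of problems found.
Check = Callable[[str], List[str]]


def not_empty(output: str) -> List[str]:
    return [] if output.strip() else ["output is empty"]


def no_todo_markers(output: str) -> List[str]:
    return ["contains TODO marker"] if "TODO" in output else []


def gate(output: str, checks: List[Check]) -> Tuple[bool, List[str]]:
    """Accept the output only if every independent check passes."""
    problems = [problem for check in checks for problem in check(output)]
    return (not problems, problems)


CHECKS = [not_empty, no_todo_markers]

accepted, problems = gate("def handler(): pass  # TODO: add auth", CHECKS)
print(accepted, problems)
```

The point of the structure is that the worker never runs these checks on itself; the gate sits between the worker's output and the merge.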
Structured output that's easy to consume
Sub-agents and teammates that return free-form prose are harder to use than ones that return structured output. A code reviewer that returns "the auth module looks fine but I noticed some potential issues in the API layer" requires the orchestrator to do interpretation work. A code reviewer that returns a JSON list of findings, each with file, line, severity, and rationale, is mechanically consumable.
Define the output format in the sub-agent's system prompt and stick to it. Both Anthropic's docs and the practitioner write-ups converge on this: structured output is one of the highest-leverage things you can do for delegation reliability.
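As a sketch of what "mechanically consumable" means, the snippet below parses a reviewer's JSON findings into typed records and rejects anything malformed. The field names follow the file, line, severity, and rationale fields mentioned above, and the severity values mirror the Critical, Important, and Suggestion levels from the example agent file. The Finding class, the parse_findings helper, and the sample payload are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import List

SEVERITIES = {"critical", "important", "suggestion"}


@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    severity: str
    rationale: str


def parse_findings(raw: str) -> List[Finding]:
    """Parse a JSON list of findings, failing loudly on malformed entries."""
    findings = []
    for item in json.loads(raw):
        finding = Finding(**item)  # raises TypeError on missing or extra keys
        if finding.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {finding.severity!r}")
        findings.append(finding)
    return findings


# Invented sample of what a reviewer sub-agent might return.
raw_output = json.dumps([
    {"file": "api/auth.py", "line": 42, "severity": "critical",
     "rationale": "Token compared with == instead of a constant-time check."},
    {"file": "api/users.py", "line": 7, "severity": "suggestion",
     "rationale": "Helper duplicates logic in api/common.py."},
])

findings = parse_findings(raw_output)
blocking = [f for f in findings if f.severity == "critical"]
print(f"{len(findings)} findings, {len(blocking)} blocking")
```

With this shape, "does anything block the merge?" becomes a filter rather than an interpretation task.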
Obstacle reporting
Sub-agents that get stuck have two reasonable behaviors. They can spend turns trying to work around the obstacle, burning tokens and possibly making things worse. Or they can return with a structured "I'm blocked, here's why" message that lets the orchestrator decide what to do.
The second behavior is what you want, and it has to be designed for. Tell the sub-agent in its system prompt: if you encounter an obstacle that prevents you from completing the task, stop and report it; don't attempt fixes outside your declared scope. This is how you keep a sub-agent that's supposed to review code from accidentally trying to rewrite it.
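On the orchestrator side, a blocked report only helps if it arrives in a shape that can be routed. A minimal sketch, assuming the sub-agent has been told to reply with JSON carrying a status field and, when blocked, a reason. The schema, the route_result helper, and the sample reply are hypothetical.

```python
import json


def route_result(raw: str) -> str:
    """Decide what the orchestrator does with a sub-agent's JSON reply."""
    reply = json.loads(raw)
    if reply["status"] == "blocked":
        # Surface the obstacle instead of letting the sub-agent improvise a fix.
        return f"escalate to human: {reply['reason']}"
    return "accept result"


# Invented example of a sub-agent stopping at an obstacle.
blocked_reply = json.dumps({
    "status": "blocked",
    "reason": "references/security.md not found",
    "attempted": ["Read references/security.md"],
})

decision = route_result(blocked_reply)
print(decision)
```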
Three common delegation patterns
There are three patterns that show up over and over in multi-agent work, and they map cleanly to different shapes of project. The summary below names them.
Three delegation patterns: Map-Reduce, Scrum, and the hybrid
- Pattern A, Map-Reduce: sub-agents, independent work, results rolled up.
- Pattern B, Scrum: Agent Team, peer coordination, shared task list.
- Pattern C, Hybrid: orchestrator at top, Agent Team inside.
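The Map-Reduce shape is the easiest of the three to show in code. In the sketch below, review_file is a stand-in for one sub-agent run and returns canned findings; a real version would invoke Claude, which this example deliberately does not attempt. The function names and file paths are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def review_file(path: str) -> List[Dict[str, str]]:
    """Stand-in for one sub-agent reviewing one file in isolation."""
    if path.endswith("auth.py"):
        return [{"file": path, "severity": "critical"}]
    return []


def map_reduce_review(paths: List[str]) -> List[Dict[str, str]]:
    """Fan files out to independent workers, then roll the findings up."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Map: each worker sees only its own file, never another worker's output.
        per_file = list(pool.map(review_file, paths))
    # Reduce: the orchestrator flattens the per-file results into one list.
    return [finding for findings in per_file for finding in findings]


paths = ["api/auth.py", "api/users.py", "web/app.py"]
rolled_up = map_reduce_review(paths)
print(rolled_up)
```

Nothing in the map step reads another worker's result, which is exactly the property that makes sub-agents, rather than an Agent Team, the right fit for this shape.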
Why this looks a lot like a GenDD Pod
If you've been reading about AI-native engineering practice, the patterns above will feel familiar. They map almost one-to-one onto a structure HatchWorks has been working with for two years under the name Generative-Driven Development. We call the unit of coordinated AI work a GenDD Pod: a small group of AI workers operating under explicit human oversight, with defined roles, scoped permissions, and structured handoffs. The shape is identical to what Agent Teams now describes at the platform level.
A few specific places where the bridge shows up:
- Roles before tools. GenDD Pods define each worker's role first, then assign the minimum tool set that role needs. Sub-agent and teammate files do exactly this with the tools: declaration. Naming the role and scoping its permissions is the same discipline either way.
- Shared context as a file, not a meeting. GenDD Pods coordinate through artifacts in the repo: Context Packs, status files, decision logs. Agent Teams' shared task list is the same pattern. The file system is more reliable than ad-hoc coordination, for AI workers and humans alike.
- Human checkpoints at the boundaries. The Three-Tier Human/AI Boundary Model that HatchWorks uses to design Pods names exactly where humans review work and where AI runs autonomously. Multi-agent setups need the same map. Without explicit validation gates, delegation degrades into things-happening-on-their-own, which is not what you want at production scale.
- Repeatable structure over one-off coordination. Both Pods and Agent Teams gain their leverage from being reusable patterns, not bespoke setups. The pattern that works on this project should work on the next one with minor adjustments. That's how methodology scales beyond the people who invented it.
What's new in 2026 is that the platform now provides the primitives. Anthropic has done the hard part of making peer coordination work between Claude instances. The work that remains is the methodology layer: who decides what gets delegated, what governance applies, how outputs get validated, where the human stays in the loop. That layer doesn't come from the platform. It comes from how your team chooses to use the platform, and the structure you build around delegation before you trust it with real work.
The headline question, for teams thinking about how to roll out sub-agents and Agent Teams safely, is not "which delegation pattern should we use?" It's "what's our governance pattern, and how does delegation fit inside it?" The platform answers the first question. You still have to answer the second.
HatchWorks AI
Roll out multi-agent setups without losing the plot
HatchWorks AI helps engineering organizations design, deploy, and govern multi-agent Claude workflows as part of a broader AI-native methodology practice. Generative-Driven Development is the methodology layer; sub-agents, Agent Teams, and Skills are the artifacts it produces. If you're trying to figure out which delegation pattern fits your team, how to design GenDD Pods that work safely in production, or how to govern AI delegation across an engineering organization, we can help.