Multi-Agent Patterns
Multi-Agent Architecture Patterns
Multi-agent architectures distribute work across multiple language model instances, each with its own Context window. Done well, they break past single-agent limitations. Done poorly, they just add coordination overhead. Here's the key insight: sub-agents' core value is Context isolation, not role-playing.
Treating multi-agent design as "role-playing" usually produces a system that is more complex but no better. The real value lies in Context isolation and parallelization.
- Use multi-agent to isolate Context, not to role-play.
- Supervisor / swarm / hierarchical are the mainstream patterns.
- Token costs are high — only complex tasks justify the overhead.
- Avoid the telephone game; allow direct pass-through.
- Define explicit handoff and convergence rules.
What You'll Learn
- When you need multi-agent and when you don't
- Pros and cons of three architectural patterns
- How to design collaboration and convergence mechanisms
When to Activate
Activate this skill when:
- Single-agent Context limits constrain task complexity
- Tasks decompose naturally into parallel subtasks
- Different subtasks require different tool sets or system prompts
- Building systems that must handle multiple domains simultaneously
- Scaling agent capabilities beyond single-context limits
- Designing production agent systems with multiple specialized components
Core Concepts
Multi-agent systems solve single-agent limitations through Context distribution. Three mainstream patterns: supervisor/orchestrator, peer-to-peer/swarm, and hierarchical. The core design principle is Context isolation.
Effective multi-agent systems need explicit coordination protocols, consensus mechanisms that avoid sycophancy, and awareness of bottlenecks, divergence, and error propagation.
Detailed Topics
Why Multi-Agent Architectures
The Context Bottleneck
A single agent hits ceilings in reasoning, Context management, and tool coordination. As task complexity grows, Context fills up with history, docs, and tool outputs, leading to lost-in-the-middle effects, attention scarcity, and Context poisoning.
Multi-agent splits tasks across multiple Context windows, reducing the load on any single Context.
The Token Economics Reality
Multi-agent consumes significantly more tokens:
| Architecture | Token Multiplier | Use Case |
|---|---|---|
| Single agent chat | 1x baseline | Simple queries |
| Single agent with tools | ~4x baseline | Tool-using tasks |
| Multi-agent system | ~15x baseline | Complex research/coordination |
Research shows performance variance is driven primarily by token usage, tool calls, and model choice. Upgrading to a stronger model (like Claude Sonnet 4.5 or GPT-5.2 in thinking mode) tends to help more than simply spending more tokens on the problem.
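To make the multipliers concrete, here is a rough cost sketch. The multipliers are the approximate values from the table above; the baseline token count and the price per million tokens are made-up placeholders, not real pricing.

```python
# Approximate multipliers taken from the table above.
TOKEN_MULTIPLIER = {
    "single_agent_chat": 1,
    "single_agent_with_tools": 4,
    "multi_agent_system": 15,
}


def estimate_cost(architecture: str, baseline_tokens: int, usd_per_million: float) -> float:
    """Estimate the cost of one task for a given architecture."""
    tokens = baseline_tokens * TOKEN_MULTIPLIER[architecture]
    return tokens / 1_000_000 * usd_per_million


# Hypothetical inputs: 2,000 baseline tokens at $3 per million tokens.
for name in TOKEN_MULTIPLIER:
    print(f"{name}: ${estimate_cost(name, 2_000, 3.0):.4f}")
```

Running the same estimate with your own baseline and pricing is a quick way to check whether a task is valuable enough to justify the multi-agent overhead.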
The Parallelization Argument
Many tasks can be split for parallel execution: multi-source retrieval, multi-document analysis, comparing different approaches. A single agent must handle these sequentially; multi-agent can run them in parallel, with total time approaching the longest subtask rather than the sum.
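The timing claim can be illustrated with a minimal `asyncio` sketch. The `run_subtask` function is a hypothetical stand-in that sleeps instead of calling a model, and the durations are invented for the example.

```python
import asyncio
import time


async def run_subtask(name: str, seconds: float) -> str:
    """Stand-in for a sub-agent call; sleeps instead of calling a model."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def run_parallel(subtasks: dict[str, float]) -> tuple[list[str], float]:
    """Run all subtasks concurrently and return results plus elapsed time."""
    start = time.perf_counter()
    results = await asyncio.gather(
        *(run_subtask(name, secs) for name, secs in subtasks.items())
    )
    return list(results), time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical durations: the sum is 0.6 s, the longest is 0.3 s.
    results, elapsed = asyncio.run(
        run_parallel({"retrieve": 0.1, "analyze": 0.2, "compare": 0.3})
    )
    print(results, f"elapsed ~{elapsed:.2f}s")
```

The elapsed time should land near the longest subtask rather than the sum, which is the whole argument for parallel sub-agents.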
The Specialization Argument
Different tasks need different system prompts and tool sets. Multi-agent allows specialization without burdening a single agent with every possible configuration.
Architectural Patterns
Pattern 1: Supervisor/Orchestrator
A central supervisor controls flow, dispatches tasks, and aggregates results.
User Query -> Supervisor -> [Specialist, Specialist, Specialist] -> Aggregation -> Final Output
Good for: well-defined tasks, multi-domain coordination, human oversight requirements.
Strength: strong control.
Weakness: supervisor Context easily becomes a bottleneck; prone to the telephone game.
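A minimal sketch of the supervisor flow, assuming each specialist can be modeled as a plain function from query text to result text. The `supervise` function and the two specialists are illustrative names, not part of any framework.

```python
from typing import Callable

# A specialist is modeled as a plain function from task text to result text.
Specialist = Callable[[str], str]


def supervise(query: str, specialists: dict[str, Specialist]) -> str:
    """Dispatch the query to every specialist, then aggregate the results."""
    results = {name: run(query) for name, run in specialists.items()}
    # Aggregation here is simple concatenation; a real supervisor would
    # typically synthesize the results with a model call.
    return "\n".join(f"[{name}] {text}" for name, text in results.items())


# Hypothetical specialists standing in for model-backed agents.
specialists = {
    "billing": lambda q: f"billing view of '{q}'",
    "technical": lambda q: f"technical view of '{q}'",
}
print(supervise("refund for failed upgrade", specialists))
```

Note that every specialist result passes through the supervisor, which is exactly where its Context can become the bottleneck.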
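The Telephone Game Problem and Solution
LangGraph benchmarks show supervisor architectures tend to lose detail: the supervisor paraphrases each sub-agent response before relaying it, and every retelling can drop or distort information.
The fix is to let sub-agents pass their responses through directly: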
def forward_message(message: str, to_user: bool = True) -> dict:
    """Forward a sub-agent response directly to the user without supervisor synthesis.

    When to_user is False, the message is routed to the supervisor instead.
    """
    if to_user:
        return {"type": "direct_response", "content": message}
    return {"type": "supervisor_input", "content": message}
Pattern 2: Peer-to-Peer/Swarm
There is no central control; agents hand off directly to each other.
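# Assumption: a Swarm-style framework that exposes an Agent class,
# such as OpenAI's Swarm library. Adjust the import for your framework.
from swarm import Agent

agent_b = Agent(name="Agent B")

def transfer_to_agent_b():
    # Returning an agent from a function call hands control to that agent.
    return agent_b

agent_a = Agent(
    name="Agent A",
    functions=[transfer_to_agent_b],
)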
Good for: exploratory tasks, unstable requirements, elastic collaboration.
Strength: no single-point bottleneck.
Weakness: coordination is complex, tends to diverge.
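Because a swarm has no central controller, a hop limit is one simple guard against divergence. The sketch below is framework-free and purely illustrative: `Step`, `run_swarm`, and the two peers are hypothetical names, and each peer is modeled as a function that returns its output plus the name of the next agent.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    output: str
    next_agent: Optional[str]  # None means the task is finished


# Each peer maps the current message to a Step; names are illustrative.
Peer = Callable[[str], Step]


def run_swarm(peers: dict[str, Peer], start: str, message: str, max_hops: int = 5) -> str:
    """Follow handoffs between peers until one finishes or the hop limit is hit."""
    current = start
    for _ in range(max_hops):
        step = peers[current](message)
        if step.next_agent is None:
            return step.output
        current, message = step.next_agent, step.output
    raise RuntimeError(f"no convergence after {max_hops} hops")


peers = {
    "triage": lambda m: Step(f"triaged: {m}", "resolver"),
    "resolver": lambda m: Step(f"resolved ({m})", None),
}
print(run_swarm(peers, "triage", "login fails"))
```

Raising an error at the hop limit makes a diverging conversation visible instead of letting it consume tokens indefinitely.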
Pattern 3: Hierarchical
Multi-level decomposition: strategy / planning / execution.
Strategy Layer -> Planning Layer -> Execution Layer
Good for: large-scale projects, enterprise workflows, tasks requiring long-term planning.
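The three layers can be sketched as functions that each narrow the scope of the work. Everything here is a hypothetical placeholder: the decompositions are hard-coded, whereas a real system would back each layer with its own agent and Context.

```python
def strategy_layer(objective: str) -> list[str]:
    """Split an objective into high-level goals (hard-coded for illustration)."""
    return [f"{objective}: research", f"{objective}: delivery"]


def planning_layer(goal: str) -> list[str]:
    """Turn one goal into concrete tasks."""
    return [f"{goal} / task {i}" for i in (1, 2)]


def execution_layer(task: str) -> str:
    """Execute one task; a real system would call a worker agent here."""
    return f"done: {task}"


def run_hierarchy(objective: str) -> list[str]:
    """Pass work down Strategy -> Planning -> Execution and collect results."""
    return [
        execution_layer(task)
        for goal in strategy_layer(objective)
        for task in planning_layer(goal)
    ]


for line in run_hierarchy("launch"):
    print(line)
```

Each layer only sees what the layer above hands it, so no single Context has to hold the full project.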
Context Isolation as Design Principle
Context isolation is the core value of multi-agent. Each agent completes its subtask in a clean Context.
Isolation Mechanisms
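- Full context delegation: the sub-agent receives the parent's entire Context. This suits complex tasks that need complete understanding, but it gives up most of the isolation benefit.
- Instruction passing: the sub-agent receives only a task-specific brief. This keeps its Context clean and suits simple, well-defined subtasks.
- File system memory: agents exchange state through persistent files instead of passing it through Context.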
The trade-offs depend on task complexity and latency requirements.
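Two of these mechanisms can be sketched in a few lines. This is one possible reading, not a prescribed implementation: the function names, the brief format, and the one-JSON-file-per-agent layout are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def instruction_passing(task: str, constraints: list[str]) -> str:
    """Build a minimal brief: the sub-agent sees only this, not the parent's history."""
    lines = [f"Task: {task}"] + [f"- {c}" for c in constraints]
    return "\n".join(lines)


def write_shared_note(workdir: Path, agent: str, payload: dict) -> Path:
    """File system memory: persist a result so other agents can read it later."""
    path = workdir / f"{agent}.json"
    path.write_text(json.dumps(payload))
    return path


def read_shared_note(workdir: Path, agent: str) -> dict:
    """Load another agent's persisted result by name."""
    return json.loads((workdir / f"{agent}.json").read_text())


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    print(instruction_passing("Summarize Q3 churn", ["max 200 words", "cite sources"]))
    write_shared_note(workdir, "researcher", {"findings": ["churn rose in Q3"]})
    print(read_shared_note(workdir, "researcher"))
```

Instruction passing keeps latency and token use low, while file-based memory lets agents share larger state without inflating anyone's Context.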
Consensus and Coordination
The Voting Problem
Simple majority voting gives a weak model's hallucination the same weight as a strong model's reasoning.
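Weighted Voting / Debate Protocols
More reliable approaches weight each vote (for example, by confidence or expertise) or have agents debate and critique each other's answers before converging.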
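A minimal weighted-voting sketch follows. How weights are assigned (model strength, self-reported confidence, past accuracy) is an assumption left to the caller; the function only sums weights per answer.

```python
from collections import defaultdict


def weighted_vote(votes: list[tuple[str, float]]) -> str:
    """Return the answer with the highest total weight.

    Each vote is an (answer, weight) pair.
    """
    totals: dict[str, float] = defaultdict(float)
    for answer, weight in votes:
        totals[answer] += weight
    return max(totals, key=totals.get)


# Two low-weight votes for "A" lose to one high-weight vote for "B".
votes = [("A", 0.2), ("A", 0.2), ("B", 0.9)]
print(weighted_vote(votes))  # prints B
```

With equal weights this reduces to plain majority voting, so the scheme only helps if the weights carry real information about reliability.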
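Trigger-Based Intervention
Monitor the interaction for specific markers: stall triggers fire when the discussion stops making progress, and sycophancy triggers fire when agents echo each other's answers without adding their own reasoning.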
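One simple way to approximate both triggers is with string comparisons, sketched below. The heuristics are assumptions chosen for illustration: a stall is read as an unchanged consensus over the last few rounds, and sycophancy as every agent giving an identical rationale. A production system would likely need semantic comparison rather than exact matching.

```python
def stall_triggered(history: list[str], window: int = 3) -> bool:
    """True when the last `window` rounds produced the same consensus text."""
    recent = history[-window:]
    return len(recent) == window and len(set(recent)) == 1


def sycophancy_triggered(rationales: list[str]) -> bool:
    """True when every agent gives the same rationale, so none adds its own reasoning."""
    normalized = {r.strip().lower() for r in rationales}
    return len(rationales) > 1 and len(normalized) == 1


print(stall_triggered(["plan A", "plan B", "plan B", "plan B"]))
print(sycophancy_triggered(["I agree.", "i agree. "]))
```

When either check fires, a moderator can intervene, for example by injecting a dissenting prompt or ending the round.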
Practical Guidance
Failure Modes and Mitigations
- Supervisor bottleneck -> output schema + checkpointing
- Coordination overhead -> clear handoff + batching
- Divergence -> convergence checks + time-to-live (TTL) limits on iterations
- Error propagation -> validate outputs + retry
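The "validate outputs + retry" mitigation from the list above can be sketched as a small wrapper. The required keys are a hypothetical output schema, and the agent is modeled as any callable that returns a dict.

```python
from typing import Callable

REQUIRED_KEYS = {"summary", "sources"}  # hypothetical output schema


def validate(output: dict) -> bool:
    """Accept only outputs that carry every required key with a non-empty value."""
    return all(output.get(key) for key in REQUIRED_KEYS)


def call_with_retry(agent: Callable[[str], dict], task: str, max_attempts: int = 3) -> dict:
    """Re-run the agent until its output validates or attempts run out."""
    for _ in range(max_attempts):
        output = agent(task)
        if validate(output):
            return output
    raise ValueError(f"invalid output after {max_attempts} attempts")
```

Validating at each handoff keeps a malformed result from being passed downstream, and the attempt cap bounds the extra token cost of retries.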
Examples
Example 1: Research Team Architecture
Supervisor
├── Researcher
├── Analyzer
├── Fact-checker
└── Writer
Example 2: Handoff Protocol
def handle_customer_request(request):
    # transfer_to, handle_general, and the *_agent objects are assumed to be
    # provided elsewhere by the agent framework in use.
    if request.type == "billing":
        return transfer_to(billing_agent)
    elif request.type == "technical":
        return transfer_to(technical_agent)
    elif request.type == "sales":
        return transfer_to(sales_agent)
    else:
        return handle_general(request)
Decision Helper: Do You Need Multi-Agent?
- Can the task be split into parallel subtasks?
- Is the single agent already hitting Context limits?
- Do subtasks need different tool sets or system prompts?
- Is the cost acceptable (tokens + latency)?
If 3 or more are "yes," then consider multi-agent.
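The checklist translates directly into a small helper. The function and parameter names are illustrative; the threshold of three "yes" answers comes from the rule above.

```python
def should_consider_multi_agent(
    parallelizable: bool,
    hitting_context_limits: bool,
    needs_distinct_tools_or_prompts: bool,
    cost_acceptable: bool,
) -> bool:
    """Apply the rule above: consider multi-agent at three or more 'yes' answers."""
    answers = [
        parallelizable,
        hitting_context_limits,
        needs_distinct_tools_or_prompts,
        cost_acceptable,
    ]
    return sum(answers) >= 3


print(should_consider_multi_agent(True, True, True, False))
print(should_consider_multi_agent(True, False, False, True))
```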
Guidelines
- Design for Context isolation as the primary benefit
- Choose architecture based on coordination needs, not metaphor
- Implement explicit handoff protocols
- Use weighted voting or debate
- Monitor for supervisor bottlenecks
- Validate outputs before passing
- Set TTL limits
- Test failure scenarios
Practice Task
- Draw a multi-agent architecture diagram for your project
- Label each agent's Context boundaries and tool sets
Related Pages
Integration
This skill builds on context-fundamentals and context-degradation. It connects to:
- memory-systems
- tool-design
- context-optimization
References
External resources:
- LangGraph Documentation
- AutoGen Framework
- CrewAI Documentation
- Research on Multi-Agent Coordination
Skill Metadata
Created: 2025-12-20
Last Updated: 2025-12-20
Author: Agent Skills for Context Engineering Contributors
Version: 1.0.0