
Multi-Agent Patterns

⏱️ 35 min

Multi-Agent Architecture Patterns

Multi-agent architectures distribute work across multiple language model instances, each with its own Context window. Done well, they break past single-agent limitations. Done poorly, they just add coordination overhead. Here's the key insight: sub-agents' core value is Context isolation, not role-playing.

Treating multi-agent design as "role-playing" usually produces a system that is more complex but no better. The real value comes from Context isolation and parallelization.

  • Use multi-agent to isolate Context, not to role-play.
  • Supervisor / swarm / hierarchical are the mainstream patterns.
  • Token costs are high; only complex tasks justify the overhead.
  • Avoid the telephone game; allow direct pass-through.
  • Define explicit handoff and convergence rules.

What You'll Learn

  • When you need multi-agent and when you don't
  • Pros and cons of three architectural patterns
  • How to design collaboration and convergence mechanisms

When to Activate

Activate this skill when:

  • Single-agent Context limits constrain task complexity
  • Tasks decompose naturally into parallel subtasks
  • Different subtasks require different tool sets or system prompts
  • Building systems that must handle multiple domains simultaneously
  • Scaling agent capabilities beyond single-context limits
  • Designing production agent systems with multiple specialized components

Core Concepts

Multi-agent systems address single-agent limitations by distributing work across several Contexts. The three mainstream patterns are supervisor/orchestrator, peer-to-peer/swarm, and hierarchical. The core design principle is Context isolation.

Effective multi-agent systems need explicit coordination protocols, consensus mechanisms that avoid sycophancy, and awareness of bottlenecks, divergence, and error propagation.

Detailed Topics

Why Multi-Agent Architectures

The Context Bottleneck

A single agent hits ceilings in reasoning, Context management, and tool coordination. As task complexity grows, the Context fills up with history, documents, and tool outputs, leading to lost-in-the-middle effects, attention scarcity, and Context poisoning.

Multi-agent splits tasks across multiple Context windows, reducing the load on any single Context.

The Token Economics Reality

Multi-agent systems consume significantly more tokens:

Architecture               Token Multiplier   Use Case
Single agent chat          1x baseline        Simple queries
Single agent with tools    ~4x baseline       Tool-using tasks
Multi-agent system         ~15x baseline      Complex research/coordination

Research shows that performance variance is driven primarily by token usage, tool calls, and model choice. Upgrading to a stronger model (such as Claude Sonnet 4.5 or GPT-5.2 in thinking mode) tends to help more than spending additional tokens on a weaker one.
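To make the multipliers concrete, here is a rough estimator. The multipliers are the approximate figures from the table above; the 2,000-token baseline and the function name are illustrative assumptions, and real ratios vary by workload.

```python
# Approximate multipliers taken from the table above.
TOKEN_MULTIPLIERS = {
    "single_agent_chat": 1,
    "single_agent_with_tools": 4,
    "multi_agent_system": 15,
}


def estimate_tokens(baseline_tokens: int, architecture: str) -> int:
    """Scale a chat-only baseline by the architecture's rough multiplier."""
    return baseline_tokens * TOKEN_MULTIPLIERS[architecture]


# Hypothetical baseline: a task that costs 2,000 tokens as a plain chat.
for name in TOKEN_MULTIPLIERS:
    print(f"{name}: ~{estimate_tokens(2_000, name):,} tokens")
```

Running the estimate before building anything is a cheap way to check whether the task is valuable enough to justify the multi-agent budget.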

The Parallelization Argument

Many tasks can be split for parallel execution: multi-source retrieval, multi-document analysis, comparing different approaches. A single agent must handle these sequentially; multi-agent can run them in parallel, with total time approaching the longest subtask rather than the sum.
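The timing claim can be illustrated with a minimal asyncio sketch. Here `run_subagent` is a hypothetical stand-in that sleeps instead of calling a model, and the durations are invented for illustration.

```python
import asyncio
import time


async def run_subagent(name: str, seconds: float) -> str:
    """Stand-in for a sub-agent call; sleeps instead of calling a model."""
    await asyncio.sleep(seconds)
    return f"{name} done"


async def run_parallel(tasks: dict[str, float]) -> tuple[list[str], float]:
    """Run all sub-agents concurrently and report wall-clock time."""
    start = time.perf_counter()
    results = await asyncio.gather(
        *(run_subagent(name, secs) for name, secs in tasks.items())
    )
    return list(results), time.perf_counter() - start


if __name__ == "__main__":
    durations = {"retrieval": 0.1, "analysis": 0.2, "comparison": 0.3}
    results, elapsed = asyncio.run(run_parallel(durations))
    # Elapsed time tracks the longest subtask (0.3 s), not the sum (0.6 s).
    print(results, f"{elapsed:.2f}s")
```

`asyncio.gather` returns results in the order the awaitables were passed, so the aggregation step can rely on a stable ordering even though completion order differs.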

The Specialization Argument

Different tasks need different system prompts and tool sets. Multi-agent allows specialization without burdening a single agent with every possible configuration.
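One way to express specialization is a small per-agent configuration record. The agent names, prompts, and tool names below are hypothetical; the point is only that each agent carries its own prompt and a narrow tool list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    name: str
    system_prompt: str
    tools: tuple[str, ...]


# Hypothetical specialists, each with a narrow prompt and tool set.
SPECS = [
    AgentSpec("researcher", "Find and cite primary sources.", ("web_search", "fetch_url")),
    AgentSpec("analyst", "Analyze the data you are given.", ("run_python",)),
]


def tools_for(agent_name: str) -> tuple[str, ...]:
    """Return only the tools the named agent is allowed to see."""
    for spec in SPECS:
        if spec.name == agent_name:
            return spec.tools
    raise KeyError(agent_name)


print(tools_for("analyst"))  # ('run_python',)
```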

Architectural Patterns

Pattern 1: Supervisor/Orchestrator

A central supervisor controls flow, dispatches tasks, and aggregates results.

User Query -> Supervisor -> [Specialist, Specialist, Specialist] -> Aggregation -> Final Output

Good for: well-defined tasks, multi-domain coordination, human oversight requirements.

Strength: strong control.

Weakness: supervisor Context easily becomes a bottleneck; prone to the telephone game.
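A minimal sketch of the dispatch-and-aggregate flow, assuming each specialist is a plain callable from query to text. The `supervise` function and the lambda specialists are illustrative, not any framework's API.

```python
from typing import Callable

Specialist = Callable[[str], str]


def supervise(query: str, specialists: dict[str, Specialist]) -> str:
    """Dispatch the query to every specialist, then aggregate the replies."""
    replies = {name: agent(query) for name, agent in specialists.items()}
    # Aggregation here is plain concatenation; a real supervisor would
    # typically synthesize the replies with another model call.
    return "\n".join(f"[{name}] {reply}" for name, reply in replies.items())


# Hypothetical specialists standing in for model-backed agents.
specialists = {
    "legal": lambda q: f"legal view on {q!r}",
    "finance": lambda q: f"finance view on {q!r}",
}
print(supervise("acquire ACME", specialists))
```

Every reply passes through the supervisor, which is why its Context grows with the number of specialists and becomes the bottleneck.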

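The Telephone Game Problem and Solution

LangGraph benchmarks show that supervisor architectures tend to lose detail: the supervisor paraphrases each sub-agent response before passing it on, and each paraphrase can drop information.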

The fix: let sub-agents pass responses directly through:

def forward_message(message: str, to_user: bool = True):
    """
    Forward sub-agent response directly to user without supervisor synthesis.
    """
    if to_user:
        return {"type": "direct_response", "content": message}
    return {"type": "supervisor_input", "content": message}
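On the receiving side, the dictionary returned by forward_message can be routed by its "type" field. The `route` helper and the inbox list are assumptions made for illustration.

```python
from typing import Optional


def route(envelope: dict, supervisor_inbox: list[str]) -> Optional[str]:
    """Return text for the user, or queue it for the supervisor and return None."""
    if envelope["type"] == "direct_response":
        # Pass-through: the sub-agent's wording reaches the user unchanged.
        return envelope["content"]
    supervisor_inbox.append(envelope["content"])
    return None


inbox: list[str] = []
print(route({"type": "direct_response", "content": "Full report text"}, inbox))
print(route({"type": "supervisor_input", "content": "Partial notes"}, inbox), inbox)
```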

Pattern 2: Peer-to-Peer/Swarm

There is no central control; agents hand off directly to each other.

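from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:  # minimal stand-in for a framework's agent class (assumption)
    name: str
    functions: list[Callable] = field(default_factory=list)

agent_b = Agent(name="Agent B")

def transfer_to_agent_b():
    """Handoff tool: returning an agent transfers control to it."""
    return agent_b

agent_a = Agent(name="Agent A", functions=[transfer_to_agent_b])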

Good for: exploratory tasks, unstable requirements, elastic collaboration.

Strength: no single-point bottleneck.

Weakness: coordination is complex, tends to diverge.
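Because swarms tend to diverge, a handoff loop usually needs a hop budget. The sketch below follows handoffs until an agent finishes or the budget runs out; `SwarmAgent`, `run_swarm`, and `max_hops` are hypothetical names, not part of any specific framework.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SwarmAgent:
    name: str
    # Returns the next agent to hand off to, or None when the task is done.
    step: Callable[[str], Optional["SwarmAgent"]]


def run_swarm(start: SwarmAgent, task: str, max_hops: int = 5) -> list[str]:
    """Follow handoffs until an agent finishes or the hop budget runs out."""
    path: list[str] = []
    current = start
    for _ in range(max_hops):
        path.append(current.name)
        nxt = current.step(task)
        if nxt is None:
            return path
        current = nxt
    raise RuntimeError(f"no convergence after {max_hops} hops: {path}")


writer = SwarmAgent("writer", lambda task: None)
researcher = SwarmAgent("researcher", lambda task: writer)
print(run_swarm(researcher, "summarize topic"))  # ['researcher', 'writer']
```

Raising on an exhausted budget makes divergence visible instead of letting two agents hand a task back and forth indefinitely.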

Pattern 3: Hierarchical

Multi-level decomposition: strategy / planning / execution.

Strategy Layer -> Planning Layer -> Execution Layer

Good for: large-scale projects, enterprise workflows, tasks requiring long-term planning.
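The three layers can be sketched as nested decomposition. Each function below is a hard-coded placeholder for what would be a model-backed agent; only the shape of the flow is the point.

```python
def strategy_layer(goal: str) -> list[str]:
    """Break a goal into high-level objectives (hard-coded for illustration)."""
    return [f"{goal}: research", f"{goal}: build"]


def planning_layer(objective: str) -> list[str]:
    """Turn one objective into concrete steps."""
    return [f"{objective} / step {i}" for i in (1, 2)]


def execution_layer(step: str) -> str:
    """Carry out a single step and report the result."""
    return f"done({step})"


def run_hierarchy(goal: str) -> list[str]:
    """Strategy -> planning -> execution, collecting results bottom-up."""
    return [
        execution_layer(step)
        for objective in strategy_layer(goal)
        for step in planning_layer(objective)
    ]


for line in run_hierarchy("launch"):
    print(line)
```

Each layer sees only the item handed down to it, so an execution agent never carries the full strategic Context.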

Context Isolation as Design Principle

Context isolation is the core value of multi-agent. Each agent completes its subtask in a clean Context.

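Isolation Mechanisms

  • Full context delegation: the sub-agent receives the delegating agent's full Context; suits complex tasks, but gives up most of the isolation benefit
  • Instruction passing: the sub-agent receives only a task instruction; keeps its Context cleanest, but limits what it knows
  • File system memory: agents read and write shared state in files instead of passing it through messages

The right choice depends on task complexity and latency requirements.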
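The three mechanisms differ mainly in what the sub-agent's Context starts with. In this sketch a Context is modeled as a list of message strings, and all function names are assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path


def full_context_delegation(parent_messages: list[str], task: str) -> list[str]:
    """Sub-agent starts with a copy of everything the parent has seen."""
    return [*parent_messages, task]


def instruction_passing(task: str) -> list[str]:
    """Sub-agent starts with only the instruction."""
    return [task]


def file_system_memory(workdir: Path, key: str, value: dict) -> Path:
    """Write shared state to disk; other agents read the file when needed."""
    path = workdir / f"{key}.json"
    path.write_text(json.dumps(value))
    return path


history = ["user: compare vendors", "tool: 40 KB of search results"]
print(len(full_context_delegation(history, "summarize vendor A")))  # 3
print(len(instruction_passing("summarize vendor A")))               # 1
with tempfile.TemporaryDirectory() as tmp:
    saved = file_system_memory(Path(tmp), "vendor_a", {"score": 0.8})
    print(json.loads(saved.read_text()))  # {'score': 0.8}
```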

Consensus and Coordination

The Voting Problem

Simple majority voting gives a weak model's hallucination the same weight as a strong model's reasoning.

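Weighted Voting / Debate Protocols

More reliable approaches weight each vote (for example, by model strength or confidence) or have agents debate and critique each other's answers before converging.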
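A minimal weighted-vote sketch follows. The weights are hypothetical reliability scores; how to derive them (model strength, stated confidence, past accuracy) is left open here.

```python
from collections import defaultdict


def weighted_vote(votes: list[tuple[str, float]]) -> str:
    """Return the answer with the largest total weight.

    Each vote is (answer, weight), where weight is an assumed reliability score.
    """
    totals: dict[str, float] = defaultdict(float)
    for answer, weight in votes:
        totals[answer] += weight
    return max(totals, key=totals.get)


# Two weak agents agree on "A"; one strong agent says "B".
votes = [("A", 0.3), ("A", 0.3), ("B", 0.9)]
print(weighted_vote(votes))  # B (0.9 outweighs 0.3 + 0.3)
```

A plain majority would have picked "A" here, which is exactly the failure the weighting is meant to avoid.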

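Trigger-Based Intervention

Set up triggers that intervene when the discussion stalls (no measurable progress across rounds) or when agents show sycophancy (agreeing with each other without independent reasoning).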
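Both triggers can be approximated with simple heuristics. The thresholds, the progress score, and the "agreement before enough debate" rule below are assumptions for illustration, not established detection methods.

```python
def stall_triggered(progress: list[float], window: int = 3, min_gain: float = 0.01) -> bool:
    """True when progress gained less than min_gain over the last `window` rounds."""
    if len(progress) <= window:
        return False
    return progress[-1] - progress[-1 - window] < min_gain


def sycophancy_triggered(answers: list[str], rounds_debated: int, min_rounds: int = 2) -> bool:
    """True when all agents agree before enough debate rounds have happened."""
    return len(set(answers)) == 1 and rounds_debated < min_rounds


print(stall_triggered([0.1, 0.5, 0.5, 0.5, 0.5]))   # True: flat for 3 rounds
print(sycophancy_triggered(["A", "A", "A"], 1))     # True: instant unanimity
```

When a trigger fires, a typical intervention is to inject a dissenting prompt, reassign the task, or escalate to a human.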

Practical Guidance

Failure Modes and Mitigations

  • Supervisor bottleneck -> output schema + checkpointing
  • Coordination overhead -> clear handoff + batching
  • Divergence -> convergence checks + TTL
  • Error propagation -> validate outputs + retry
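The "validate outputs + retry" mitigation can be sketched as a wrapper around any agent call. It assumes the agent returns a dictionary; the required keys and the attempt limit are hypothetical.

```python
from typing import Callable


def call_with_validation(
    agent: Callable[[str], dict],
    task: str,
    required_keys: set[str],
    max_attempts: int = 3,
) -> dict:
    """Retry the agent until its output has all required keys or attempts run out."""
    missing = set(required_keys)
    for _ in range(max_attempts):
        output = agent(task)
        missing = required_keys - set(output)
        if not missing:
            return output
    raise ValueError(f"invalid output after {max_attempts} attempts; missing {sorted(missing)}")


# Hypothetical flaky agent: the first reply lacks "sources", the second is complete.
replies = iter([{"summary": "draft"}, {"summary": "final", "sources": ["doc-1"]}])
result = call_with_validation(lambda task: next(replies), "summarize", {"summary", "sources"})
print(result)
```

Validating at each handoff stops a malformed output from being passed downstream, where it is harder to trace.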

Examples

Example 1: Research Team Architecture

Supervisor
├── Researcher
├── Analyzer
├── Fact-checker
└── Writer

Example 2: Handoff Protocol

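def handle_customer_request(request):
    """Route a request to a specialist agent based on its type.

    transfer_to, handle_general, and the *_agent objects are assumed to be
    provided by the surrounding agent framework.
    """
    if request.type == "billing":
        return transfer_to(billing_agent)
    elif request.type == "technical":
        return transfer_to(technical_agent)
    elif request.type == "sales":
        return transfer_to(sales_agent)
    else:
        return handle_general(request)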
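A self-contained variant of the same routing makes the handoff explicit by returning a record of who takes over, why, and with what payload. The `Request` and `Handoff` types and the agent names are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Request:
    type: str
    text: str


@dataclass
class Handoff:
    target: str   # name of the agent that takes over
    reason: str   # why the handoff happened, kept for auditing
    payload: str  # the only Context the receiving agent gets


# Hypothetical routing table; agent names are placeholders.
ROUTES = {"billing": "billing_agent", "technical": "technical_agent", "sales": "sales_agent"}


def hand_off(request: Request) -> Handoff:
    """Route by request type, falling back to a general agent."""
    target = ROUTES.get(request.type, "general_agent")
    return Handoff(target=target, reason=f"type={request.type}", payload=request.text)


print(hand_off(Request("billing", "I was charged twice")).target)  # billing_agent
```

Keeping the payload as the only thing passed along preserves Context isolation: the receiving agent does not inherit the router's history.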

Decision Helper: Do You Need Multi-Agent?

  • Can the task be split into parallel subtasks?
  • Is the single agent already hitting Context limits?
  • Do subtasks need different tool sets or system prompts?
  • Is the cost acceptable (tokens + latency)?

If 3 or more are "yes," then consider multi-agent.
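The checklist translates directly into a small function. The parameter names are paraphrases of the four questions above.

```python
def should_consider_multi_agent(
    parallelizable: bool,
    hitting_context_limits: bool,
    needs_distinct_tooling: bool,
    cost_acceptable: bool,
) -> bool:
    """Apply the '3 or more yes answers' rule from the checklist above."""
    yes_count = sum(
        [parallelizable, hitting_context_limits, needs_distinct_tooling, cost_acceptable]
    )
    return yes_count >= 3


print(should_consider_multi_agent(True, True, False, True))   # True
print(should_consider_multi_agent(True, False, False, True))  # False
```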

Guidelines

  1. Design for Context isolation as the primary benefit
  2. Choose architecture based on coordination needs, not metaphor
  3. Implement explicit handoff protocols
  4. Use weighted voting or debate
  5. Monitor for supervisor bottlenecks
  6. Validate outputs before passing
  7. Set TTL limits
  8. Test failure scenarios

Practice Task

  • Draw a multi-agent architecture diagram for your project
  • Label each agent's Context boundaries and tool sets

Integration

This skill builds on context-fundamentals and context-degradation. It connects to:

  • memory-systems
  • tool-design
  • context-optimization

References

External resources:

  • LangGraph Documentation
  • AutoGen Framework
  • CrewAI Documentation
  • Research on Multi-Agent Coordination

Skill Metadata

Created: 2025-12-20
Last Updated: 2025-12-20
Author: Agent Skills for Context Engineering Contributors
Version: 1.0.0