Multi-Agent Context Isolation
Chapter 6 addressed how a single agent runs a long task without blowing up its context. Production introduces another scale: a main agent dispatching sub-agents. Whether their contexts are shared, isolated, or partially shared decides whether the system can scale.
Three Multi-Agent Context Topologies
1. Shared Context — One LLM Runs the Whole Thing
Every step runs in the same messages array.
messages = [
user: "Research the pros and cons of the three most popular Kubernetes ingress controllers"
assistant: First list three candidates → ingress-nginx, traefik, contour
tool_use: WebFetch ingress-nginx docs
tool_result: ...
tool_use: WebFetch traefik docs
tool_result: ...
... (50K of context accumulated)
assistant: After synthesizing: ingress-nginx wins on ecosystem maturity, traefik wins on...
]
Pro: easiest to implement (no state management); zero information loss.
Con: context grows linearly; the Lost in the Middle problem bites around step 30; every step pays the accumulated token cost.
Fits: under 10 steps with strong dependencies between steps.
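The shared topology reduces to a single loop over one growing list. A minimal sketch, where `call_llm` and `run_tool` are hypothetical stubs standing in for a real model API and tool layer:

```python
# Shared context: every step appends to ONE messages list,
# and every LLM call pays for everything accumulated so far.
# call_llm / run_tool are hypothetical stand-ins, not a real API.

def call_llm(messages):
    # Placeholder: a real implementation would call a chat model here.
    return {"role": "assistant", "content": "...", "tool_call": None}

def run_tool(tool_call):
    return {"role": "tool", "content": "..."}

def run_shared(task, max_steps=10):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_llm(messages)        # pays for ALL accumulated tokens
        messages.append(reply)
        if reply["tool_call"] is None:    # final answer reached
            return reply["content"], len(messages)
        messages.append(run_tool(reply["tool_call"]))
    return None, len(messages)
```

The linear-growth con above is visible in the loop: nothing is ever dropped from `messages`, so step N's call carries all N-1 prior steps.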
2. Isolated Context — Sub-Agents Fully Independent
The sub-agent gets a fresh context — just task description + needed reference material. After running, it returns a final answer + short summary, not the process.
Main agent context:
user: "Research the three K8s ingress controllers"
assistant: I'll dispatch 3 sub-agents to research in parallel
tool_use: spawn_subagent(name="research_ingress_nginx", task="...")
tool_result: { summary: "ingress-nginx wins on ecosystem maturity...", refs: [...] }
tool_use: spawn_subagent(name="research_traefik", task="...")
tool_result: { summary: "traefik wins on configuration flexibility...", refs: [...] }
...
assistant: Synthesizing the three sub-agent summaries...
# Each sub-agent's own context (independent):
user: "Research ingress-nginx: architecture, performance, ecosystem, pitfalls"
tool_use: WebFetch ingress-nginx docs
... (accumulates its own 30K of context, discarded when done)
assistant: Returns a 500-word summary to the main agent
Pro: main agent context stays small; sub-agents can run in parallel; one blowing up doesn't affect others.
Con: lossy handoff; main can't see intermediate reasoning, has to trust the result; needs spawn / coordinate infrastructure.
Fits: sub-tasks independent and parallelizable; main agent needs to scale past 30 steps.
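A minimal sketch of the isolated topology. The sub-agent body is a hypothetical stand-in (a real one would loop over LLM and tool calls); the point is that each sub-agent starts from a fresh list and only a distilled summary flows back:

```python
# Isolated context: each sub-agent runs in its OWN fresh messages list;
# only a short summary (not the transcript) returns to the main agent.
from concurrent.futures import ThreadPoolExecutor

def run_subagent(task):
    # Hypothetical stand-in for a full agent loop.
    messages = [{"role": "user", "content": task}]  # fresh: no parent history
    # ... sub-agent accumulates its own 30K of context here, then distills:
    return {"summary": f"summary of: {task}", "refs": []}  # transcript discarded

def spawn_parallel(tasks):
    # Sub-agents share nothing, so they can run fully in parallel.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(run_subagent, tasks))

results = spawn_parallel([
    "research ingress-nginx", "research traefik", "research contour"])
# Main agent context now holds 3 short summaries, not 3 × 30K transcripts.
```

The lossy-handoff con is equally visible: `messages` dies with the function, so the main agent can never inspect how the summary was reached.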
3. Partial Sharing — Summary Plus Key Raw Data
Sub-agents return a summary, plus the key supporting evidence verbatim. Main agent gets summary + a few key quotes and can verify the sub-agent.
Anthropic's Multi-agent Research System blog pattern — main is Lead Researcher, sub-agents are Search Subagents. Handoff includes source URL + key quotes for verification.
Cost: handoff payload 5-10× larger than pure summary, but trust goes way up.
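A sketch of the partial-sharing handoff payload. The field names are illustrative, loosely following the summary-plus-citations pattern described above, not Anthropic's actual schema:

```python
# Partial sharing: summary PLUS verbatim key quotes + source URLs,
# so the main agent can spot-check the sub-agent's claims.
# Field names are illustrative, not a real Anthropic schema.

def make_handoff(summary, evidence):
    """evidence: list of (source_url, verbatim_quote) tuples."""
    return {
        "summary": summary,
        "evidence": [{"source": url, "quote": q} for url, q in evidence],
    }

handoff = make_handoff(
    "ingress-nginx wins on ecosystem maturity",
    [("https://kubernetes.github.io/ingress-nginx/",
      "ingress-nginx is an Ingress controller for Kubernetes...")],
)
# Payload is 5-10x a bare summary, but claims are now verifiable.
```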
Anthropic Multi-Agent Research System — Official Case
Anthropic's 2025-04 blog describes the multi-agent research feature they built into Claude.ai:
Architecture:
LeadResearcher (main)
└─ Decompose the user query → decide how many sub-agents to dispatch
└─ Dispatch SearchSubagent 1: "find the latest papers on topic X"
└─ Dispatch SearchSubagent 2: "find company Y's official docs"
└─ Dispatch SearchSubagent 3: "find user reviews and real-world cases"
└─ Collect all sub-agent summaries → synthesize the final report
Anthropic's engineering takeaways (paraphrased):
- Sub-agent contexts can't be shared; parallel sub-agents not knowing what the others are doing actually avoids duplicate work and groupthink
- Optimal sub-agent count is 3-5; past 5, the lead's coordination cost outweighs the gain
- Sub-agents handing raw search results to the lead is wrong; they have to distill them into findings first
- The whole system uses 4× more tokens than a single-agent baseline, but the quality lift far outweighs the token cost
Claude Code's Agent Tool — Sub-Agent in Production
Claude Code's built-in Agent tool is the sub-agent pattern in production (this course itself was very likely written with Claude Code: the main agent dispatches an Explore subagent to look up prompt-master configs, and instead of reading the 200-line config raw, it receives only the structured summary the subagent returns).
Config:
- subagent_type — different sub-agents get different tool sets (Explore can only read, code-architect can design but not edit)
- isolation: "worktree" — high-blast-radius tasks dispatch to a sub-agent in a separate git worktree, then merge or discard
- Description and Prompt — sub-agent only sees its prompt + own tool list at startup
This is "isolated context + lossy handoff" productized: the main agent delegates, and its own context stays protected.
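The subagent_type idea from the list above can be sketched as a registry mapping agent types to restricted tool sets. Everything here is illustrative, not Claude Code's actual internals:

```python
# Hypothetical registry: each sub-agent type gets a restricted tool set.
# Names are illustrative; this is NOT Claude Code's actual implementation.
SUBAGENT_TOOLS = {
    "explore":        {"Read", "Grep", "Glob"},        # read-only
    "code-architect": {"Read", "Grep", "WritePlan"},   # can design, not edit code
    "general":        {"Read", "Grep", "Edit", "Bash"},
}

def tools_for(subagent_type):
    # A sub-agent only ever sees its own tool list at startup.
    return SUBAGENT_TOOLS[subagent_type]

assert "Edit" not in tools_for("explore")  # explore can look but not touch
```

Restricting tools by type is what makes high-blast-radius delegation safe: the worst an Explore subagent can do is read the wrong file.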
JR Real Case: Isolation Across omni-report's 17 Routines
JR Academy's omni-report runs 17 independent routines (AI Visibility / Competitor Weekly / Marketing Topics / Growth Playbook / Daily Jobs ×4 / various daily reports). Each is its own cron job — independent context, git commit, Notion sync.
Why not merge into a single "omni-report master agent"?
- Context isolation — Competitor Weekly's 50K context shouldn't pollute Daily Jobs
- Failure isolation — one crashing doesn't stop the other 16
- Observability — each monitored separately
- Re-runnable — any can be triggered alone (the 13th routine's idle-timeout retry only affected that one)
Cross-routine flow is async via git commits: Marketing Topics runs Monday and writes marketing-topics/$DATE.md; Growth Playbook runs Tuesday and reads last week's report into its own context. "Filesystem as sub-agent handoff channel": no message queue, git is enough.
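The write-Monday / read-Tuesday flow can be sketched in a few lines; paths are illustrative, and the git commit step is elided:

```python
# Filesystem-as-handoff-channel: the writer routine drops a dated report,
# the reader routine picks up the most recent one. Paths are illustrative;
# in the real setup each write is followed by a git commit.
from datetime import date
from pathlib import Path

REPORT_DIR = Path("marketing-topics")

def write_report(body):
    # Monday: Marketing Topics writes marketing-topics/$DATE.md
    REPORT_DIR.mkdir(exist_ok=True)
    path = REPORT_DIR / f"{date.today().isoformat()}.md"
    path.write_text(body)
    return path

def read_latest_report():
    # Tuesday: Growth Playbook reads the newest report into its own context
    reports = sorted(REPORT_DIR.glob("*.md"))  # ISO dates sort lexically
    return reports[-1].read_text() if reports else None

write_report("# Topics this week\n- context isolation")
```

ISO-8601 filenames make "latest" a plain lexical sort, which is exactly why a filesystem beats a message queue for low-frequency handoffs.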
Shared / Isolated / Partial — Trade-off
| Dimension | Shared context | Isolated + summary | Partial sharing |
|---|---|---|---|
| Main agent context growth | Linear, painful past step 30 | Almost flat | Slow growth (summary + a little raw) |
| Information loss | 0 | Large (only summary) | Medium (summary + key citations) |
| Sub-agent parallelism | No (serial only) | Perfect parallel | Perfect parallel |
| Infrastructure | 0 (one LLM) | spawn / coordinate / summarizer | + citation extraction |
| Token cost | Medium (linear growth) | High (multiple LLMs) | Higher (summary + citations) |
| Trust (can main verify sub) | Perfect | Weak | Medium (can see raw evidence) |
| Fits | < 10 steps, strong dependency | 30+ steps, parallelizable | Serious reasoning, traceability needed |
JR's internal rule: under 10 steps, a single LLM; 10-30 steps, partial sharing; 30+ steps, isolated sub-agents. This matches the curve Anthropic verified building their own research system.
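The rule of thumb above is mechanical enough to write down as a function (thresholds taken straight from the rule; the function itself is a sketch, not JR tooling):

```python
# JR's step-count rule as a selector. Thresholds from the rule above;
# the function itself is illustrative, not actual JR tooling.
def pick_topology(estimated_steps):
    if estimated_steps < 10:
        return "shared"           # one LLM runs the whole thing
    if estimated_steps <= 30:
        return "partial-sharing"  # summaries + key citations
    return "isolated"             # sub-agents with fresh contexts

assert pick_topology(5) == "shared"
assert pick_topology(20) == "partial-sharing"
assert pick_topology(40) == "isolated"
```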
Takeaway
Multi-agent design isn't about splitting as finely as possible. The optimal sub-agent count is 3-5; past 5, coordination cost eats the gain. Sub-agents have to distill findings before handing back, because raw data still blows up the main agent. Anthropic's own research system uses 4× the tokens, but the quality lift far outweighs the cost, proving that "more tokens but layered" beats "fewer tokens but tangled".
References
- Anthropic. (2025-04-15). How we built our multi-agent research system — Lead + Search subagent architecture and the 3-5 sub-agent rule of thumb.
- Anthropic. Claude Code documentation — Agent tool — sub-agent type / isolation / handoff implementation.
- Anthropic. (2024-12-20). Building Effective Agents — original orchestrator-workers pattern.
- LangGraph. Multi-agent supervisor pattern — open-source equivalent.
- AutoGen. GitHub — Microsoft's multi-agent framework, comparative reference.
Production case: JR Academy omni-report — 17 independent routines using git commits for async cross-agent handoff, filesystem as sub-agent communication channel.
📚 Related Resources
❓ FAQ
The most frequently searched questions on this chapter's topic
Should sub-agent context be isolated or shared?
Decide by step count and parallelism: under 10 steps with strong information dependencies, use shared (one LLM runs the whole thing); 30+ steps and parallelizable, use isolated (sub-agents get independent contexts and hand back summaries); serious reasoning that needs traceability, use partial sharing (summary plus key verbatim citations). Anthropic's Multi-Agent Research System uses partial sharing.
Are more sub-agents always better?
No. Per Anthropic's data from building its own research system: 3-5 sub-agents is optimal; past 5, the lead agent's coordination cost outweighs the gain. The whole system uses 4× the tokens of a single agent, but the quality lift on hard problems far outweighs the token cost.
How do sub-agents communicate?
JR omni-report uses the filesystem as the sub-agent handoff channel: 17 independent routines pass information asynchronously via git commits (Marketing Topics finishes Monday and writes marketing-topics/$DATE.md; Growth Playbook reads it into its own context on Tuesday). 10× simpler than a message queue, and sufficient for low-frequency cross-agent communication.
Roughly how expensive is a multi-agent system per day?
Using Anthropic's own research-system numbers: a single complex query with a lead agent + 4 sub-agents burns ~80K input + ~15K output tokens ≈ $0.30/query on Sonnet. 100 queries/day = $30/day ≈ $900/month. 4× the cost of single-agent, but the accuracy gain on hard problems far exceeds the cost.
I'm not using Claude Code's Agent tool. Can I build this myself with LangGraph?
Absolutely. LangGraph is LangChain's official multi-agent framework: nodes = individual agents, edges = handoff protocols, and the state dict can be configured for shared / partial / isolated context. OpenAI Swarm (lightweight) and CrewAI (role-driven) are also common choices. All three are model-agnostic.
What's the most common multi-agent failure mode?
Sub-agent information loss: the lead agent splits a task out to a sub-agent, the sub-agent finishes and returns nothing but "done", and the lead has no idea what was done and can't reason from the result. The fix is to require every sub-agent to return a structured result (JSON: result + key_findings + sources) and never accept free-form text. This is a hard constraint in Anthropic's Multi-Agent Research System.
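That structured-return requirement can be enforced mechanically at the handoff boundary. A sketch, with the result / key_findings / sources field names following the answer above:

```python
# Reject free-form sub-agent replies ("done") at the handoff boundary:
# only well-formed JSON with the required fields gets through.
import json

REQUIRED = {"result", "key_findings", "sources"}

def parse_subagent_reply(raw):
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        raise ValueError("sub-agent must return JSON, not free text")
    missing = REQUIRED - payload.keys()
    if missing:
        raise ValueError(f"sub-agent reply missing fields: {sorted(missing)}")
    return payload

ok = parse_subagent_reply(
    '{"result": "traefik supports dynamic config", '
    '"key_findings": ["hot reload"], "sources": ["https://doc.traefik.io"]}')
# parse_subagent_reply("done") -> raises ValueError
```

Failing fast here is the cheap version of the constraint: a rejected reply triggers a retry, instead of silently poisoning the lead agent's reasoning.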