Components
The core components of an agent system: Planning / Tool Calling / Memory
#TL;DR
- A usable `AI Agent` needs at least three core components: `Planning`, `Tool Utilization`, and `Memory`.
- Think of it this way: the `LLM` does the "thinking", tools do the "doing", and memory does the "remembering"; only together do they give you reliable multi-step execution (see the loop sketch below).
- What usually separates agents is not "swapping in a stronger model" but tool design, memory structure, and `evaluation`/observability.
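A minimal sketch of how the three pieces interact in a single loop. Every name here (`plan_next_step`, the tool registry, the action format) is an illustrative placeholder under my own assumptions, not an API from any particular framework:

```python
# Minimal agent loop sketch: the LLM plans, tools act, memory records.
# All names here are illustrative placeholders, not a real framework API.

def run_agent(task, plan_next_step, tools, max_steps=10):
    """Drive one task with repeated plan -> act -> remember cycles."""
    memory = []  # short-term / working memory for this task
    for step in range(max_steps):
        # 1) Planning: decide the next action given the task and memory so far.
        action = plan_next_step(task, memory)

        # Explicit finish criteria keep the loop from running forever.
        if action["type"] == "finish":
            return action["answer"]

        # 2) Tool utilization: act through a tool instead of guessing.
        result = tools[action["tool"]](**action["arguments"])

        # 3) Memory: keep a citable record of what happened at this step.
        memory.append({"step": step, "action": action, "result": result})

    raise RuntimeError("max_steps reached without meeting the finish criteria")
```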
#Core concepts

#1) Planning: giving the system long-horizon execution ability
`Planning`, driven by the `LLM`, is what lets the agent:
- break a task into completable subtasks (task decomposition)
- maintain an updatable plan (e.g., a to-do list / task tracker)
- retry / fall back on failure, instead of emitting text that sounds plausible but cannot be verified
Suggested implementation practices (see the sketch below):
- Have the agent output an explicit plan (structured is better) and update each step's status as it goes (`todo` / `in_progress` / `done`, etc.)
- Add "finish criteria": state what counts as done, to avoid infinite loops or stopping too early
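A minimal sketch of such a plan structure, assuming a simple dataclass layout; the status values follow the list above, while the field names and the example task are invented for illustration:

```python
# Sketch of an explicit, updatable plan with per-step status and finish criteria.
from dataclasses import dataclass, field

@dataclass
class PlanStep:
    description: str
    status: str = "todo"  # todo | in_progress | done | failed

@dataclass
class Plan:
    goal: str
    steps: list[PlanStep] = field(default_factory=list)
    finish_criteria: str = ""  # what counts as "done", to avoid loops or early exits

    def is_finished(self) -> bool:
        return all(s.status == "done" for s in self.steps)

# Example: a plan the agent updates after every step.
plan = Plan(
    goal="Summarize last week's support tickets",
    steps=[
        PlanStep("Fetch tickets from the database"),
        PlanStep("Cluster tickets by topic"),
        PlanStep("Write a summary with counts per topic"),
    ],
    finish_criteria="A summary exists and every ticket is assigned to a topic",
)
plan.steps[0].status = "in_progress"
```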
#2) Tool Utilization: turning "reasoning" into "verifiable action"
Typical cases for `Tool Calling`:
- When you need facts: use a search / database tool instead of relying on the model's memory
- When you need computation or execution: use a code execution / calculator tool
- When you need to write to external systems: use explicit write tools, with an allowlist and an audit log (see the sketch below)
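A sketch of the write-tool guardrail from the last bullet: an allowlist plus an audit log in front of the actual side effect. The tool itself (`update_shipping_address`) and the log shape are assumptions made for the example:

```python
# Write tools gated by an allowlist, with every call recorded in an audit log.
import time

WRITE_ALLOWLIST = {"update_shipping_address"}  # only these write tools may run
AUDIT_LOG = []                                 # in practice: append-only storage

def update_shipping_address(order_id, address):
    # Placeholder for a real external write (order system / CRM API call).
    return f"order {order_id} now ships to {address}"

WRITE_TOOLS = {"update_shipping_address": update_shipping_address}

def call_write_tool(name, arguments):
    if name not in WRITE_ALLOWLIST:
        return {"ok": False, "error": f"write tool '{name}' is not allowlisted"}
    AUDIT_LOG.append({"ts": time.time(), "tool": name, "arguments": arguments})
    return {"ok": True, "result": WRITE_TOOLS[name](**arguments)}

print(call_write_tool("update_shipping_address",
                      {"order_id": "A-1001", "address": "123 Example St"}))
print(call_write_tool("cancel_order", {"order_id": "A-1001"}))  # blocked: not allowlisted
```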
Tool design recommendations (high leverage; schema sketch below):
- Keep the tool set small and focused: each tool has well-defined inputs/outputs and handleable errors
- Keep tool schemas clear: make it easy for the `LLM` to pick the right tool and fill in the right parameters
- Make tool return values citable: this helps later `evaluation` and debugging
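A sketch of what a clear schema and a citable return value can look like, in the JSON-schema style that most function-calling APIs accept; the tool name and its fields are invented for the example:

```python
# A small, explicit tool schema the LLM can choose from and fill in reliably.
search_orders_tool = {
    "name": "search_orders",
    "description": "Look up orders for a customer. Read-only.",
    "parameters": {
        "type": "object",
        "properties": {
            "customer_id": {"type": "string", "description": "Internal customer ID"},
            "status": {
                "type": "string",
                "enum": ["open", "shipped", "refunded"],
                "description": "Optional status filter",
            },
        },
        "required": ["customer_id"],
    },
}

# A structured, citable return value: later evaluation and debugging can point at it.
example_result = {
    "tool": "search_orders",
    "matches": [{"order_id": "A-1001", "status": "shipped"}],
    "source": "orders_db",  # where the answer came from, for traceability
}
```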
#3) Memory: keeping state across turns and steps
`Memory` comes in two forms (sketched below):
- `Short-term (Working) Memory`
  - holds the current task's context summary, latest results, and key constraints
  - suited to `in-context learning` and short-term iteration
- `Long-term Memory`
  - usually a vector store / knowledge base
  - used for reuse across tasks (e.g., user preferences, past project knowledge, long-term notes)
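A sketch of the two layers, assuming simple dataclasses; the long-term retrieval here is a keyword-overlap stand-in for a real vector store so the example stays self-contained:

```python
# Short-term memory holds the current task's state; long-term memory persists across tasks.
from dataclasses import dataclass, field

@dataclass
class ShortTermMemory:
    task_summary: str = ""
    latest_results: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)

@dataclass
class LongTermMemory:
    notes: list[str] = field(default_factory=list)  # in practice: a vector store / knowledge base

    def add(self, note: str) -> None:
        self.notes.append(note)

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        # Stand-in for embedding search: rank notes by word overlap with the query.
        words = set(query.lower().split())
        ranked = sorted(self.notes, key=lambda n: -len(words & set(n.lower().split())))
        return ranked[:k]

ltm = LongTermMemory()
ltm.add("User prefers email over phone for support follow-ups")
print(ltm.retrieve("how should we follow up with this user"))
```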
Implementation tips:
- Don't dump all raw content into memory: prefer "the summary a decision needs, plus a traceable reference"
- Design a fixed template for memory, e.g. `facts` / `assumptions` / `open_questions` / `decisions` (see the sketch below)
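A minimal sketch of that fixed template, using the four fields named above; the example content is invented:

```python
# Per-task memory template with the four fields named above.
task_memo = {
    "facts": ["Order A-1001 was delivered on 2024-05-02 (source: orders_db)"],  # verified, citable
    "assumptions": ["Customer is asking about their most recent order"],        # may change later
    "open_questions": ["Does the refund policy still apply after 30 days?"],    # needs a tool or a human
    "decisions": ["Escalate the refund request for human review"],              # with a one-line rationale
}
```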
#Self-check rubric
- `Planning`: is the plan explicit? Are there silent skips? Is there retry/fallback on failure?
- `Tool Calling`: are tool results ever fabricated? Are tool errors handled correctly?
- `Memory`: is key information reliably reused? Is noise being written into long-term memory?
- Observability: can you replay every step's input, output, tool calls, and state changes? (see the sketch below)
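A sketch of the kind of per-step trace record that makes such a replay possible; the field names and example values are assumptions for illustration:

```python
# Record every step's input, output, tool call, and state change so a run can be replayed.
import json
import time

def record_step(trace, step, llm_input, llm_output, tool_call, state_before, state_after):
    trace.append({
        "step": step,
        "ts": time.time(),
        "llm_input": llm_input,
        "llm_output": llm_output,
        "tool_call": tool_call,  # name + arguments + result, or None
        "state_diff": {k: state_after[k] for k in state_after
                       if state_before.get(k) != state_after[k]},
    })

trace = []
record_step(
    trace, 0,
    llm_input="Check whether order A-1001 qualifies for a refund",
    llm_output="Call lookup_order first",
    tool_call={"name": "lookup_order", "arguments": {"order_id": "A-1001"},
               "result": {"status": "delivered"}},
    state_before={"plan_step_1": "todo"},
    state_after={"plan_step_1": "in_progress"},
)
print(json.dumps(trace, indent=2))
```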
#Practice
Exercise: design the component checklist for a "customer support agent" (no code required; one possible answer is sketched below).
- Provide: a tool list (read/write separated), a memory structure (what goes into short-term vs. long-term memory), and a plan template (which fields it has).
- State which actions need a human-in-the-loop (e.g., refunds, address changes, order cancellations).
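One possible answer, written as plain data rather than code logic; every tool name, memory field, and policy below is an assumption, not a required design:

```python
# Component checklist for a hypothetical customer support agent.
support_agent_spec = {
    "tools": {
        "read": ["search_tickets", "lookup_order", "get_refund_policy"],
        "write": ["add_ticket_note", "issue_refund", "update_shipping_address", "cancel_order"],
    },
    "memory": {
        "short_term": ["current ticket summary", "latest tool results", "customer constraints"],
        "long_term": ["customer preferences", "past resolutions", "known product issues"],
    },
    "plan_template": ["goal", "steps (description, status, tool)", "finish_criteria"],
    "human_in_the_loop": ["issue_refund", "update_shipping_address", "cancel_order"],
}
```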
#Original (English)
AI agents require three fundamental capabilities to effectively tackle complex tasks: planning abilities, tool utilization, and memory management. Let's dive into how these components work together to create functional AI agents.

#Planning: The Brain of the Agent
At the core of any effective AI agent is its planning capability, powered by large language models (LLMs). Modern LLMs enable several crucial planning functions:
- Task decomposition through chain-of-thought reasoning
- Self-reflection on past actions and information
- Adaptive learning to improve future decisions
- Critical analysis of current progress
While current LLM planning capabilities aren't perfect, they're essential for task completion. Without robust planning abilities, an agent cannot effectively automate complex tasks, which defeats its primary purpose.
#Tool Utilization: Extending the Agent's Capabilities
The second critical component is an agent's ability to interface with external tools. A well-designed agent must not only have access to various tools but also understand when and how to use them appropriately. Common tools include:
- Code interpreters and execution environments
- Web search and scraping utilities
- Mathematical calculators
- Image generation systems
These tools enable the agent to execute its planned actions, turning abstract strategies into concrete results. The LLM's ability to understand tool selection and timing is crucial for handling complex tasks effectively.
#Memory Systems: Retaining and Utilizing Information
The third essential component is memory management, which comes in two primary forms:
- Short-term (Working) Memory
  - Functions as a buffer for immediate context
  - Enables in-context learning
  - Sufficient for most task completions
  - Helps maintain continuity during task iteration
- Long-term Memory
  - Implemented through external vector stores
  - Enables fast retrieval of historical information
  - Valuable for future task completion
  - Less commonly implemented but potentially crucial for future developments
Memory systems allow agents to store and retrieve information gathered from external tools, enabling iterative improvement and building upon previous knowledge.
The synergy between planning capabilities, tool utilization, and memory systems forms the foundation of effective AI agents. While each component has its current limitations, understanding these core capabilities is crucial for developing and working with AI agents. As the technology evolves, we may see new memory types and capabilities emerge, but these three pillars will likely remain fundamental to AI agent architecture.