Introduction
Understand what an AI Agent is, where its capabilities end, and where it is typically used.
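#TL;DR
- An AI Agent is a system built around an LLM that can do planning, call tools, and keep state through memory in order to complete multi-step tasks.
- When a task needs a long chain such as "look things up + compute/write code + generate a report + iterate", an AI Agent is usually a better fit than a single LLM call.
- The value of an AI Agent lies not only in "generating text" but in "turning strategy into executable actions".
- The main risks are hallucination, failed tool calls, Prompt Injection, and uncontrolled behavior caused by missing evaluation or observability.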
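#Core Concepts
In this track, we treat an AI Agent as an LLM-powered system with three core capabilities:
- Planning: break a complex task into steps and update the plan during execution (for example retry, roll back, or gather more information).
- Tool access: call external capabilities (API, Database, Search, Code execution, and so on) to turn "ideas" into "actions".
- Memory: store intermediate artifacts and key facts so that multi-turn, multi-step work does not lose context.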
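The three capabilities above can be sketched as a small loop. This is a minimal illustration, not a reference implementation: `plan` is a hard-coded stand-in for what an LLM would produce, and `search`, `summarize`, `Memory`, and `run_agent` are hypothetical names introduced only for this example. Replanning (retry, rollback) is left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Memory:
    """Stores intermediate results so later steps can reuse them."""
    facts: dict[str, str] = field(default_factory=dict)


def plan(task: str) -> list[tuple[str, str]]:
    """Stand-in for LLM planning: returns (tool_name, argument) steps.

    A real agent would ask the LLM for this plan; it is hard-coded here.
    """
    return [("search", task), ("summarize", "search")]


def run_agent(task: str, tools: dict[str, Callable[[str, Memory], str]]) -> Memory:
    """Execute each planned step with a tool and record the result in memory."""
    memory = Memory()
    for tool_name, arg in plan(task):
        tool = tools.get(tool_name)
        if tool is None:
            # Record the failure instead of inventing a tool result.
            memory.facts[tool_name] = "error: unknown tool"
            continue
        memory.facts[tool_name] = tool(arg, memory)
    return memory


# Hypothetical tools: each takes an argument plus the shared memory.
def search(query: str, memory: Memory) -> str:
    return f"3 documents about {query}"


def summarize(key: str, memory: Memory) -> str:
    # Reads an earlier step's output from memory rather than re-running it.
    return f"summary of: {memory.facts[key]}"


result = run_agent("vector databases", {"search": search, "summarize": summarize})
print(result.facts["summarize"])  # summary of: 3 documents about vector databases
```

The second step depends on the first only through `Memory`, which is the point of keeping state outside the individual tool calls.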
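The biggest difference from a plain "chatbot" is that an AI Agent does not stop at generating text: it turns strategy into executable actions.
#How to Apply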
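To put a usable AI Agent into practice:
- Define the task boundary: what counts as done? Which actions are allowed or forbidden? (For example, may it send email? May it write to the production database?)
- Design tools (interface first): expose external capabilities as a small number of clear, verifiable tools, each with well-defined inputs and outputs.
- Fix the output format: require structured output where possible (for example a JSON schema or a table) to reduce the uncertainty that free-form output brings.
- Add guardrails: put high-risk actions behind an allowlist; partition inputs to limit the blast radius of Prompt Injection.
- Add evaluation and observability: at minimum, be able to replay every tool call, key decision, failure reason, and retry path.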
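As a rough sketch of how several of these steps might fit together, the snippet below wraps each tool call with an input check, an allowlist guardrail for high-risk actions, and a replayable log. All names here (`Tool`, `ToolError`, `call_tool`, `lookup_order`, `send_email`) are hypothetical and chosen for illustration; a real system would likely validate arguments against a full schema rather than a list of required keys.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    name: str
    run: Callable[[dict], dict]      # structured input -> structured output
    required_args: tuple[str, ...]   # minimal input contract
    high_risk: bool = False          # high-risk tools need an allowlist entry


class ToolError(Exception):
    """Raised when a call is invalid or blocked by a guardrail."""


def call_tool(tool: Tool, args: dict, allowlist: set[str], log: list[dict]) -> dict:
    """Validate, guard, execute, and record a single tool call."""
    missing = [a for a in tool.required_args if a not in args]
    if missing:
        raise ToolError(f"{tool.name}: missing args {missing}")
    if tool.high_risk and tool.name not in allowlist:
        log.append({"tool": tool.name, "args": args, "status": "blocked"})
        raise ToolError(f"{tool.name}: high-risk action not on allowlist")
    result = tool.run(args)
    log.append({"tool": tool.name, "args": args, "status": "ok", "result": result})
    return result


# Hypothetical tools for illustration only.
lookup = Tool("lookup_order", lambda a: {"order_id": a["order_id"], "state": "shipped"}, ("order_id",))
send_email = Tool("send_email", lambda a: {"sent": True}, ("to", "body"), high_risk=True)

log: list[dict] = []
print(json.dumps(call_tool(lookup, {"order_id": "A17"}, allowlist=set(), log=log)))
try:
    call_tool(send_email, {"to": "x@example.com", "body": "hi"}, allowlist=set(), log=log)
except ToolError as err:
    print(err)  # send_email: high-risk action not on allowlist
```

Because blocked calls are logged as well as successful ones, the log can later be replayed to see which actions the agent attempted, not only which ones ran.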
A minimal reusable system prompt skeleton (illustrative):
    You are an AI Agent for <goal>.
    Rules:
    - Use tools when needed; do not fabricate tool results.
    - If information is missing, ask clarifying questions first.
    - Output must follow the specified schema.
    - Do not perform disallowed actions: <deny list>.
    Workflow:
    1) Plan
    2) Execute with tools
    3) Verify
    4) Summarize
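One way to reuse this skeleton is to fill its two placeholders from code. The sketch below assumes the goal and the deny list are supplied by the caller; `build_system_prompt` and the example action names are hypothetical and not part of any library.

```python
PROMPT_TEMPLATE = """You are an AI Agent for {goal}.
Rules:
- Use tools when needed; do not fabricate tool results.
- If information is missing, ask clarifying questions first.
- Output must follow the specified schema.
- Do not perform disallowed actions: {deny_list}.
Workflow:
1) Plan
2) Execute with tools
3) Verify
4) Summarize"""


def build_system_prompt(goal: str, denied_actions: list[str]) -> str:
    """Fill the <goal> and <deny list> placeholders of the skeleton."""
    if not goal.strip():
        raise ValueError("goal must not be empty")
    deny_list = ", ".join(denied_actions) if denied_actions else "none"
    return PROMPT_TEMPLATE.format(goal=goal.strip(), deny_list=deny_list)


prompt = build_system_prompt(
    "customer support triage",
    ["send_email", "write_production_db"],  # hypothetical action names
)
print(prompt.splitlines()[0])  # You are an AI Agent for customer support triage.
```

A deny list in the prompt is only a hint to the model; enforcement still belongs in the tool layer.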
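#Self-check rubric
You can use the rubric below to review an AI Agent:
- Correctness: are the key conclusions supported by tool results / sources? Is there any hallucination?
- Task completion: did it really finish every sub-task? Is there any silent skip?
- Tool hygiene: does it use a tool when it should? Does it fabricate tool output?
- Safety: are high-risk actions explicitly restricted? Can it resist common Prompt Injection?
- Observability: can the plan, every tool call, retries, and failure reasons be replayed?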
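Some of these checks can be partly automated once each run leaves a trace. The sketch below covers three of the five items (task completion, tool hygiene, observability) under simplifying assumptions: the trace format `TraceEvent` and the function `check_run` are hypothetical, and "cited outputs" are compared by exact string match, which a real evaluation would need to relax. Correctness and safety usually need human or model-based review and are not covered.

```python
from dataclasses import dataclass


@dataclass
class TraceEvent:
    step: str          # sub-task name from the plan
    tool: str          # tool that was called
    status: str        # "ok" or "failed"
    output: str = ""   # tool output when status is "ok"
    reason: str = ""   # failure reason when status is "failed"


def check_run(plan: list[str], trace: list[TraceEvent], cited_outputs: list[str]) -> dict[str, bool]:
    """Score one agent run against three rubric items."""
    completed = {e.step for e in trace if e.status == "ok"}
    real_outputs = {e.output for e in trace if e.status == "ok"}
    return {
        # No silent skip: every planned step has a successful event.
        "task_completion": all(step in completed for step in plan),
        # No fabricated tool output: everything cited appears in the trace.
        "tool_hygiene": all(c in real_outputs for c in cited_outputs),
        # Failures are explainable: every failed event records a reason.
        "observability": all(e.reason for e in trace if e.status == "failed"),
    }


plan_steps = ["fetch", "analyze", "report"]
trace = [
    TraceEvent("fetch", "http_get", "ok", output="200 rows"),
    TraceEvent("analyze", "python", "failed", reason="timeout"),
    TraceEvent("analyze", "python", "ok", output="mean=4.2"),  # retry succeeded
]
print(check_run(plan_steps, trace, cited_outputs=["200 rows", "mean=4.2"]))
# {'task_completion': False, 'tool_hygiene': True, 'observability': True}
```

Here the run fails task completion because the "report" step never appears in the trace, which is exactly the silent skip the rubric asks about.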
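#Practice
Exercise: design an AI Agent for "organizing study materials".
- Goal: turn the 10 links you provide into a set of study notes and an action list.
- Constraints: it must cite its sources; when unsure, it must say "I don't know" explicitly and ask a clarifying question.
- Output: TL;DR, a glossary of key concepts (terms in English), a recommended reading order, and a daily plan (7 days).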
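As a possible starting point for the exercise, the required output can be pinned down as a structure plus a validator. This is one sketch among many: the field names, the `StudyNotes` class, and the `validate` function are assumptions made for illustration, and the example URLs are placeholders.

```python
from dataclasses import dataclass


@dataclass
class StudyNotes:
    tldr: list[str]
    glossary: dict[str, str]    # English term -> explanation
    reading_order: list[str]    # URLs in recommended order
    daily_plan: list[str]       # one entry per day
    sources: list[str]          # URLs cited in the notes


def validate(notes: StudyNotes, provided_links: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the output passes."""
    problems = []
    if len(notes.daily_plan) != 7:
        problems.append(f"daily_plan has {len(notes.daily_plan)} days, expected 7")
    if not notes.sources:
        problems.append("no sources cited")
    # Every URL the agent mentions must come from the links it was given.
    unknown = [u for u in notes.sources + notes.reading_order if u not in provided_links]
    if unknown:
        problems.append(f"links not in the provided set: {sorted(set(unknown))}")
    return problems


links = [f"https://example.com/{i}" for i in range(10)]  # placeholder URLs
notes = StudyNotes(
    tldr=["Agents combine planning, tools, and memory."],
    glossary={"Planning": "Breaking a task into steps."},
    reading_order=links[:3],
    daily_plan=[f"Day {d}: read one link" for d in range(1, 7)],  # only 6 days
    sources=[links[0]],
)
print(validate(notes, links))  # ['daily_plan has 6 days, expected 7']
```

Checking cited URLs against the provided links is a cheap guard against fabricated sources, though it does not verify that the notes reflect what those pages actually say.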
#Original (English)
Agents are revolutionizing the way we approach complex tasks, leveraging the power of large language models (LLMs) to work on our behalf and achieve remarkable results. In this guide we will dive into the fundamentals of AI agents, exploring their capabilities, design patterns, and potential applications.
#What is an Agent?

In this guide, we refer to an agent as an LLM-powered system designed to take actions and solve complex tasks autonomously. Unlike traditional LLMs, AI agents go beyond simple text generation. They are equipped with additional capabilities, including:
- Planning and reflection: AI agents can analyze a problem, break it down into steps, and adjust their approach based on new information.
- Tool access: They can interact with external tools and resources, such as databases, APIs, and software applications, to gather information and execute actions.
- Memory: AI agents can store and retrieve information, allowing them to learn from past experiences and make more informed decisions.
This lecture discusses the concept of AI agents and their significance in the realm of artificial intelligence.
#Why build with Agents?
While large language models (LLMs) excel at simple, narrow tasks like translation or email generation, they fall short when dealing with complex, broader tasks that require multiple steps, planning, and reasoning. These complex tasks often necessitate access to external tools and information beyond the LLM's knowledge base.
For example, developing a marketing strategy might involve researching competitors, analyzing market trends, and accessing company-specific data. These actions necessitate real-world information, the latest insights, and internal company data, which a standalone LLM might not have access to.
AI agents bridge this gap by combining the capabilities of LLMs with additional features such as memory, planning, and external tools.
By leveraging these abilities, AI agents can effectively tackle complex tasks like:
- Developing marketing strategies
- Planning events
- Providing customer support
#Common Use Cases for AI Agents
Here is a non-exhaustive list of common use cases where agents are being applied in the industry:
- Recommendation systems: Personalizing suggestions for products, services, or content.
- Customer support systems: Handling inquiries, resolving issues, and providing assistance.
- Research: Conducting in-depth investigations across various domains, such as legal, finance, and health.
- E-commerce applications: Facilitating online shopping experiences, managing orders, and providing personalized recommendations.
- Booking: Assisting with travel arrangements and event planning.
- Reporting: Analyzing vast amounts of data and generating comprehensive reports.
- Financial analysis: Analyzing market trends, assessing financial data, and generating reports with unprecedented speed and accuracy.