Prompt Master

Mastering the art of conversing with AI

Introduction

Understand the definition of an AI Agent, its capability boundaries, and its typical use cases

TL;DR

  • An AI Agent is a system built around an LLM that can do planning, call tools, and maintain state through memory, in order to complete multi-step tasks.
  • When a task requires a long-chain workflow like "research + compute/write code + generate a report + iterate repeatedly," an AI Agent is usually a better fit than a single LLM call.
  • The value of an AI Agent lies not just in "generating text" but in "turning strategy into executable actions."
  • The key risks are: hallucination, tool-call failures, Prompt Injection, and uncontrolled behavior caused by missing evaluation/observability.

Core Concepts

In this guide, we define an AI Agent as an LLM-driven system that can execute actions with a degree of autonomy. It typically includes:

  • Planning: break a complex task into steps and update the plan during execution (e.g., retries, rollbacks, gathering more information).
  • Tool access: call external capabilities (APIs, databases, search, code execution, etc.) to turn "ideas" into "actions."
  • Memory: store intermediate artifacts and key facts so that multi-turn, multi-step work does not lose context.

The biggest difference from a plain "chatbot" is that an AI Agent's output is not just text; it may also include tool calls, structured results, and writes to external systems.
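The three components above fit into one small loop. The sketch below is purely illustrative: `plan`, `call_tool`, and the `"search"` tool are hypothetical stand-ins, not part of any real framework.

```python
# Minimal agent loop sketch: plan -> act with tools -> remember -> repeat.
# All names here (plan, call_tool, the "search" tool) are hypothetical.

def plan(task, memory):
    """Toy planner: one step per item not yet covered by memory."""
    return [step for step in task["steps"] if step not in memory]

def call_tool(name, arg):
    """Stand-in for real tool access (API, database, code execution)."""
    tools = {"search": lambda q: f"results for {q!r}"}
    return tools[name](arg)

def run_agent(task):
    memory = {}                      # Memory: keeps intermediate results
    for step in plan(task, memory):  # Planning: recomputed each run
        memory[step] = call_tool("search", step)  # Tool access
    return memory                    # Structured result, not just text

notes = run_agent({"steps": ["competitors", "market trends"]})
```

The point of the loop is that the output (`notes`) is structured state built by actions, not a single free-form completion.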

How to Apply

To ship a usable AI Agent, follow this minimal loop:

  1. Define the task boundary: what counts as done? Which actions are allowed or forbidden? (e.g., whether sending email or writing to the production database is allowed)
  2. Design tools (interface first): wrap external capabilities as a small number of clear, verifiable tools, each with well-defined inputs and outputs.
  3. Specify the output format: require structured output wherever possible (e.g., a JSON schema or tables) to reduce the uncertainty of free-form generation.
  4. Add guardrails: put high-risk actions behind an allowlist; partition inputs to shrink the attack surface for Prompt Injection.
  5. Add evaluation and observability: at a minimum, be able to replay every tool call, key decision, failure cause, and retry path.
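Steps 2 and 4 above can be sketched together: each tool declares a verifiable interface, and a guardrail checks every requested action against an allowlist before execution. The `Tool` dataclass and the tool names are assumptions for illustration only.

```python
# Sketch of steps 2 and 4: verifiable tool interfaces plus an action allowlist.
# The Tool dataclass and tool names are illustrative, not a real framework.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str
    run: Callable[[str], str]   # One clear input, one clear output

TOOLS = {
    "search_docs": Tool("search_docs", "Read-only document search",
                        lambda q: f"docs matching {q!r}"),
    "send_email": Tool("send_email", "Writes to the outside world",
                       lambda body: "sent"),
}

# Guardrail: only read-only actions are allowed in this deployment.
ALLOWED_ACTIONS = {"search_docs"}

def execute(action: str, arg: str) -> str:
    if action not in ALLOWED_ACTIONS:
        raise PermissionError(f"action {action!r} is not on the allowlist")
    return TOOLS[action].run(arg)
```

With this setup, `execute("search_docs", "agents")` succeeds, while `execute("send_email", ...)` raises `PermissionError`, so even a prompt-injected request to send email cannot slip through the guardrail.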

A minimal reusable system prompt framework (sketch):

You are an AI Agent for <goal>.

Rules:
- Use tools when needed; do not fabricate tool results.
- If information is missing, ask clarifying questions first.
- Output must follow the specified schema.
- Do not perform disallowed actions: <deny list>.

Workflow:
1) Plan
2) Execute with tools
3) Verify
4) Summarize
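The "Output must follow the specified schema" rule is easiest to enforce outside the model: parse the agent's reply and reject anything that deviates. A minimal check along these lines (the required field names are assumptions, not a fixed schema):

```python
# Minimal schema check for agent output; the required fields are illustrative.
import json

REQUIRED_FIELDS = {"summary": str, "actions": list, "sources": list}

def validate_output(raw: str) -> dict:
    """Parse the agent's reply and verify the agreed-on structure."""
    data = json.loads(raw)  # Raises ValueError on non-JSON free text
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"missing or malformed field: {field!r}")
    return data

reply = '{"summary": "ok", "actions": ["draft"], "sources": ["doc1"]}'
result = validate_output(reply)
```

Rejecting a malformed reply and retrying is usually cheaper than trying to repair free-form output downstream.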

Self-check Rubric

You can use the rubric below to check whether an AI Agent is "usable and controllable":

  • Correctness: are key conclusions supported by tool results / sources? Is there any hallucination?
  • Task completion: are all subtasks actually completed? Are there silent skips?
  • Tool hygiene: are tools used when they should be? Is tool output ever fabricated?
  • Safety: are high-risk actions explicitly restricted? Can it resist common Prompt Injection?
  • Observability: can you replay the plan, every tool call, retries, and failure causes?
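The Observability item is mostly bookkeeping: record every tool call with its input, outcome, and retry attempt so a failed run can be replayed later. A minimal trace might look like this (the record layout is an assumption):

```python
# Minimal tool-call trace for replay; the record layout is illustrative.
import time

TRACE: list[dict] = []

def traced_call(tool, name: str, arg: str, retries: int = 1):
    """Run a tool, recording every attempt so failures can be replayed."""
    for attempt in range(1, retries + 1):
        record = {"tool": name, "arg": arg, "attempt": attempt,
                  "ts": time.time()}
        try:
            record["result"] = tool(arg)
            record["ok"] = True
            TRACE.append(record)
            return record["result"]
        except Exception as exc:
            record["ok"] = False
            record["error"] = repr(exc)
            TRACE.append(record)
    return None

traced_call(lambda q: f"found {q}", "search", "agents")
```

In production you would ship these records to a log store, but even an in-memory list is enough to answer "what did the agent actually do, and where did it fail?"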

Practice

Exercise: design an AI Agent for "organizing study materials" (no code required).

  • Goal: turn the 10 links you provide into a set of study notes and an action list.
  • Constraints: sources must be cited; when uncertain, it must explicitly say "I don't know" and ask clarifying questions.
  • Output: a TL;DR, a glossary of key concepts (terms in English), a recommended reading order, and a daily plan (7 days).

References

Original (English)

Agents are revolutionizing the way we approach complex tasks, leveraging the power of large language models (LLMs) to work on our behalf and achieve remarkable results. In this guide we will dive into the fundamentals of AI agents, exploring their capabilities, design patterns, and potential applications.

What is an Agent?

Agent Components

In this guide, we refer to an agent as an LLM-powered system designed to take actions and solve complex tasks autonomously. Unlike traditional LLMs, AI agents go beyond simple text generation. They are equipped with additional capabilities, including:

  • Planning and reflection: AI agents can analyze a problem, break it down into steps, and adjust their approach based on new information.
  • Tool access: They can interact with external tools and resources, such as databases, APIs, and software applications, to gather information and execute actions.
  • Memory: AI agents can store and retrieve information, allowing them to learn from past experiences and make more informed decisions.


Why build with Agents?

While large language models (LLMs) excel at simple, narrow tasks like translation or email generation, they fall short when dealing with complex, broader tasks that require multiple steps, planning, and reasoning. These complex tasks often necessitate access to external tools and information beyond the LLM's knowledge base.

For example, developing a marketing strategy might involve researching competitors, analyzing market trends, and accessing company-specific data. These actions necessitate real-world information, the latest insights, and internal company data, which a standalone LLM might not have access to.

AI agents bridge this gap by combining the capabilities of LLMs with additional features such as memory, planning, and external tools.

By leveraging these abilities, AI agents can effectively tackle complex tasks like:

  • Developing marketing strategies
  • Planning events
  • Providing customer support

Common Use Cases for AI Agents

Here is a non-exhaustive list of common use cases where agents are being applied in the industry:

  • Recommendation systems: Personalizing suggestions for products, services, or content.
  • Customer support systems: Handling inquiries, resolving issues, and providing assistance.
  • Research: Conducting in-depth investigations across various domains, such as legal, finance, and health.
  • E-commerce applications: Facilitating online shopping experiences, managing orders, and providing personalized recommendations.
  • Booking: Assisting with travel arrangements and event planning.
  • Reporting: Analyzing vast amounts of data and generating comprehensive reports.
  • Financial analysis: Analyzing market trends, assessing financial data, and generating reports with speed and accuracy.


❓ FAQ

The most frequently searched questions on this chapter's topic, with answers below.

What exactly is the difference between an AI Agent and an ordinary chatbot?

The definition this chapter gives: an agent is an LLM-driven system that can execute actions with a degree of autonomy, built on the trio of planning, tool access, and memory. A chatbot's output is only text; an agent's output may include tool calls, structured results, and even writes to external systems (sending email, modifying a database). The difference is not "whether it can chat" but "whether it can act."

Which tasks deserve an agent rather than a single LLM call?

As this chapter puts it: when a task requires a long-chain workflow like "research + compute/write code + generate a report + iterate repeatedly," an agent is a better fit than a single call. Conversely, translation, email drafts, and single-step classification do not need an agent; a single call is faster and cheaper. The criteria: the number of steps, whether external information is needed, and whether the plan must adapt to intermediate results.

What is the minimal loop for shipping a usable agent?

The five steps this chapter gives: 1) define the task boundary (what counts as done? which actions are forbidden?); 2) design tools (few, clear, verifiable); 3) specify a structured output format (JSON schema or tables); 4) add guardrails (an allowlist for high-risk actions plus input partitioning against prompt injection); 5) add evaluation and observability (tool-call replay, key-decision logs, failure causes). Skip any one of them and production will bite you.

What are the most common failure modes for agents?

This chapter lists four: hallucination (fabricating facts or tool results), tool-call failures (tool unavailable, wrong parameters, timeouts), prompt injection (instructions hijacked by external content), and uncontrolled behavior from missing evaluation/observability. The first three are direct faults; the fourth is scarier: you do not even know something went wrong, let alone how to fix it.

How do I judge whether my agent is "usable and controllable"?

This chapter gives a five-dimension self-check rubric: Correctness (are conclusions backed by tool results / sources? any hallucination?), Task completion (are all subtasks done? any silent skips?), Tool hygiene (were tools used when needed? any fabricated tool output?), Safety (are high-risk actions restricted? can it resist prompt injection?), Observability (can you replay plans, tool calls, and retries?). If any one of them fails, do not ship to production.