Introduction
AI Agent definition, capability boundaries, and typical use cases
TL;DR
- An AI Agent is a system with an LLM at its core that can plan, call tools, and maintain state through memory to complete multi-step tasks.
- When a task requires "research + compute/code + generate a report + iterate" -- that kind of long-chain workflow -- an AI Agent usually beats a single LLM call.
- The value of an AI Agent isn't just "generating text." It's turning strategy into executable actions.
- Key risks: hallucination, tool call failures, Prompt Injection, and uncontrollable behavior from missing evaluation/observability.
Core Concepts
In this track, we define an AI Agent as: an LLM-powered system that can execute actions with a degree of autonomy. It typically has:
- Planning: Break complex tasks into steps, update the plan during execution (retry, rollback, gather more info).
- Tool access: Call external capabilities (APIs, databases, search, code execution) to turn "ideas" into "actions."
- Memory: Save intermediate results and key facts so multi-turn, multi-step work doesn't lose context.
The biggest difference from a plain "chatbot": an AI Agent's output isn't just text -- it can include tool calls, structured results, and writes to external systems.
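To make the three parts concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a reference implementation: `fake_llm_plan` is a hypothetical stand-in for a real LLM planner, `add` is a hypothetical tool, and the `"$name"` reference syntax is invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable


# Hypothetical tool: a stand-in for a real API, database, or search call.
def add(a: float, b: float) -> float:
    return a + b


TOOLS: dict[str, Callable[..., float]] = {"add": add}


@dataclass
class Memory:
    """Stores intermediate results so later steps can reuse them."""
    facts: dict[str, float] = field(default_factory=dict)


def fake_llm_plan(task: str) -> list[dict]:
    """Stand-in for the LLM planner (assumption): returns a fixed plan."""
    return [
        {"tool": "add", "args": {"a": 2, "b": 3}, "save_as": "subtotal"},
        {"tool": "add", "args": {"a": "$subtotal", "b": 10}, "save_as": "total"},
    ]


def resolve(value, memory: Memory):
    """Replace "$name" references with values saved in memory."""
    if isinstance(value, str) and value.startswith("$"):
        return memory.facts[value[1:]]
    return value


def run_agent(task: str) -> dict:
    memory = Memory()
    for step in fake_llm_plan(task):                 # Planning
        args = {k: resolve(v, memory) for k, v in step["args"].items()}
        result = TOOLS[step["tool"]](**args)         # Tool access
        memory.facts[step["save_as"]] = result       # Memory
    # The output is a structured result, not just free-form text.
    return {"task": task, "answer": memory.facts["total"], "facts": memory.facts}


result = run_agent("add 2 and 3, then add 10")
```

The second step reads `subtotal` from memory rather than recomputing it, which is the role memory plays in multi-step work.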
How to Apply
To get a working AI Agent off the ground, follow this minimal loop:
- Define task boundaries: What counts as "done"? What actions are allowed/forbidden? (e.g., can it send emails? Can it write to a production database?)
- Design tools (interface-first): Build external capabilities as a small number of clear, verifiable tools. Each tool has well-defined inputs and outputs.
- Lock down output format: Require structured output (JSON schema, tables) whenever possible. Less free-form text means less uncertainty.
- Add guardrails: Use allowlists for high-risk actions. Partition inputs to reduce the blast radius of Prompt Injection.
- Add evaluation and observability: At minimum, you should be able to replay every tool call, key decision, failure reason, and retry path.
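The tool, guardrail, and observability steps above can be sketched together. This is one possible shape, assuming a single in-process registry; the tool names `search_notes` and `send_email`, the `ALLOWED_TOOLS` set, and the `TRACE` record format are all hypothetical.

```python
import json
from typing import Any, Callable

ALLOWED_TOOLS = {"search_notes"}      # allowlist: everything else is denied
TRACE: list[dict[str, Any]] = []      # replayable record of every call


def search_notes(query: str) -> list[str]:
    """Hypothetical read-only tool with a clear input and output."""
    notes = ["agents plan steps", "tools turn ideas into actions"]
    return [n for n in notes if query.lower() in n]


def send_email(to: str, body: str) -> str:
    """Hypothetical high-risk tool: registered, but not on the allowlist."""
    return f"sent to {to}"


REGISTRY: dict[str, Callable[..., Any]] = {
    "search_notes": search_notes,
    "send_email": send_email,
}


def call_tool(name: str, **kwargs: Any) -> dict[str, Any]:
    """Run a tool if it is allowed, and log the call either way."""
    record: dict[str, Any] = {"tool": name, "args": kwargs}
    if name not in ALLOWED_TOOLS:
        record.update(status="denied", error="tool not on allowlist")
    else:
        try:
            record.update(status="ok", result=REGISTRY[name](**kwargs))
        except Exception as exc:  # keep the failure reason for replay
            record.update(status="error", error=str(exc))
    TRACE.append(record)
    return record


ok = call_tool("search_notes", query="tools")
denied = call_tool("send_email", to="a@example.com", body="hi")
print(json.dumps(TRACE, indent=2))
```

Routing every call through one function means denied and failed calls land in the same trace as successful ones, so a run can be replayed end to end.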
A minimal reusable system prompt framework (sketch):
You are an AI Agent for <goal>.
Rules:
- Use tools when needed; do not fabricate tool results.
- If information is missing, ask clarifying questions first.
- Output must follow the specified schema.
- Do not perform disallowed actions: <deny list>.
Workflow:
1) Plan
2) Execute with tools
3) Verify
4) Summarize
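The rule "Output must follow the specified schema" only helps if something checks it. A minimal sketch of such a check follows, using only the standard library; the `SCHEMA` fields (`plan`, `answer`, `sources`) are assumptions chosen for illustration, and a real project might prefer a full JSON Schema validator.

```python
import json

# Hypothetical schema: required field name -> expected Python type.
SCHEMA = {"plan": list, "answer": str, "sources": list}


def parse_agent_output(raw: str) -> dict:
    """Parse model output as JSON and check it against SCHEMA.

    Raises ValueError with a reason, so the caller can retry or ask again.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level value must be an object")
    for key, expected in SCHEMA.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        if not isinstance(data[key], expected):
            raise ValueError(f"field {key} must be {expected.__name__}")
    return data


good = parse_agent_output('{"plan": ["search"], "answer": "42", "sources": ["doc1"]}')

try:
    parse_agent_output('{"answer": "42"}')
except ValueError as err:
    rejected = str(err)
```

Rejecting malformed output with a specific reason gives the agent loop something concrete to feed back into a retry.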
Self-check Rubric
Use this rubric to check whether an AI Agent is "usable and controllable":
- Correctness: Can key conclusions be backed by tool results / sources? Any hallucination?
- Task completion: Did it actually finish all subtasks? Any silent skips?
- Tool hygiene: Did it use tools when it should have? Did it fabricate tool output?
- Safety: Are high-risk actions explicitly restricted? Can it resist common Prompt Injection?
- Observability: Can you replay the plan, every tool call, retries, and failure reasons?
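Parts of this rubric can be checked mechanically if each run leaves a trace. The sketch below assumes a hypothetical trace format (one record per tool call with `subtask`, `status`, `source`, and an optional `error`); it covers only the checks that can be read off such a trace, and leaves safety testing aside.

```python
def check_run(plan: list[str], trace: list[dict], cited: list[str]) -> dict[str, bool]:
    """Score one run against parts of the rubric.

    plan:  subtask names the agent committed to
    trace: one record per tool call (hypothetical format, see lead-in)
    cited: sources named in the final answer
    """
    done = {r["subtask"] for r in trace if r["status"] == "ok"}
    seen_sources = {r["source"] for r in trace if r["status"] == "ok"}
    return {
        # Task completion: no planned subtask was silently skipped.
        "task_completion": all(step in done for step in plan),
        # Correctness / tool hygiene: every cited source came from a logged call.
        "sources_backed": all(src in seen_sources for src in cited),
        # Observability: every failed call recorded a reason.
        "failures_explained": all(
            bool(r.get("error")) for r in trace if r["status"] != "ok"
        ),
    }


trace = [
    {"subtask": "research", "status": "ok", "source": "doc1"},
    {"subtask": "compute", "status": "error", "source": None, "error": "timeout"},
]
report = check_run(["research", "compute"], trace, cited=["doc1", "doc2"])
```

In this example the run fails two checks: the `compute` subtask never succeeded, and `doc2` is cited without any tool call that produced it.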
Practice
Exercise: Design an AI Agent that acts as a "learning material organizer" (no code required).
- Goal: Organize 10 links you provide into a study note and action checklist.
- Constraints: Must cite sources. When uncertain, say "I don't know" and ask clarifying questions.
- Output: TL;DR, key concept glossary (terms in English), recommended reading order, daily plan (7 days).
References
Original (English)
Agents are revolutionizing the way we approach complex tasks, leveraging the power of large language models (LLMs) to work on our behalf and achieve remarkable results. In this guide we will dive into the fundamentals of AI agents, exploring their capabilities, design patterns, and potential applications.
What is an Agent?

In this guide, we refer to an agent as an LLM-powered system designed to take actions and solve complex tasks autonomously. Unlike traditional LLMs, AI agents go beyond simple text generation. They are equipped with additional capabilities, including:
- Planning and reflection: AI agents can analyze a problem, break it down into steps, and adjust their approach based on new information.
- Tool access: They can interact with external tools and resources, such as databases, APIs, and software applications, to gather information and execute actions.
- Memory: AI agents can store and retrieve information, allowing them to learn from past experiences and make more informed decisions.
This lecture discusses the concept of AI agents and their significance in the realm of artificial intelligence.
Why build with Agents?
While large language models (LLMs) excel at simple, narrow tasks like translation or email generation, they fall short when dealing with complex, broader tasks that require multiple steps, planning, and reasoning. These complex tasks often necessitate access to external tools and information beyond the LLM's knowledge base.
For example, developing a marketing strategy might involve researching competitors, analyzing market trends, and accessing company-specific data. These actions necessitate real-world information, the latest insights, and internal company data, which a standalone LLM might not have access to.
AI agents bridge this gap by combining the capabilities of LLMs with additional features such as memory, planning, and external tools.
By leveraging these abilities, AI agents can effectively tackle complex tasks like:
- Developing marketing strategies
- Planning events
- Providing customer support
Common Use Cases for AI Agents
Here is a non-exhaustive list of common use cases where agents are being applied in the industry:
- Recommendation systems: Personalizing suggestions for products, services, or content.
- Customer support systems: Handling inquiries, resolving issues, and providing assistance.
- Research: Conducting in-depth investigations across various domains, such as legal, finance, and health.
- E-commerce applications: Facilitating online shopping experiences, managing orders, and providing personalized recommendations.
- Booking: Assisting with travel arrangements and event planning.
- Reporting: Analyzing vast amounts of data and generating comprehensive reports.
- Financial analysis: Analyzing market trends, assessing financial data, and generating reports with unprecedented speed and accuracy.