LLM From Scratch: Breaking Down LLM Principles and the Transformer in 60 Minutes | 匠人学院
"LLM From Scratch: Decode Every Layer of a Large Language Model in 60 Minutes" is an online open session for engineers, developers, and AI and deep learning learners who want to move beyond surface-level use of large language models. It is not a general AI introduction, and it is not an application workshop on writing prompts, using RAG, or building AI Agents. It is a technical teardown of the model's underlying structure, built around one question: we use ChatGPT, Claude, Gemini, and other LLMs every day, but how does the model actually decide which token should come next? If "an LLM is just a probability model" does not satisfy you, and you want to follow the training logic, the computation, and the model structure behind it, this session is for you.
In the 60-minute livestream, the instructor walks through the training stages that take a model from raw data to something that can hold a conversation. The point is that an LLM does not "read" the way a person does: it learns language patterns, predictive relationships, and contextual structure from large amounts of data, and it produces output through computation.
A major focus of the session is the Transformer, the core architecture behind modern LLMs. The instructor explains why the Transformer matters, what the Attention mechanism is actually calculating, and why Attention is a key source of capability in modern LLMs.
A highlight of the event is the live, code-only demo. The code is not generated with Cursor or Copilot, and no prebuilt tools or frameworks are used to hide the complexity. The demo shows, from scratch, how an LLM can be implemented, which helps learners connect abstract concepts such as the forward pass, token prediction, and model layers to real implementation logic.
The session is best suited to people with a Python background and a basic grasp of deep learning concepts such as gradient descent and neural networks. It also suits software engineers, AI engineers, and technically curious learners who already use ChatGPT, Claude, or LLM APIs and want to go on to read source code and papers, understand model fine-tuning, and learn the low-level implementation.
The instructor, Julie, has more than 10 years of front-line Global Support experience at Cisco, holds dual CCIE certifications in Security and Networking, and has won Site First Place in Cisco's internal AI competitions multiple times, with projects covering RAG, AI Agents, and LLM adversarial attack research.
Participants are not promised that they will build the next ChatGPT by the end of the session. They will leave with a solid starting point: an understanding of the forward pass, the layer structure of the Transformer, and what Attention means, plus the foundation needed to continue with model fine-tuning, source code reading, LLM papers, and AI engineering practice.
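As a rough illustration of what "Attention" computes, the following is a minimal sketch of single-head scaled dot-product attention with a causal mask, written in plain Python. It is not the session's demo code. The function names, the toy input vectors, and the choice to reuse the same vectors as queries, keys, and values are all assumptions made for illustration; a real Transformer layer derives queries, keys, and values from learned projection matrices.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask.

    Each argument is a list of vectors, one per token position.
    Position i may only attend to positions 0..i.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        # Similarity of this query to every key it is allowed to see.
        scores = [
            sum(qx * kx for qx, kx in zip(q, keys[j])) / math.sqrt(d_k)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Output is the weighted average of the visible value vectors.
        out = [
            sum(w * values[j][d] for j, w in enumerate(weights))
            for d in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical toy vectors for three token positions.
# Assumption: the same vectors serve as queries, keys, and values.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_attention(x, x, x)

for i, w in enumerate(weights):
    print(f"position {i} attends with weights {[round(v, 3) for v in w]}")
```

Each row of weights sums to 1, and the first position can only attend to itself. In a full model, outputs like these pass through further layers, and a final softmax over the vocabulary gives the next-token probabilities.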
Published: 2026/6/12
This video is provided by 匠人学院. It covers IT and technology topics to help you learn systematically and build your skills.
