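Transformers Library Guide

transformers became the default library not because it has the most features, but because it unifies downloading a model, loading a tokenizer, running inference, and training behind one fairly stable set of interfaces.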

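# Core Design Philosophy

The Transformers library is built around three core objects:

Tokenizer: converts text into the numbers the model consumes (input IDs).
Model: the neural network architecture together with its weights.
Configuration: stores the architecture's hyperparameters, such as layer count and hidden size.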
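
To make the split between Configuration and Model concrete, here is a minimal offline sketch. The tiny hyperparameters and the hand-written input_ids are assumptions chosen so that nothing is downloaded; in real use the IDs come from a tokenizer and the weights from a pretrained checkpoint.

```python
import torch
from transformers import BertConfig, BertModel

# Configuration: architecture hyperparameters only, no weights.
# These tiny values are illustrative assumptions, not a real checkpoint.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)

# Model: the architecture described by the config, here with random weights.
model = BertModel(config)
model.eval()

# Stand-in for tokenizer output: one sequence of four token IDs.
input_ids = torch.tensor([[1, 5, 7, 2]])

with torch.no_grad():
    outputs = model(input_ids=input_ids)

# One hidden vector per token: (batch, sequence length, hidden size).
hidden_shape = tuple(outputs.last_hidden_state.shape)
print(hidden_shape)
```

Changing hidden_size or num_hidden_layers in the config changes the network that gets built, which is why a saved checkpoint always ships its configuration next to its weights.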

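# Minimal Usage: Pipeline API

pipeline() is the simplest and most direct tool in the library, well suited to quick deployment and testing.

It is not a one-size-fits-all solution, though. It works well for prototyping, but once you need fine-grained control, batch inference, or GPU memory optimization, you will usually fall back to the Auto Classes.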

```python
from transformers import pipeline

# 1. Sentiment analysis (no model is named, so the task's default checkpoint is downloaded)
classifier = pipeline("sentiment-analysis")
print(classifier("We are very happy to show you the 🤗 Transformers library."))

# 2. Text generation
generator = pipeline("text-generation", model="gpt2")
print(generator("In this course, we will teach you how to", max_length=30))

# 3. Named entity recognition (NER)
# aggregation_strategy="simple" merges sub-word tokens into whole entities;
# it replaces the deprecated grouped_entities=True.
ner = pipeline("ner", aggregation_strategy="simple")
print(ner("My name is Sylvain and I work at Hugging Face in Brooklyn."))
```
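
pipeline() also accepts an already-constructed model and tokenizer plus a batch_size, which is a first step toward the finer control mentioned above. The sketch below stays offline by assuming a toy word-level vocabulary and a randomly initialized BertForSequenceClassification, so the predicted labels are meaningless; only the call pattern and the output format matter.

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace
from transformers import (
    BertConfig,
    BertForSequenceClassification,
    PreTrainedTokenizerFast,
    pipeline,
)

# Toy vocabulary (an assumption for illustration, not a real checkpoint).
vocab = {"[PAD]": 0, "[UNK]": 1, "good": 2, "bad": 3, "movie": 4}
backend = Tokenizer(WordLevel(vocab, unk_token="[UNK]"))
backend.pre_tokenizer = Whitespace()
tokenizer = PreTrainedTokenizerFast(
    tokenizer_object=backend, unk_token="[UNK]", pad_token="[PAD]"
)

# Tiny classifier with random weights; label names are assumed.
config = BertConfig(
    vocab_size=len(vocab),
    hidden_size=16,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=32,
    id2label={0: "NEGATIVE", 1: "POSITIVE"},
    label2id={"NEGATIVE": 0, "POSITIVE": 1},
)
model = BertForSequenceClassification(config)

# Hand the objects to pipeline() instead of a checkpoint name.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# A list of inputs is processed in batches of batch_size.
results = classifier(["good movie", "bad movie"], batch_size=2)
for result in results:
    print(result["label"], result["score"])
```

With a real checkpoint, the same call pattern applies after loading the model and tokenizer with from_pretrained.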

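# Standard Usage: Auto Classes

When you need more control, use AutoTokenizer and the AutoModel family of classes. Given a model name or local path, they read the checkpoint's configuration and instantiate the matching architecture automatically.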

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# Load the model together with its classification head
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

sequences = ["I've been waiting for a HuggingFace course my whole life.", "So hate this!"]

# Encode the inputs as padded, truncated PyTorch tensors
inputs = tokenizer(sequences, padding=True, truncation=True, return_tensors="pt")

# Forward pass without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Convert the raw logits into per-class probabilities
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(predictions)
```
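
The probabilities printed above still have to be mapped to label names. A minimal sketch of that last step follows, assuming made-up logits and an assumed id2label mapping; for a real model, read the mapping from model.config.id2label.

```python
import torch

# Hypothetical logits for two sentences and two classes; real values come
# from outputs.logits in the example above.
logits = torch.tensor([[-1.5, 2.0], [3.0, -2.5]])

# Assumed mapping for illustration; use model.config.id2label in practice.
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

# Softmax turns each row of logits into probabilities that sum to 1.
probabilities = torch.softmax(logits, dim=-1)

# The predicted class is the index with the highest probability.
predicted_ids = probabilities.argmax(dim=-1).tolist()
labels = [id2label[i] for i in predicted_ids]

for label, row in zip(labels, probabilities):
    print(label, round(row.max().item(), 4))
```

This is roughly the post-processing that pipeline("sentiment-analysis") performs for you.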

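# Common Model Suffixes

Choose the AutoModelFor... class that matches your task:

AutoModelForCausalLM: causal language models (e.g. GPT-2, Llama).
AutoModelForMaskedLM: masked language models (e.g. BERT, RoBERTa).
AutoModelForSequenceClassification: sequence classification.
AutoModelForTokenClassification: token classification (e.g. NER).
AutoModelForQuestionAnswering: extractive question answering.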
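
The suffix selects the task head, while the configuration selects the architecture. The sketch below illustrates this offline with from_config and tiny, randomly initialized configurations; the hyperparameter values are assumptions made only to keep the example small.

```python
from transformers import (
    AutoModelForCausalLM,
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
    BertConfig,
    GPT2Config,
)

# Tiny configurations (illustrative assumptions, random weights).
bert_config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=3,
)
gpt2_config = GPT2Config(
    vocab_size=100, n_positions=32, n_embd=32, n_layer=1, n_head=2
)

# Each Auto class resolves (task, config type) to a concrete model class.
models = {
    "causal_lm": AutoModelForCausalLM.from_config(gpt2_config),
    "masked_lm": AutoModelForMaskedLM.from_config(bert_config),
    "classification": AutoModelForSequenceClassification.from_config(bert_config),
}

class_names = {task: type(model).__name__ for task, model in models.items()}
print(class_names)
```

Note that from_config builds the architecture with random weights; from_pretrained is what loads trained weights.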

# Saving and Loading Models

```python
# Save locally (tokenizer and model come from the Auto Classes example)
tokenizer.save_pretrained("./my_saved_model")
model.save_pretrained("./my_saved_model")

# Load from the local directory
tokenizer = AutoTokenizer.from_pretrained("./my_saved_model")
model = AutoModelForSequenceClassification.from_pretrained("./my_saved_model")
```
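
To check that a save and load round trip preserves the weights, the following sketch saves a tiny, randomly initialized model to a temporary directory and reloads it. The configuration values are assumptions so the example runs without downloading anything.

```python
import os
import tempfile

import torch
from transformers import AutoModelForSequenceClassification, BertConfig

# Tiny model with random weights (illustrative assumption).
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=2,
)
model = AutoModelForSequenceClassification.from_config(config)

with tempfile.TemporaryDirectory() as save_dir:
    # Writes config.json plus a weights file into save_dir.
    model.save_pretrained(save_dir)
    saved_files = sorted(os.listdir(save_dir))
    # The Auto class reads config.json to pick the architecture again.
    reloaded = AutoModelForSequenceClassification.from_pretrained(save_dir)

# Compare every tensor in the two state dicts.
original_state = model.state_dict()
reloaded_state = reloaded.state_dict()
weights_match = all(
    torch.equal(original_state[name], reloaded_state[name])
    for name in original_state
)
print(saved_files, weights_match)
```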

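# Advanced: Trainer API

Trainer is a full-featured training loop implementation that hides most of the PyTorch boilerplate, including batching, optimization, evaluation, and checkpointing.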

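```python
from transformers import Trainer, TrainingArguments

# model and tokenizer come from the Auto Classes example; train_dataset and
# eval_dataset are assumed to be tokenized datasets prepared beforehand.
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    eval_strategy="epoch",  # named evaluation_strategy in older releases
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,  # recent releases prefer processing_class=tokenizer
)

trainer.train()
```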
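
Evaluation reports the loss by default; task metrics such as accuracy come from an optional compute_metrics callable passed to Trainer. The sketch below is one assumed way to compute accuracy. It relies only on NumPy, and the logits and labels at the bottom are made up to exercise the function.

```python
import numpy as np


def compute_metrics(eval_pred):
    """Return accuracy for a (logits, labels) pair as Trainer provides it."""
    logits, labels = eval_pred
    # The predicted class is the index of the largest logit in each row.
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": float((predictions == labels).mean())}


# Made-up evaluation batch: four examples, two classes.
logits = np.array([[0.1, 0.9], [2.0, -1.0], [0.3, 0.2], [-0.5, 0.5]])
labels = np.array([1, 0, 1, 1])

metrics = compute_metrics((logits, labels))
print(metrics)
```

To use it, pass compute_metrics=compute_metrics when constructing the Trainer; the returned values are then logged at each evaluation.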

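# Performance Optimization

Quantization: load models in 4-bit or 8-bit precision with the bitsandbytes library to cut GPU memory use.
Flash Attention: pass attn_implementation="flash_attention_2" to speed up attention on long sequences; this requires the flash-attn package, a supported GPU, and half-precision weights.
Device Map: pass device_map="auto" to split the model across the available GPUs, spilling to CPU memory if needed; this relies on the accelerate package.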
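
The three options above are all passed to from_pretrained. The sketch below collects them into keyword arguments; build_load_kwargs and load_model are hypothetical helper names, the NF4 and bfloat16 settings are assumed defaults rather than recommendations, and actually loading a model this way needs a GPU plus the optional packages involved (bitsandbytes, flash-attn, accelerate).

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig


def build_load_kwargs(use_4bit=True, use_flash_attention=False):
    """Assemble from_pretrained keyword arguments (hypothetical helper)."""
    # device_map="auto" lets the library place layers on available devices.
    kwargs = {"device_map": "auto"}
    if use_4bit:
        # 4-bit quantization through bitsandbytes (assumed settings).
        kwargs["quantization_config"] = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.bfloat16,
        )
    if use_flash_attention:
        kwargs["attn_implementation"] = "flash_attention_2"
        # Flash Attention expects half precision; newer releases also
        # accept this argument under the name dtype.
        kwargs["torch_dtype"] = torch.bfloat16
    return kwargs


def load_model(checkpoint, **options):
    """Load a causal LM with the selected optimizations (not called here)."""
    return AutoModelForCausalLM.from_pretrained(
        checkpoint, **build_load_kwargs(**options)
    )


# Inspect the arguments without loading anything.
example_kwargs = build_load_kwargs(use_4bit=False, use_flash_attention=True)
print(sorted(example_kwargs))
```

Keeping the options in one place makes it easy to switch them off when a machine lacks the required GPU or package.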

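By the time you reach this stage, many problems are no longer about how to use the library but about whether your machine can handle the load. Do not blame every performance problem on Transformers itself; hardware and model size shape the experience just as much.

Note: the Transformers library supports three deep learning frameworks: PyTorch, TensorFlow, and JAX (via Flax). Newer releases concentrate on PyTorch, so check the documentation for your installed version.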
