Prompt Master


Master the art of conversing with AI

Model Collection

Model overview and selection advice

TL;DR

  • This page is a foundational index of LLMs: it helps you quickly build a mental map of which models have appeared and roughly which category each belongs to, covering GPT-5.1 / GPT-4.1 / o1, Claude 4.5, Gemini 3, Llama 3.1, Grok-2, and more.
  • When picking a model, don't look only at parameter counts or leaderboards: latency, cost, context length, tool support (e.g. Tool Calling), and your own evaluation results matter more.
  • Real projects typically pair a "capability model + fast model": a stronger model for complex planning/reasoning, and a faster, cheaper model for routine steps and batch processing (see the sketch below).
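A minimal sketch of that two-tier split. The model IDs and the `complete()` helper are hypothetical placeholders, not a real client; wire in your provider's actual chat-completion call:

```python
# Hypothetical two-tier routing: one strong "capability" model plans,
# a fast/cheap model executes the routine steps.

CAPABILITY_MODEL = "gpt-5.1"   # assumed ID for the strong tier
FAST_MODEL = "gpt-4o-mini"     # assumed ID for the cheap/fast tier

def complete(model: str, prompt: str) -> str:
    """Stand-in for a real chat-completion API call."""
    raise NotImplementedError("wire up your provider's client here")

def run_task(task: str, steps: list[str]) -> list[str]:
    # One expensive call produces the plan; each step runs on the fast tier.
    plan = complete(CAPABILITY_MODEL, f"Make a short plan for: {task}")
    return [
        complete(FAST_MODEL, f"Plan:\n{plan}\n\nExecute this step: {step}")
        for step in steps
    ]
```

The design point is cost shape: planning happens once per task on the expensive model, while the per-step (and per-batch-item) volume lands on the cheap one.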

2024-2025 Practical Quick Reference

| Vendor | Capability tier (multimodal/tools) | Fast/cheap tier | Notes |
| --- | --- | --- | --- |
| OpenAI | ChatGPT 5.1 (GPT-5 series) / GPT-4.1 | GPT-4o-mini | Excellent vision/tools; Mini suits batch processing and automation |
| OpenAI (reasoning) | o1 | o1-mini | Stronger long-chain reasoning/planning, at higher cost |
| Anthropic | Claude 4.5 Sonnet | Claude 3.5/4.5 Haiku | Strong on long documents/tables, high safety |
| Google | Gemini 3 Pro | Gemini 3 Flash / Flash-Lite | 1M-token context, multimodal/tools; Flash series is fast |
| Meta (open source) | Llama 3/3.1 70B | Llama 3.1 8B | Good for self-hosted deployment, rich ecosystem |
| Mistral | Mistral Large | Mistral Small | Cost-effective models with strong multilingual performance |
| xAI | Grok-2 | Grok-2 mini | Worth considering for freshness-sensitive / web-connected scenarios |

Selection advice: get it working with a small model first → raise quality with a larger model → run an eval regression (a minimal harness is sketched below). For multimodal/screenshot tasks, prefer ChatGPT 5.1 / GPT-4o / Gemini 3; for long documents/tables, prefer Claude 4.5 or Gemini 3 Pro.
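To make the "eval regression" step concrete, here is a minimal sketch: a small labeled set, exact-match scoring, and a pass-rate comparison between two candidates. The `ask()` helper and model IDs are hypothetical placeholders:

```python
# Minimal eval-regression sketch: score candidate models on the same
# small labeled set and compare pass rates before switching models.

EVAL_SET = [
    {"prompt": "2+2=?", "expected": "4"},
    {"prompt": "Capital of France?", "expected": "Paris"},
    # ... 10-50 items is enough to start
]

def ask(model: str, prompt: str) -> str:
    """Stand-in for a real chat-completion API call."""
    raise NotImplementedError("wire up your provider's client here")

def pass_rate(model: str) -> float:
    # Exact-match containment scoring; swap in a fuzzier or
    # model-graded scorer for open-ended tasks.
    hits = sum(
        item["expected"].lower() in ask(model, item["prompt"]).lower()
        for item in EVAL_SET
    )
    return hits / len(EVAL_SET)

for model in ("small-model", "capability-model"):  # placeholder IDs
    print(model, f"{pass_rate(model):.0%}")
```

Run the same set every time you change the model or the prompt, so a quality drop shows up as a number rather than a vibe.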

Last updated: 2025-02

Reader's Guide

This list leans toward a "foundational + notable" historical overview and does not claim to cover every recent model or version. Treat it as:

  • For lookback: when a paper or blog mentions a model, quickly place its source and era
  • For selection: build a candidate pool (candidate set) before running your evaluation

If you are doing engineering model selection, answer at least these questions:

  1. Is the task closer to chat, coding, reasoning, or RAG?
  2. Do you need long context? How long? (context length)
  3. Do you need Tool Calling? Do you need structured output (JSON schema)? (see the sketch after this list)
  4. Can you run a small evaluation set as a regression (10-50 items is enough to start)?
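For question 3, a minimal stdlib-only sketch of validating structured output: ask for JSON matching a small expected shape and check the reply before using it. The `ask()` helper is a hypothetical placeholder (a real project might use jsonschema or pydantic instead of the hand-rolled check):

```python
import json

# Required keys and their types for the reply we expect.
REQUIRED = {"title": str, "year": int}

def ask(model: str, prompt: str) -> str:
    """Stand-in for a real chat-completion API call."""
    raise NotImplementedError("wire up your provider's client here")

def get_structured(model: str, text: str) -> dict:
    prompt = (
        "Extract the paper title and year from the text below. "
        'Reply with JSON only, like {"title": "...", "year": 2023}.\n\n'
        + text
    )
    reply = ask(model, prompt)
    data = json.loads(reply)  # raises a ValueError on non-JSON replies
    for key, typ in REQUIRED.items():
        if not isinstance(data.get(key), typ):
            raise ValueError(f"bad or missing field: {key}")
    return data
```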

Data adapted from Papers with Code and Zhao et al. (2023).

Models

| Model | Release Date | Description |
| --- | --- | --- |
| BERT | 2018 | Bidirectional Encoder Representations from Transformers |
| GPT | 2018 | Improving Language Understanding by Generative Pre-Training |
| RoBERTa | 2019 | A Robustly Optimized BERT Pretraining Approach |
| GPT-2 | 2019 | Language Models are Unsupervised Multitask Learners |
| T5 | 2019 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer |
| BART | 2019 | Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension |
| ALBERT | 2019 | A Lite BERT for Self-supervised Learning of Language Representations |
| XLNet | 2019 | Generalized Autoregressive Pretraining for Language Understanding |
| CTRL | 2019 | CTRL: A Conditional Transformer Language Model for Controllable Generation |
| ERNIE | 2019 | ERNIE: Enhanced Representation through Knowledge Integration |
| GShard | 2020 | GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding |
| GPT-3 | 2020 | Language Models are Few-Shot Learners |
| LaMDA | 2021 | LaMDA: Language Models for Dialog Applications |
| PanGu-α | 2021 | PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation |
| mT5 | 2021 | mT5: A massively multilingual pre-trained text-to-text transformer |
| CPM-2 | 2021 | CPM-2: Large-scale Cost-effective Pre-trained Language Models |
| T0 | 2021 | Multitask Prompted Training Enables Zero-Shot Task Generalization |
| HyperCLOVA | 2021 | What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers |
| Codex | 2021 | Evaluating Large Language Models Trained on Code |
| ERNIE 3.0 | 2021 | ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation |
| Jurassic-1 | 2021 | Jurassic-1: Technical Details and Evaluation |
| FLAN | 2021 | Finetuned Language Models Are Zero-Shot Learners |
| MT-NLG | 2021 | Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model |
| Yuan 1.0 | 2021 | Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning |
| WebGPT | 2021 | WebGPT: Browser-assisted question-answering with human feedback |
| Gopher | 2021 | Scaling Language Models: Methods, Analysis & Insights from Training Gopher |
| ERNIE 3.0 Titan | 2021 | ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation |
| GLaM | 2021 | GLaM: Efficient Scaling of Language Models with Mixture-of-Experts |
| InstructGPT | 2022 | Training language models to follow instructions with human feedback |
| GPT-NeoX-20B | 2022 | GPT-NeoX-20B: An Open-Source Autoregressive Language Model |
| AlphaCode | 2022 | Competition-Level Code Generation with AlphaCode |
| CodeGen | 2022 | CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis |
| Chinchilla | 2022 | Shows that for a fixed compute budget, the best performance is achieved not by the largest models but by smaller models trained on more data |
| Tk-Instruct | 2022 | Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks |
| UL2 | 2022 | UL2: Unifying Language Learning Paradigms |
| PaLM | 2022 | PaLM: Scaling Language Modeling with Pathways |
| OPT | 2022 | OPT: Open Pre-trained Transformer Language Models |
| BLOOM | 2022 | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model |
| GLM-130B | 2022 | GLM-130B: An Open Bilingual Pre-trained Model |
| AlexaTM | 2022 | AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model |
| Flan-T5 | 2022 | Scaling Instruction-Finetuned Language Models |
| Sparrow | 2022 | Improving alignment of dialogue agents via targeted human judgements |
| U-PaLM | 2022 | Transcending Scaling Laws with 0.1% Extra Compute |
| mT0 | 2022 | Crosslingual Generalization through Multitask Finetuning |
| Galactica | 2022 | Galactica: A Large Language Model for Science |
| OPT-IML | 2022 | OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization |
| LLaMA | 2023 | LLaMA: Open and Efficient Foundation Language Models |
| GPT-4 | 2023 | GPT-4 Technical Report |
| PanGu-Σ | 2023 | PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing |
| BloombergGPT | 2023 | BloombergGPT: A Large Language Model for Finance |
| PaLM 2 | 2023 | A language model with better multilingual and reasoning capabilities, more compute-efficient than its predecessor PaLM |
| Claude 2 | 2023 | Anthropic's second-generation assistant, with improved writing/code quality and safety |
| Llama 2 | 2023 | Open-weight chat models (7B-70B) widely used for self-hosted deployment |
| Mixtral 8x7B | 2023 | Sparse Mixture-of-Experts open-weight model with strong cost-effectiveness |
| Gemini 1.0 | 2023 | Google's multimodal model (Ultra/Pro/Nano), the first release in the Gemini series |
| Claude 3 (Opus/Sonnet/Haiku) | 2024 | New generation of multimodal models, strong at long-document/table extraction and safety |
| Gemini 1.5 Pro | 2024 | Up to 1M+ tokens of context, multimodal |
| Gemini 1.5 Flash | 2024 | Cheap, fast multimodal model suited to batch workloads and real-time interaction |
| Mistral Large | 2024 | Multilingual large model with function calling and long context |
| Grok-1.5 | 2024 | xAI's long-context model, with a focus on real-time information |
| GPT-4o | 2024 | OpenAI's omni-modal flagship, faster than GPT-4, with voice/image/video support |
| GPT-4o mini | 2024 | Low-cost small model with strong tool support |
| Llama 3 | 2024 | 8B/70B open weights, strong in English and multilingual tasks |
| Llama 3.1 | 2024 | 8B/70B/405B, upgraded reasoning and 128k context |
| Claude 3.5 Sonnet | 2024 | Claude 3.5 workhorse model, strong at code and tool calling |
| Claude 3.5 Haiku | 2024 | Lightweight, fast variant that keeps high safety and multimodal capability |
| o1 | 2024 | OpenAI reasoning model, oriented toward chain-of-thought and planning |
| o1-mini | 2024 | Cheaper, faster version of o1 |
| GPT-4.1 | 2025 | Upgraded GPT-4-series model with stronger coding, instruction following, and long (up to 1M tokens) context |
| Grok-2 | 2024 | xAI's next-generation model, with improved code and web-connected Q&A |
| Gemini 2.0 Flash (Exp) | 2024 | Gemini 2.0 preview aimed at real-time/tool scenarios |
| Gemini 2.0 Pro (Exp) | 2025 | Gemini 2.0 preview flagship, multimodal with long context |
| DeepSeek-R1 | 2025 | Open-weight model focused on reasoning efficiency, with long context and strong math/code |
| Claude 4.5 Sonnet | 2025 | Anthropic's 2025 workhorse model; longer context (200K, 1M in beta), stronger code and table retrieval |
| ChatGPT 5.1 (GPT-5 series) | 2025 | Latest OpenAI release, targeting the Responses API, with controllable reasoning depth and multimodality |
| Gemini 3 Pro | 2025 | Google's long-context flagship (1,048,576 tokens), multimodal with tool-chain optimizations |
| Gemini 3 Flash / Flash-Lite | 2025 | Fast, low-cost multimodal models for in-product real-time interaction and batch processing |