PAL

Program-Aided Language Models: use executable code as intermediate reasoning

Gao et al. (2022) proposed a method where LLMs read natural language problems and generate programs as intermediate reasoning steps. Called Program-Aided Language Models (PAL), it differs from chain-of-thought prompting because instead of using free-form text to reach a solution, it offloads the solution steps to a programming runtime like a Python interpreter.

[Figure: PAL. Image source: Gao et al. (2022)]
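To make the contrast with chain-of-thought concrete, here is a hand-written sketch of the kind of program a PAL-style model might emit for a simple arithmetic word problem. The problem text and variable names are illustrative assumptions, not actual model output. The point is that the final number comes from the interpreter rather than from the model's own arithmetic.

```python
# Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
#    Each can has 3 tennis balls. How many tennis balls does he have now?

# Roger started with 5 tennis balls.
tennis_balls = 5
# 2 cans of 3 tennis balls each is
bought_balls = 2 * 3
# The answer is
answer = tennis_balls + bought_balls
print(answer)  # 11
```

The comments carry the natural-language reasoning, while each assignment is a step the runtime computes exactly.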

Let's use LangChain and OpenAI GPT-3 as an example. We want to build a simple app that interprets questions and uses a Python interpreter to compute the answer.

Specifically, we'll create a function that uses an LLM to answer date-understanding questions. We'll provide a prompt with examples adapted from here.

Imports we need:

import os
from datetime import datetime

import openai
from dateutil.relativedelta import relativedelta
from langchain.llms import OpenAI
from dotenv import load_dotenv

Set up the environment:

load_dotenv()

# API configuration
openai.api_key = os.getenv("OPENAI_API_KEY")

# for LangChain
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")

Set up the model instance:

llm = OpenAI(model_name='text-davinci-003', temperature=0)

Set up the prompt + question:

question = "Today is 27 February 2023. I was born exactly 25 years ago. What is the date I was born in MM/DD/YYYY?"

DATE_UNDERSTANDING_PROMPT = """
# Q: 2015 is coming in 36 hours. What is the date one week from today in MM/DD/YYYY?
# If 2015 is coming in 36 hours, then today is 36 hours before.
today = datetime(2015, 1, 1) - relativedelta(hours=36)
# One week from today,
one_week_from_today = today + relativedelta(weeks=1)
# The answer formatted with %m/%d/%Y is
one_week_from_today.strftime('%m/%d/%Y')
# Q: The first day of 2019 is a Tuesday, and today is the first Monday of 2019. What is the date today in MM/DD/YYYY?
# If the first day of 2019 is a Tuesday, and today is the first Monday of 2019, then today is 6 days later.
today = datetime(2019, 1, 1) + relativedelta(days=6)
# The answer formatted with %m/%d/%Y is
today.strftime('%m/%d/%Y')
# Q: The concert was scheduled to be on 06/01/1943, but was delayed by one day to today. What is the date 10 days ago in MM/DD/YYYY?
# If the concert was scheduled to be on 06/01/1943, but was delayed by one day to today, then today is one day later.
today = datetime(1943, 6, 1) + relativedelta(days=1)
# 10 days ago,
ten_days_ago = today - relativedelta(days=10)
# The answer formatted with %m/%d/%Y is
ten_days_ago.strftime('%m/%d/%Y')
# Q: It is 4/19/1969 today. What is the date 24 hours later in MM/DD/YYYY?
# It is 4/19/1969 today.
today = datetime(1969, 4, 19)
# 24 hours later,
later = today + relativedelta(hours=24)
# The answer formatted with %m/%d/%Y is
later.strftime('%m/%d/%Y')
# Q: Jane thought today is 3/11/2002, but today is in fact Mar 12, which is 1 day later. What is the date 24 hours later in MM/DD/YYYY?
# If Jane thought today is 3/11/2002, but today is in fact Mar 12, then today is 3/12/2002.
today = datetime(2002, 3, 12)
# 24 hours later,
later = today + relativedelta(hours=24)
# The answer formatted with %m/%d/%Y is
later.strftime('%m/%d/%Y')
# Q: Jane was born on the last day of February in 2001. Today is her 16-year-old birthday. What is the date yesterday in MM/DD/YYYY?
# If Jane was born on the last day of February in 2001 and today is her 16-year-old birthday, then today is 16 years later.
today = datetime(2001, 2, 28) + relativedelta(years=16)
# Yesterday,
yesterday = today - relativedelta(days=1)
# The answer formatted with %m/%d/%Y is
yesterday.strftime('%m/%d/%Y')
# Q: {question}
""".strip() + '\n'
llm_out = llm(DATE_UNDERSTANDING_PROMPT.format(question=question))
print(llm_out)

This outputs:

# If today is 27 February 2023 and I was born exactly 25 years ago, then I was born 25 years before.
today = datetime(2023, 2, 27)
# I was born 25 years before,
born = today - relativedelta(years=25)
# The answer formatted with %m/%d/%Y is
born.strftime('%m/%d/%Y')

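The llm_out is Python code, so we can execute it with exec. The final line of the generated code is a bare expression whose value exec discards, so we format born ourselves:

exec(llm_out)
print(born.strftime('%m/%d/%Y'))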

This outputs: 02/27/1998
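Reading the result through `born` ties the caller to a variable name the model happened to choose. One possible alternative, sketched below, is to evaluate the program's final expression and return its value. `run_pal_program` is a hypothetical helper, not part of LangChain or the PAL code, and the `generated` string is a hand-written stand-in for `llm_out` that uses the standard library's `timedelta` instead of `relativedelta` so the example is self-contained.

```python
import ast
from datetime import datetime, timedelta


def run_pal_program(code: str, namespace: dict):
    """Execute generated code and return the value of its final expression.

    Hypothetical helper for illustration. It does NOT sandbox the code.
    """
    tree = ast.parse(code)
    if not tree.body or not isinstance(tree.body[-1], ast.Expr):
        raise ValueError("generated code does not end with an expression")
    # Split the program into setup statements and the final expression.
    last_expr = ast.Expression(body=tree.body[-1].value)
    setup = ast.Module(body=tree.body[:-1], type_ignores=[])
    exec(compile(setup, "<pal>", "exec"), namespace)
    return eval(compile(last_expr, "<pal>", "eval"), namespace)


# Hand-written stand-in for llm_out, in the same shape as the model output.
generated = """
# It is 4/19/1969 today.
today = datetime(1969, 4, 19)
# 24 hours later,
later = today + timedelta(hours=24)
# The answer formatted with %m/%d/%Y is
later.strftime('%m/%d/%Y')
"""

result = run_pal_program(generated, {"datetime": datetime, "timedelta": timedelta})
print(result)  # 04/20/1969
```

Passing an explicit namespace also keeps the generated variables out of the caller's globals.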

FAQ

Commonly searched questions about this chapter's topic.

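What is PAL, and how does it differ from CoT?

PAL (Program-Aided Language Models, Gao et al. 2022) has the LLM write its reasoning steps as code instead of free-form text: the LLM emits Python, and an interpreter executes it to produce the answer. CoT reasons in natural language and is prone to arithmetic slips. PAL offloads the computation to the Python runtime, so the arithmetic is exact whenever the generated program itself is correct.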

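Does PAL replace the entire reasoning process with code?

No. It replaces only the part that needs exact computation. The LLM still handles natural-language understanding (what the problem asks for, which variables to use, how to break it into steps) and translates that understanding into code, which the interpreter then executes. In other words, the LLM does semantic parsing and planning, and Python does exact execution.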

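What tasks is PAL suited to?

Math problems, date calculations, unit conversions, statistical aggregation: any problem a program can compute exactly. The demo above is date understanding: "Today is 27 February 2023. I was born exactly 25 years ago. What is the date I was born in MM/DD/YYYY?" The LLM generates Python code containing `today - relativedelta(years=25)`, which yields 02/27/1998 once executed and formatted. PAL also suits finance, stoichiometry, and table statistics.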

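How is PAL implemented? Does it need a special interpreter?

The standard recipe: write each few-shot example in a "comments + Python" format, let the LLM continue with code, run it with `exec(llm_out)`, and read the result variable. No dedicated interpreter is needed; the demo above uses LangChain, OpenAI, and datetime / dateutil on the standard Python interpreter. In production, run the code in a sandbox (such as a restricted subprocess or a wasm runtime), because calling exec directly on LLM output is a serious security risk.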
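As a rough illustration of moving execution out of the host process, the sketch below runs generated code in a separate Python process with a timeout. `run_in_subprocess` is a hypothetical helper name, and this is an assumption-laden starting point rather than a real sandbox: the child process can still touch the file system and network unless the operating system restricts it.

```python
import subprocess
import sys


def run_in_subprocess(code: str, timeout_s: float = 5.0) -> str:
    """Run generated code in a separate Python process and return its stdout.

    Hypothetical helper. Process isolation plus a timeout limits some damage,
    but this is NOT a real sandbox.
    """
    completed = subprocess.run(
        # -I: isolated mode (ignores PYTHON* environment variables and the
        # user site-packages directory).
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout_s,  # raises subprocess.TimeoutExpired if exceeded
    )
    if completed.returncode != 0:
        raise RuntimeError(completed.stderr.strip())
    return completed.stdout.strip()


# Hand-written stand-in for model output; it must print its own answer.
generated = "answer = 2 * 3 + 5\nprint(answer)"
output = run_in_subprocess(generated)
print(output)  # 11
```

With this design the generated program has to print its answer, since variables in the child process are not visible to the caller.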

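Are PAL and ReAct with a calculator tool the same thing?

The idea is similar, but PAL goes further. ReAct with a calculator tool has the LLM delegate one arithmetic step per call. PAL has the LLM write the whole program, which can use multiple variables, loops, and imported libraries: the LLM is the code author and Python is the executor. OpenAI Code Interpreter / ChatGPT Advanced Data Analysis can be seen as a productized form of the same idea.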
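To show what "the whole program" can look like, here is a hand-written sketch of a PAL-style solution that needs a loop and an import, which a single calculator call could not express in one step. The question and the numbers are made up for illustration.

```python
import math

# Q: An account starts at 1000 and grows 5% per year.
#    After how many whole years does the balance first exceed 1500?

balance = 1000.0
years = 0
# Apply one year of growth at a time until the target is passed.
while balance <= 1500.0:
    balance *= 1.05
    years += 1

# Cross-check with the closed form: smallest n with 1.05**n > 1.5.
closed_form = math.floor(math.log(1.5) / math.log(1.05)) + 1
print(years, closed_form)  # 9 9
```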