Physical reasoning
physical reasoning prompt example
#TL;DR
- This is a small `physical reasoning` test: the model must make "physical constraints + stability" judgments in its head, rather than answer a pure text-knowledge question.
- Useful for verifying whether the model respects common-sense constraints (center of mass, load-bearing, fragility, friction) and whether it can produce executable steps.
- For production use: write the constraints out explicitly (fragile/heavy/sharp/liquid) and require a structured plan plus risk warnings.
#Background
This prompt tests an LLM's physical reasoning capabilities by asking it to stack a set of everyday objects in a stable manner.
#How to Apply
When porting this template to a real task, describe the "objects" as an explicit set of attributes:
- weight: heavy / light
- fragility: fragile / robust
- shape: flat / cylindrical / sharp
- stability: base area / center of mass
This makes it much easier for the model to follow the constraints and produce a sensible stacking plan; a minimal sketch of such an attribute set follows.
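A minimal sketch of what that attribute set could look like in code. The `Item` dataclass and the attribute values are illustrative assumptions, not part of the original prompt:

```python
from dataclasses import dataclass

@dataclass
class Item:
    name: str
    weight: str      # "heavy" / "light"
    fragility: str   # "fragile" / "robust"
    shape: str       # "flat" / "cylindrical" / "sharp" / ...

# Illustrative attributes for the objects in the original prompt.
items = [
    Item("book",   weight="light", fragility="robust",  shape="flat"),
    Item("9 eggs", weight="light", fragility="fragile", shape="rounded"),
    Item("laptop", weight="heavy", fragility="fragile", shape="flat"),
    Item("bottle", weight="light", fragility="robust",  shape="cylindrical"),
    Item("nail",   weight="light", fragility="robust",  shape="sharp"),
]

# Turn the attribute set into an explicit constraint block for the prompt.
object_block = "\n".join(
    f"- {i.name}: weight={i.weight}, fragility={i.fragility}, shape={i.shape}"
    for i in items
)
prompt = (
    "Here are the objects and their physical attributes:\n"
    f"{object_block}\n"
    "Please tell me how to stack them onto each other in a stable manner."
)
print(prompt)
```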
#How to Iterate
- Enforce an output format: `Order` (bottom to top) + `Justification` (reason for each layer) + `Risks`
- Add prohibitions, e.g. "Do not place fragile items under heavy items"
- Add a `self-check`: have the model verify at the end that it has not violated any constraint
- Add scenario variables: desk size, whether tape is available, whether boxes may be opened or packaging removed, etc. (a prompt sketch combining these points follows this list)
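A sketch of a prompt template that combines the structured output, a prohibition, and the self-check described above. The section names and rules are illustrative assumptions, not a fixed schema:

```python
# Illustrative template; adjust the rules and section names to your task.
PROMPT_TEMPLATE = """\
Here we have a book, 9 eggs, a laptop, a bottle and a nail.
Please tell me how to stack them onto each other in a stable manner.

Rules:
- Do not place fragile items under heavy items.
- The desk surface is only A4-sized.

Answer in exactly this format:
Order: the stacking order, from bottom to top.
Justification: one sentence per layer explaining why it is stable
  (base area, center of mass, friction).
Risks: anything that could break, spill, or tip over.
Self-check: confirm that none of the rules above were violated;
  if one was, revise the plan before answering.
"""
```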
#Self-check rubric
- Does the answer give an explicit stacking order (bottom to top)?
- Does it explain the basis for stability (base area / center of mass / friction)?
- Does it account for the risks of fragile, liquid, or sharp items?
- Does it offer an alternative if one of the objects is unavailable?
A rough automated version of this check is sketched below.
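If you want to screen many responses at once, a crude keyword check can approximate the rubric. The keyword lists below are assumptions and no substitute for reading the answers:

```python
def rubric_check(response: str) -> dict:
    """Crude keyword screen for the rubric above; not a real grader."""
    text = response.lower()
    return {
        "has_order": any(k in text for k in ("bottom", "base layer", "order")),
        "explains_stability": any(k in text for k in ("center of mass", "base area", "friction")),
        "mentions_risks": any(k in text for k in ("fragile", "crack", "spill", "risk")),
        "offers_alternative": any(k in text for k in ("alternatively", "instead", "if you don't have")),
    }

# Example: flag responses that miss any rubric item.
report = rubric_check("Place the book at the bottom for a wide base area ...")
missing = [k for k, ok in report.items() if not ok]
print("Missing rubric items:", missing or "none")
```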
#Practice
Exercise: swap the objects for a combination you actually encounter at home or at work, and add constraints such as:
- "Do not damage any item"
- "You may only use one hand"
- "The desk surface is only A4-sized"
Then check whether the model reliably produces an executable plan (see the sketch after this list for running several constraint variants).
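One way to run the exercise is to loop the same object list over several constraint sets and compare the answers. This sketch reuses the OpenAI client from the example below; the constraint sets are illustrative and `gpt-4` is just a placeholder model name:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

base = (
    "Here we have a book, 9 eggs, a laptop, a bottle and a nail. "
    "Please tell me how to stack them onto each other in a stable manner."
)
constraint_sets = [
    ["Do not damage any item."],
    ["You may only use one hand."],
    ["The desk surface is only A4-sized.", "Do not damage any item."],
]

for constraints in constraint_sets:
    prompt = base + " Constraints: " + " ".join(constraints)
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    print("Constraints:", constraints)
    print(response.choices[0].message.content, "\n---")
```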
#Prompt
```text
Here we have a book, 9 eggs, a laptop, a bottle and a nail. Please tell me how to stack them onto each other in a stable manner.
```
#Code / API
#OpenAI (Python)
```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "user",
            "content": "Here we have a book, 9 eggs, a laptop, a bottle and a nail. Please tell me how to stack them onto each other in a stable manner.",
        }
    ],
    temperature=1,
    max_tokens=500,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
)

print(response.choices[0].message.content)
```
#Fireworks (Python)
```python
import fireworks.client

fireworks.client.api_key = "<FIREWORKS_API_KEY>"

completion = fireworks.client.ChatCompletion.create(
    model="accounts/fireworks/models/mixtral-8x7b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Here we have a book, 9 eggs, a laptop, a bottle and a nail. Please tell me how to stack them onto each other in a stable manner.",
        }
    ],
    stop=["<|im_start|>", "<|im_end|>", "<|endoftext|>"],
    stream=True,
    n=1,
    top_p=1,
    top_k=40,
    presence_penalty=0,
    frequency_penalty=0,
    prompt_truncate_len=1024,
    context_length_exceeded_behavior="truncate",
    temperature=0.9,
    max_tokens=4000,
)

# With stream=True the call returns an iterator of chunks; iterate over
# `completion` to read the reply as it is generated.
```