Prompt Master

Master the art of conversing with AI

Generated Knowledge Prompting

Generate knowledge first, then answer (boosts commonsense / knowledge-intensive tasks)

Figure: GENKNOW (generated knowledge prompting). Image source: Liu et al. 2022

LLMs continue to improve, and one popular technique is to include relevant knowledge or information in the prompt so the model can make more accurate predictions.

But can the model also generate that knowledge itself before making a prediction? That is what Liu et al. 2022 tried: have the model generate knowledge, then use it as part of the prompt. In particular, how much does this help on tasks such as commonsense reasoning?

Let's try a simple prompt:

Prompt:

Part of playing golf is trying to get a higher score than other people. Yes or no?

Output:

Yes.

That answer is wrong: in golf, the player with the lowest score wins. Mistakes like this expose a limitation of LLMs on tasks that require more knowledge about the world. So how can generated knowledge help?

First, we generate some "knowledge":

Prompt:

Input: Greece is larger than Mexico.
Knowledge: Greece has an area of about 131,957 square kilometers, while Mexico has an area of about 1,964,375 square kilometers, making Mexico much larger than Greece.

Input: Glasses always fog up.
Knowledge: Glasses fog up when water vapor from your breath, sweat, or surrounding humidity condenses on a cooler lens surface, forming tiny droplets that look like a thin film.

Input: Fish can think.
Knowledge: Fish can show learning and memory abilities. Some fish demonstrate problem-solving and can retain information over time, which can help them navigate environments and social interactions.

Input: A common effect of smoking many cigarettes over a lifetime is a higher-than-normal chance of developing lung cancer.
Knowledge: Smoking increases the risk of lung cancer compared to never-smokers. The risk generally rises with greater smoking intensity and duration.

Input: A rock and a pebble are the same size.
Knowledge: In sedimentology (Udden-Wentworth scale), a pebble is typically 4 to 64 mm in diameter. Larger clasts (e.g., cobbles) are 64 to 256 mm, while smaller grains are below 4 mm.

Input: Part of playing golf is trying to get a higher score than other people.
Knowledge:

Knowledge 1:

The goal of golf is to complete a set of holes in as few strokes as possible. A round typically includes 18 holes. Each stroke counts toward your total, and the player with the lowest total number of strokes wins.

Knowledge 2:

Golf is a precision club-and-ball sport where players use various clubs to hit a ball into a series of holes on a course in as few strokes as possible. The objective is to finish with the lowest score, where score is the total number of strokes across all holes.

The few-shot prompt above is taken from Liu et al. 2022.
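
The few-shot format above can be assembled programmatically. The following is a minimal sketch, not code from the paper: `build_knowledge_prompt` and `DEMONSTRATIONS` are hypothetical names, and only two of the demonstrations are included to keep it short.

```python
# Hypothetical helper: builds a few-shot knowledge-generation prompt
# in the "Input: ... / Knowledge: ..." layout shown above.

# Two of the (input, knowledge) demonstrations from the prompt above.
DEMONSTRATIONS = [
    (
        "Greece is larger than Mexico.",
        "Greece has an area of about 131,957 square kilometers, while Mexico "
        "has an area of about 1,964,375 square kilometers, making Mexico much "
        "larger than Greece.",
    ),
    (
        "A common effect of smoking many cigarettes over a lifetime is a "
        "higher-than-normal chance of developing lung cancer.",
        "Smoking increases the risk of lung cancer compared to never-smokers. "
        "The risk generally rises with greater smoking intensity and duration.",
    ),
]


def build_knowledge_prompt(demonstrations, statement):
    """Return the demonstrations followed by the new input and an open Knowledge slot."""
    blocks = [f"Input: {inp}\nKnowledge: {know}" for inp, know in demonstrations]
    # The final block leaves "Knowledge:" empty so the model completes it.
    blocks.append(f"Input: {statement}\nKnowledge:")
    return "\n\n".join(blocks)


prompt = build_knowledge_prompt(
    DEMONSTRATIONS,
    "Part of playing golf is trying to get a higher score than other people.",
)
print(prompt)
```

Sending this prompt to a model more than once, with sampling enabled, would be one way to collect several candidate knowledge statements such as Knowledge 1 and Knowledge 2.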

The next step is to integrate the knowledge and get a prediction. I reformatted the question into QA format to guide the form of the answer.

Prompt:

Question: Part of playing golf is trying to get a higher score than other people. Yes or no?

Knowledge: The goal of golf is to complete a set of holes in as few strokes as possible. A round typically includes 18 holes. Each stroke counts toward your total, and the player with the lowest total number of strokes wins.

Explanation and answer:

Answer 1 (very high confidence):

No. In golf, the objective is not to get a higher score. Instead, players aim to take as few strokes as possible. The lowest total number of strokes determines the winner.

Prompt:

Question: Part of playing golf is trying to get a higher score than other people. Yes or no?

Knowledge: Golf is a precision club-and-ball sport where players use various clubs to hit a ball into a series of holes on a course in as few strokes as possible. The objective is to finish with the lowest score, where score is the total number of strokes across all holes.

Explanation and answer:

Answer 2 (much lower confidence):

No. In golf, players try to take the fewest strokes possible. The winner is determined by the lowest total number of strokes, not a higher score.
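
Both integration prompts share the same template and differ only in the knowledge statement. As an illustrative sketch (the function names `build_qa_prompt` and `build_qa_prompts` are assumptions, not from the paper), one prompt per knowledge statement could be produced like this:

```python
# Hypothetical helpers: build one QA-format prompt per generated knowledge statement.

def build_qa_prompt(question, knowledge):
    """Fill the Question / Knowledge / Explanation-and-answer template."""
    return (
        f"Question: {question}\n\n"
        f"Knowledge: {knowledge}\n\n"
        "Explanation and answer:"
    )


def build_qa_prompts(question, knowledge_statements):
    """Return one prompt for each knowledge statement, in the same order."""
    return [build_qa_prompt(question, k) for k in knowledge_statements]


question = (
    "Part of playing golf is trying to get a higher score than other people. "
    "Yes or no?"
)
knowledge_statements = [
    "The goal of golf is to complete a set of holes in as few strokes as possible.",
    "The objective is to finish with the lowest score.",
]

for qa_prompt in build_qa_prompts(question, knowledge_statements):
    print(qa_prompt)
    print("-" * 40)
```

The knowledge strings here are shortened versions of Knowledge 1 and Knowledge 2; in practice the full generated text would be passed in.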

Something interesting happened here: the first answer came with very high confidence, while the second came with much lower confidence. I've simplified the process for demonstration purposes; arriving at the final answer involves a few more details. See the paper for the full procedure.
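
One simple way to turn several knowledge-conditioned answers into a final prediction is to keep the answer with the highest confidence. The sketch below assumes each answer comes with a numeric confidence score (for example, an answer probability reported by the model). The `Candidate` type, the `select_answer` function, and the scores are illustrative placeholders, not values from this walkthrough or from the paper.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """One answer obtained from a single knowledge-augmented prompt."""
    knowledge_id: int
    answer: str
    confidence: float  # assumed score in [0, 1]; higher means more confident


def select_answer(candidates):
    """Return the candidate with the highest confidence score."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return max(candidates, key=lambda c: c.confidence)


# Placeholder scores standing in for "very high" and "much lower" confidence.
candidates = [
    Candidate(knowledge_id=1, answer="No", confidence=0.95),
    Candidate(knowledge_id=2, answer="No", confidence=0.60),
]

best = select_answer(candidates)
print(f"Final answer: {best.answer} (from Knowledge {best.knowledge_id})")
```

Taking the maximum means a single well-supported knowledge statement can decide the answer even when other statements are less helpful; averaging or voting across statements would be alternative design choices.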