Biases

How the distribution and order of few-shot examples affect outputs

LLMs can produce problematic outputs and display biases that degrade performance on downstream tasks. Some of these can be mitigated through effective prompting strategies, though harder cases may require more advanced solutions like moderation and filtering.

Distribution of Exemplars

When doing few-shot learning, does the distribution of exemplars affect model performance or bias the model in some way? Here's a simple test.

Prompt:

Q: I just got the best news!
A: positive

Q: We just got a raise at work!
A: positive

Q: I'm very proud of what I accomplished today.
A: positive

Q: I had a great day today!
A: positive

Q: I'm really looking forward to the weekend.
A: positive

Q: I just got the best gift!
A: positive

Q: I'm very happy right now.
A: positive

Q: I'm lucky to have such an amazing family.
A: positive

Q: The weather outside is very gloomy.
A: negative

Q: I just heard some terrible news.
A: negative

Q: That feels unpleasant.
A:

Output:

negative

In the example above, the skewed distribution of exemplars doesn't seem to bias the model. Good. Now let's try a harder-to-classify example and see how the model handles it:

Prompt:

Q: The food here is delicious!
A: positive

Q: I'm tired of this course.
A: negative

Q: I can't believe I failed the exam.
A: negative

Q: I had a great day today!
A: positive

Q: I hate this job.
A: negative

Q: The service here is terrible.
A: negative

Q: I feel very depressed about my life.
A: negative

Q: I never get a break.
A: negative

Q: This meal tastes awful.
A: negative

Q: I can't stand my boss.
A: negative

Q: I feel something.
A:

Output:

negative

That last sentence is pretty subjective. So I flipped the distribution — used 8 positive examples and 2 negative — then tried the exact same sentence again. Guess what? The model answered "positive." The model probably has a lot of built-in knowledge about sentiment classification, so it's hard to get it to show bias on this topic. The takeaway: avoid skewed distributions and provide a more balanced number of examples for each label. For harder tasks where the model has less prior knowledge, this becomes a bigger problem.
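
If you build few-shot prompts programmatically, that balancing can be enforced in code rather than by hand. Below is a minimal Python sketch; the `balance_exemplars` helper, the sample pool, and the per-label count are illustrative choices, not something prescribed by the chapter.

```python
import random
from collections import defaultdict

def balance_exemplars(pool, per_label=3, seed=0):
    """Sample the same number of exemplars for each label from a labeled pool."""
    by_label = defaultdict(list)
    for text, label in pool:
        by_label[label].append((text, label))
    rng = random.Random(seed)
    balanced = []
    for label, items in by_label.items():
        balanced.extend(rng.sample(items, min(per_label, len(items))))
    return balanced

# Illustrative pool; in practice these come from your own labeled data.
pool = [
    ("I just got the best news!", "positive"),
    ("I had a great day today!", "positive"),
    ("I'm very happy right now.", "positive"),
    ("I'm lucky to have such an amazing family.", "positive"),
    ("I just heard some terrible news.", "negative"),
    ("The weather outside is very gloomy.", "negative"),
    ("This meal tastes awful.", "negative"),
    ("I can't stand my boss.", "negative"),
]

exemplars = balance_exemplars(pool, per_label=3)
prompt = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in exemplars)
prompt += "\n\nQ: That feels unpleasant.\nA:"
print(prompt)
```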

Order of Exemplars

When doing few-shot learning, does the order of exemplars affect model performance or bias things?

You can try the examples above and see if reordering makes the model lean toward a particular label. The recommendation: randomize the order of exemplars. For example, avoid putting all positive examples first and negative examples last. This problem gets amplified if the label distribution is already skewed. Make sure to experiment extensively to reduce this type of bias.
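
A small sketch of that randomization step in the same spirit: shuffle the full exemplar list before joining it into the prompt, so no label ends up grouped at the start or the end. The `build_prompt` helper and the seed handling are assumptions for illustration, not an API from the chapter.

```python
import random

def build_prompt(exemplars, query, seed=None):
    """Shuffle exemplar order so neither label is grouped at the start or end of the prompt."""
    rng = random.Random(seed)  # pass a fixed seed to reproduce a validated ordering
    shuffled = list(exemplars)
    rng.shuffle(shuffled)
    blocks = [f"Q: {q}\nA: {a}" for q, a in shuffled]
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)

# The ordering to avoid: all positives first, all negatives last.
exemplars = [
    ("I had a great day today!", "positive"),
    ("I'm very happy right now.", "positive"),
    ("I just heard some terrible news.", "negative"),
    ("This meal tastes awful.", "negative"),
]
print(build_prompt(exemplars, "That feels unpleasant.", seed=42))
```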

❓ FAQ

The most commonly searched questions about this chapter's topic.

Does the distribution of few-shot exemplars bias the model toward a particular label?

Yes, but less than you might expect. In this chapter's tests, a skewed set of 8 positive and 2 negative exemplars didn't stop the model from correctly answering negative for the clearly negative sentence `That feels unpleasant`. The genuinely ambiguous sentence `I feel something`, on the other hand, got whichever label dominated the exemplars. The model has a lot of pretrained knowledge about sentiment classification, so skew matters little on tasks it knows well; on tasks where its knowledge is thin, the effect is amplified, which is why a balanced set should be the default.

How should few-shot exemplars be ordered?

Randomly. Don't put all the positive examples first and all the negative ones last. As the chapter notes, the more skewed the label distribution, the more the order effect gets amplified. In production, keep the exemplars in an array and shuffle it before each call, or fix one validated order and record it in a changelog, so that a case where the order changed and the output changed with it can actually be traced.

How can I tell whether my prompt is already contaminated by bias?

Run controlled experiments: score the same set of samples twice, with the original exemplar order versus a reversed one, and with the original label distribution versus a flipped one. If the outputs shift noticeably, the prompt has a problem. The chapter's advice is to "experiment extensively to reduce this type of bias"; the engineering version is to bake these comparisons into your evaluation pipeline and run them on every prompt change.
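
As a rough sketch of that evaluation-pipeline idea: run each sample under the original exemplar order and under the reversed order, and flag any query whose label changes. The `classify` function below is a placeholder for whatever model call you actually use; nothing here is a fixed API.

```python
def build_prompt_in_order(exemplars, query):
    """Join exemplars in the given order, then append the unanswered query."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in exemplars] + [f"Q: {query}\nA:"]
    return "\n\n".join(blocks)

def flip_test(exemplars, queries, classify):
    """Flag queries whose predicted label changes when the exemplar order is reversed.

    `classify` is a placeholder: it takes a full prompt string and returns a label
    via whatever model you actually call. The same loop works for a flipped label
    distribution: swap in the flipped exemplar set instead of reversing the order.
    """
    disagreements = []
    for query in queries:
        original = classify(build_prompt_in_order(exemplars, query))
        reversed_order = classify(build_prompt_in_order(list(reversed(exemplars)), query))
        if original != reversed_order:
            disagreements.append((query, original, reversed_order))
    return disagreements

# A non-empty result means predictions depend on exemplar order, which is exactly
# the bias this chapter warns about; run this check on every prompt change.
```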

How badly does bias get amplified in domains the model is unfamiliar with?

As the chapter notes, the model is hard to sway on a familiar task like sentiment classification, but "for harder tasks where the model has less prior knowledge, this becomes a bigger problem." In other words, in vertical industries, with internal business jargon, or in low-resource languages, distribution and order biases are significantly amplified. The countermeasures for these scenarios: double the number of exemplars, enforce label balance, and use retrieval to inject domain knowledge into the context.
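
A tiny sketch of that last point, prepending retrieved domain snippets before the exemplars. The `retrieve` function is a stand-in for whatever search or vector store you have; the chapter doesn't prescribe one.

```python
def build_domain_prompt(query, exemplars, retrieve, k=3):
    """Prepend retrieved domain snippets so the model isn't relying only on prior knowledge."""
    snippets = retrieve(query)[:k]  # placeholder: your own search or vector-store call
    background = "\n".join(f"- {s}" for s in snippets)
    blocks = [f"Q: {q}\nA: {a}" for q, a in exemplars] + [f"Q: {query}\nA:"]
    return f"Background:\n{background}\n\n" + "\n\n".join(blocks)
```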

Is prompt-level bias mitigation enough, or does it still come down to the model and data layers?

The chapter says it directly: "Some of these can be mitigated through effective prompting strategies, though harder cases may require more advanced solutions like moderation and filtering." The prompt layer can handle local biases such as exemplar distribution, order, and wording; systemic biases (gender, race, cultural default assumptions) have to be addressed with deeper measures like data curation, RLHF, and output moderation. The prompt is the first line of defense, not the only one.