Biases

How the distribution and order of few-shot examples affect the output

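LLMs can produce problematic generations that hurt performance on downstream tasks and that display biases which degrade the model's results further. Some of these issues can be mitigated through effective prompting strategies, but others may require more advanced solutions such as moderation and filtering.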

Distribution of Exemplars

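When performing few-shot learning, does the distribution of the exemplars affect the model's performance or bias the model in some way? We can run a simple test here.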

Prompt:

Q: I just got the best news!
A: positive

Q: We just got a raise at work!
A: positive

Q: I'm very proud of what I accomplished today.
A: positive

Q: I had a great day today!
A: positive

Q: I'm really looking forward to the weekend.
A: positive

Q: I just got the best gift!
A: positive

Q: I'm very happy right now.
A: positive

Q: I'm lucky to have such an amazing family.
A: positive

Q: The weather outside is very gloomy.
A: negative

Q: I just heard some terrible news.
A: negative

Q: That feels unpleasant.
A:

Output:

negative

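In the example above, the distribution of exemplars (8 positive, 2 negative) does not seem to bias the model. This is good. Let's try another example that is harder to classify and see how the model does: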

Prompt:

Q: The food here is delicious!
A: positive

Q: I'm tired of this course.
A: negative

Q: I can't believe I failed the exam.
A: negative

Q: I had a great day today!
A: positive

Q: I hate this job.
A: negative

Q: The service here is terrible.
A: negative

Q: I feel very depressed about my life.
A: negative

Q: I never get a break.
A: negative

Q: This meal tastes awful.
A: negative

Q: I can't stand my boss.
A: negative

Q: I feel something.
A:

Output:

negative

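That last sentence is somewhat subjective, so I flipped the distribution, using 8 positive examples and 2 negative examples, and then tried the exact same sentence again. Guess what the model responded? It responded "positive". The model likely has a lot of knowledge about sentiment classification, so it is hard to get it to display bias on this problem; only an ambiguous input like this one follows the majority label. The advice here is to avoid a skewed distribution and instead provide a more balanced number of examples for each label. For harder tasks that the model knows less about, it will likely struggle more.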
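The balancing advice above can be checked mechanically before a prompt is sent. The following is a minimal sketch, not a method from this guide: the helper names `label_counts` and `balance_examples` are hypothetical, and downsampling to the rarest label is just one assumed way to even out the counts (adding examples for the rare label would work as well).

```python
import random
from collections import Counter, defaultdict


def label_counts(examples):
    """Count how many few-shot examples carry each label."""
    return Counter(label for _, label in examples)


def balance_examples(examples, seed=0):
    """Downsample so every label keeps as many examples as the rarest label.

    Hypothetical helper: one simple way to avoid a skewed distribution.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))
    # The rarest label decides how many examples each label may keep.
    keep = min(len(group) for group in by_label.values())
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], keep))
    return balanced


# The skewed exemplar set from the first prompt: 8 positive, 2 negative.
examples = [
    ("I just got the best news!", "positive"),
    ("We just got a raise at work!", "positive"),
    ("I'm very proud of what I accomplished today.", "positive"),
    ("I had a great day today!", "positive"),
    ("I'm really looking forward to the weekend.", "positive"),
    ("I just got the best gift!", "positive"),
    ("I'm very happy right now.", "positive"),
    ("I'm lucky to have such an amazing family.", "positive"),
    ("The weather outside is very gloomy.", "negative"),
    ("I just heard some terrible news.", "negative"),
]

print(label_counts(examples))
print(label_counts(balance_examples(examples)))
```

Downsampling throws away examples, so for tasks where every exemplar is valuable it may be preferable to write more exemplars for the under-represented label instead.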

Order of Exemplars

When performing few-shot learning, does the order of the exemplars affect the model's performance or bias the model in some way?

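You can try the examples above and see whether changing the order makes the model biased towards a label. The advice is to order exemplars randomly. For example, avoid putting all the positive examples first and the negative examples last. The issue is amplified further when the label distribution is skewed. Always experiment extensively to reduce this type of bias.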
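Random ordering is easy to apply when the prompt is assembled in code. The sketch below is one possible way to do it under stated assumptions: `build_prompt` is a hypothetical helper, and the `Q:` / `A:` layout simply mirrors the prompts shown earlier. Passing different seeds gives different orderings, which makes it convenient to test whether the model's answer depends on order.

```python
import random


def build_prompt(examples, query, seed=None):
    """Shuffle the exemplars, then format them in the Q:/A: layout.

    Hypothetical helper; `seed` makes an ordering reproducible so that
    several orderings can be compared in an experiment.
    """
    rng = random.Random(seed)
    shuffled = list(examples)  # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    blocks = [f"Q: {text}\nA: {label}" for text, label in shuffled]
    blocks.append(f"Q: {query}\nA:")  # the unanswered query always goes last
    return "\n\n".join(blocks)


examples = [
    ("The food here is delicious!", "positive"),
    ("I had a great day today!", "positive"),
    ("I hate this job.", "negative"),
    ("The service here is terrible.", "negative"),
]

# Generate a few orderings of the same exemplars for the same query.
for seed in range(3):
    print(build_prompt(examples, "I feel something.", seed=seed))
    print("-" * 20)
```

If the model's label for the same query changes across seeds, that is a sign the prompt is sensitive to exemplar order.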
