GraphPrompts

A graph-based prompting framework (placeholder)

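TL;DR

GraphPrompts suits tasks whose data is naturally a graph: knowledge graphs, social graphs, citation graphs, molecule graphs, recommendation graphs, and so on.
The core idea is to inject the graph's structural information (nodes, edges, neighborhoods, subgraphs) into the LLM's context in a more controllable way, so the model can use the structure instead of seeing only a flat block of text.
In practice, the two most common approaches are graph-to-text (linearizing a subgraph) and retrieve-then-prompt (retrieving relevant nodes, paths, or communities first, then generating an answer).
Key risks: context explosion, structure loss (information lost during linearization), and unverifiable output. Pair the approach with evaluation and a self-check rubric.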

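Core Concepts

GraphPrompts usually refers to a family of "graph-aware prompting" methods: you have a graph (nodes + edges) and want the LLM to use its structure and attribute information in a downstream task.

In many real-world applications, graph information already exists:

CRM / user relationships: user-user, user-company, user-action
Content / search: query-document, document-document (citation or similarity), topic graph
Courses / learning: lesson-prerequisite, student-lesson, skill graph

Handing these structures to the model as-is tends to cause problems: too much information, unclear structure, and a model that easily overlooks key relations. GraphPrompts therefore emphasizes three questions: how to select the subgraph, how to express the structure, and how to constrain the output.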

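How to Apply

Step 1: Define the task and output (task-first)

First decide which type of downstream task you are solving:

classification (e.g., node labeling)
link prediction (e.g., recommendation or relation prediction)
ranking (e.g., ordering candidate nodes)
question answering (fact-based QA over the graph)

Then decide the output format, as structured as possible, for example:

label (single label)
top-k candidates (list + score)
answer + evidence (the answer plus the nodes/edges it cites)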
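As one illustration of the "answer + evidence" format, the sketch below parses a model reply and checks its shape. The field names (`answer`, `evidence`, `node_id`, `edge`) are assumptions made for this example; the text above does not fix them.

```python
import json

# Hypothetical field names: the text only says "answer + evidence".
REQUIRED_FIELDS = {"answer": str, "evidence": list}


def parse_answer(raw: str) -> dict:
    """Parse a model reply and check that it has the expected shape."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"field {field!r} is missing or not a {expected_type.__name__}")
    for item in data["evidence"]:
        # Each evidence item must point at a node or an edge.
        if not isinstance(item, dict) or not (item.keys() & {"node_id", "edge"}):
            raise ValueError("each evidence item must cite a node_id or an edge")
    return data
```

Rejecting malformed replies early keeps downstream evaluation from silently scoring free-form prose.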

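Step 2: Select the subgraph (subgraph selection)

Do not put the whole graph into the context. Common selection strategies:

k-hop neighborhood (take 1-2 hops around the target node)
shortest path / random walk (take the connecting paths)
community / cluster summary (take community-level summaries)
hybrid: select with traditional retrieval or graph algorithms first, then have the LLM re-rank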
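The k-hop strategy can be sketched with a breadth-first search. This is a minimal version under two assumptions that are not in the text: edges are `(source, type, target)` tuples, and direction is ignored when expanding the neighborhood.

```python
from collections import deque


def k_hop_subgraph(edges, target, k=2):
    """Return (nodes, edges) within k hops of `target`.

    Assumes edges are (source, type, target) tuples and treats them
    as undirected while expanding the neighborhood.
    """
    neighbors = {}
    for src, _etype, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)

    depth = {target: 0}  # node -> hop distance from the target
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue  # do not expand past k hops
        for nxt in neighbors.get(node, ()):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)

    # Keep only edges whose two endpoints were both reached.
    kept = [e for e in edges if e[0] in depth and e[2] in depth]
    return set(depth), kept
```

Dropping `k` from 2 to 1 is then a one-argument change, which makes it easy to compare context size against answer quality.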

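Step 3: Express the structure (graph representation)

Three common representations, trading off controllability against information loss:

Adjacency list: the most direct and complete, but can get very long
Triples (subject-predicate-object): a good fit for knowledge graphs
Natural language summary: short, but more likely to lose structural detail

Consider adding a "schema" layer to reduce ambiguity, for example a fixed set of fields: node_id, type, attributes, edge_type.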
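To make the first two representations concrete, here is a small sketch that serializes the same subgraph both ways. The dict keys (`id`, `source`, `target`, `type`) and the exact line formats are assumptions chosen for illustration, not a prescribed schema.

```python
def to_triples(edges):
    """One '(subject, predicate, object)' line per edge."""
    return "\n".join(
        f"({e['source']}, {e['type']}, {e['target']})" for e in edges
    )


def to_adjacency_list(nodes, edges):
    """One line per node listing its outgoing edges as 'type->target'."""
    outgoing = {n["id"]: [] for n in nodes}
    for e in edges:
        outgoing.setdefault(e["source"], []).append(f"{e['type']}->{e['target']}")
    return "\n".join(
        f"{node_id}: {', '.join(targets) or '(none)'}"
        for node_id, targets in outgoing.items()
    )
```

For a single edge from `c1` to `s1` of type `teaches`, the triple form gives `(c1, teaches, s1)`, while the adjacency form also lists `s1` with no outgoing edges, which is where the extra length comes from.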

Prompt Template

You are a graph-aware assistant.

Task: <describe task>
Output format: <JSON schema or strict format>

Graph schema:
- Node: id, type, attributes
- Edge: source, target, type

Given the subgraph below, solve the task using only the provided graph information.
If the graph lacks sufficient evidence, respond with "Insufficient evidence" and list missing information.

Subgraph:
<nodes>
<edges>
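One possible way to fill the template programmatically is sketched below. The function name `render_prompt` and the choice to serialize each node and edge as one JSON line are assumptions for this example.

```python
import json

TEMPLATE = """You are a graph-aware assistant.

Task: {task}
Output format: {output_format}

Graph schema:
- Node: id, type, attributes
- Edge: source, target, type

Given the subgraph below, solve the task using only the provided graph information.
If the graph lacks sufficient evidence, respond with "Insufficient evidence" and list missing information.

Subgraph:
{nodes}
{edges}"""


def render_prompt(task, output_format, nodes, edges):
    """Fill the template; each node and edge becomes one JSON line (an assumed format)."""
    node_lines = "\n".join(json.dumps(n, sort_keys=True) for n in nodes)
    edge_lines = "\n".join(json.dumps(e, sort_keys=True) for e in edges)
    return TEMPLATE.format(
        task=task, output_format=output_format, nodes=node_lines, edges=edge_lines
    )
```

Sorting keys keeps the serialized lines stable across runs, which helps when comparing prompt variants.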

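How to Iterate

Reduce context: shrink from 2-hop to 1-hop; recall 50 candidates first, then re-rank down to 10
Add evidence constraints: require every conclusion in the output to map to a node_id/edge (usable for later verification)
Structured output: fix the fields with a JSON schema so the model does not produce free-form prose
Add negative signals: have the model state which missing edges/nodes leave a conclusion unsupported
Compare baselines: on the same task, compare "no graph" vs "graph-to-text" vs "retrieve-then-prompt"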
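The evidence constraint is what makes later verification possible. A minimal sketch of such a check follows, assuming each evidence item is a dict with either a `node_id` key or an `edge` key holding `[source, type, target]`; that shape is hypothetical.

```python
def unsupported_evidence(evidence, nodes, edges):
    """Return the evidence items that cite a node or edge absent from the subgraph."""
    node_ids = {n["id"] for n in nodes}
    edge_keys = {(e["source"], e["type"], e["target"]) for e in edges}
    missing = []
    for item in evidence:
        if "node_id" in item and item["node_id"] not in node_ids:
            missing.append(item)
        elif "edge" in item and tuple(item["edge"]) not in edge_keys:
            missing.append(item)
    return missing
```

An empty return value means every citation resolves to the provided subgraph. It does not prove the conclusion itself is correct.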

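Self-check rubric

Did the answer use only the given subgraph (no outside knowledge or guesses)?
Can every conclusion in the output be traced to a specific node_id/edge?
Does the selected subgraph cover the key relations (is any key hop or path missing)?
Is the context under control (has the token count exploded)? Is there any ambiguity in how the structure is expressed?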
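The "context under control" item can be turned into a rough automated guard. The sketch below uses character count as a crude stand-in for tokens, which is an assumption; a real tokenizer would give a different budget.

```python
def fit_to_budget(lines, max_chars):
    """Keep leading lines whose combined length (plus newlines) stays within max_chars.

    Returns (kept_lines, dropped_count). Character count is only a rough
    proxy for tokens.
    """
    kept, used = [], 0
    for line in lines:
        cost = len(line) + 1  # +1 for the newline separator
        if used + cost > max_chars:
            break
        kept.append(line)
        used += cost
    return kept, len(lines) - len(kept)
```

A non-zero dropped count signals that the subgraph selection should be tightened before the prompt is sent, rather than truncated silently.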

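Practice

Exercise: build a recommendation with a "course prerequisite graph".

Node: course, skill
Edge: requires, teaches
Task: given a student's list of completed courses, recommend the next 3 courses and explain the graph evidence for each recommendation (which requires edges are satisfied, which skills are missing).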
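Before prompting an LLM, a non-LLM baseline helps as a reference for this exercise. The sketch below assumes both edge types run from a course to a skill (`course requires skill`, `course teaches skill`) and that edges are `(source, type, target)` tuples; the exercise does not specify either.

```python
def recommend(edges, completed, k=3):
    """Rank not-yet-taken courses by how few required skills are still missing.

    Assumes (course, "teaches", skill) and (course, "requires", skill) edges.
    """
    teaches, requires = {}, {}
    for src, etype, dst in edges:
        if etype == "teaches":
            teaches.setdefault(src, set()).add(dst)
        elif etype == "requires":
            requires.setdefault(src, set()).add(dst)

    # Skills the student is assumed to have from completed courses.
    known = set()
    for course in completed:
        known |= teaches.get(course, set())

    candidates = []
    for course in sorted((set(teaches) | set(requires)) - set(completed)):
        needed = requires.get(course, set())
        candidates.append({
            "course": course,
            "satisfied": sorted(needed & known),  # requires edges already met
            "missing": sorted(needed - known),    # skills still lacking
        })
    candidates.sort(key=lambda c: (len(c["missing"]), c["course"]))
    return candidates[:k]
```

The `satisfied` and `missing` lists are the graph evidence the task asks for, so the LLM's explanation can be compared against them.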
