GraphPrompts
Graph-based prompting framework: inject graph-structured information into LLM context
TL;DR
- GraphPrompts is for tasks where the data is naturally a graph: knowledge graphs, social graphs, citation graphs, molecule graphs, recommendation graphs, etc.
- Core idea: inject graph-structure information (nodes/edges/neighborhoods/subgraphs) into the LLM context in a controlled way, so the model can leverage structure instead of just reading flat text.
- Two common approaches in practice: graph-to-text (linearize the subgraph) and retrieve-then-prompt (retrieve relevant nodes/paths/communities first, then generate answers).
- Key risks: context explosion, structure loss (linearization drops information), and unverifiable outputs. Pair with evaluation and a self-check rubric.
Core Concepts
GraphPrompts generally refers to a class of "graph-aware prompting" methods: you have a graph (nodes + edges) and you want the LLM to use the graph's structure and attributes for downstream tasks.
In many real business scenarios, graph information already exists:
- CRM/user relationships: user-user, user-company, user-behavior
- Content/search: query-document, document-document (citation/similarity), topic graph
- Courses/learning: lesson-prerequisite, student-lesson, skill graph
If you dump all this structure directly to the model, the common problems are: too much information, unclear structure, and the model ignoring key relationships. That's why GraphPrompts emphasizes "how to select the subgraph + how to represent the structure + how to constrain the output."
How to Apply
Step 1: Define the Task and Output (Task-First)
First, pin down your downstream task type:
- classification (e.g., node label)
- link prediction (e.g., recommendation/relationship prediction)
- ranking (e.g., candidate node ranking)
- question answering (fact-based QA over a graph)
Then define the output format (keep it structured), for example:
- label (single label)
- top-k candidates (list + score)
- answer + evidence (answer + referenced node/edge)
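The "answer + evidence" format can be pinned down as a machine-checkable schema before any prompting happens. A minimal sketch in Python, where the field names (`answer`, `evidence`, `node_id`, `edge`) are illustrative assumptions rather than a fixed standard:

```python
import json

# Hypothetical output schema for "answer + evidence" -- field names are
# illustrative assumptions, not a standard.
OUTPUT_SCHEMA = {
    "type": "object",
    "required": ["answer", "evidence"],
    "properties": {
        "answer": {"type": "string"},
        "evidence": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["node_id"],
                "properties": {
                    "node_id": {"type": "string"},
                    "edge": {"type": "string"},
                },
            },
        },
    },
}


def validate_output(raw: str) -> dict:
    """Parse the model's reply and check the required top-level fields."""
    data = json.loads(raw)
    for field in OUTPUT_SCHEMA["required"]:
        if field not in data:
            raise ValueError(f"missing field: {field}")
    return data
```

Rejecting replies that fail this check (and re-prompting) is usually cheaper than trying to parse free-form prose downstream.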
Step 2: Subgraph Selection
Don't stuff the entire graph into the context. Common selection strategies:
- k-hop neighborhood (1-2 hops around the target node)
- shortest path / random walk (extract relevant paths)
- community / cluster summary (community-level summaries)
- hybrid: use traditional retrieval/graph algorithms first, then let the LLM re-rank
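The k-hop strategy above is a plain BFS and needs no graph library. A self-contained sketch, assuming the graph is stored as a simple adjacency dict (node to list of neighbors):

```python
from collections import deque


def k_hop_subgraph(adj, seed, k):
    """Collect all nodes within k hops of `seed` via BFS.

    `adj` maps node -> iterable of neighbors (an undirected view).
    Returns (nodes, edges) where edges keeps only pairs fully inside
    the neighborhood.
    """
    seen = {seed: 0}  # node -> hop distance from seed
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if seen[node] == k:  # stop expanding at the hop limit
            continue
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen[nbr] = seen[node] + 1
                queue.append(nbr)
    nodes = set(seen)
    edges = [(u, v) for u in nodes for v in adj.get(u, ()) if v in nodes]
    return nodes, edges
```

For real graphs you would typically reach for a library (e.g. NetworkX's ego-graph utilities), but the cut-off logic is the same: expand, stop at `k`, keep only interior edges.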
Step 3: Represent the Structure (Graph Representation)
Three common representation methods (balancing controllability vs. information loss):
- Adjacency list: Most direct, complete information, but can get long
- Triples (subject-predicate-object): Great for knowledge graphs
- Natural language summary: Short, but loses structural detail
Add a "schema" layer to reduce ambiguity -- fix fields like: node_id, type, attributes, edge_type.
Prompt Template
You are a graph-aware assistant.
Task: <describe task>
Output format: <JSON schema or strict format>
Graph schema:
- Node: id, type, attributes
- Edge: source, target, type
Given the subgraph below, solve the task using only the provided graph information.
If the graph lacks sufficient evidence, respond with "Insufficient evidence" and list missing information.
Subgraph:
<nodes>
<edges>
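The template above can be filled mechanically from a selected subgraph. A sketch, assuming nodes and edges are dicts matching the schema fields (`id`, `type`, `attributes` / `source`, `target`, `type`); the arrow rendering for edges is an illustrative choice:

```python
def build_prompt(task, output_format, nodes, edges):
    """Fill the graph-aware prompt template from structured node/edge dicts."""
    node_lines = "\n".join(
        f"- {n['id']} ({n['type']}): {n.get('attributes', {})}" for n in nodes
    )
    edge_lines = "\n".join(
        f"- {e['source']} -[{e['type']}]-> {e['target']}" for e in edges
    )
    return (
        "You are a graph-aware assistant.\n"
        f"Task: {task}\n"
        f"Output format: {output_format}\n"
        "Graph schema:\n"
        "- Node: id, type, attributes\n"
        "- Edge: source, target, type\n"
        "Given the subgraph below, solve the task using only the provided "
        "graph information.\n"
        'If the graph lacks sufficient evidence, respond with '
        '"Insufficient evidence" and list missing information.\n'
        f"Subgraph:\n{node_lines}\n{edge_lines}"
    )
```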
How to Iterate
- Reduce context: Shrink from 2-hop to 1-hop; recall 50 first, then re-rank to 10
- Add evidence constraints: Require that each conclusion maps to a node_id/edge (for downstream verification)
- Structured output: Use JSON schema with fixed fields -- avoid the model writing prose
- Add negative signals: Have the model explain "which edges/nodes are missing that would be needed to support this conclusion"
- Compare baselines: For the same task, compare "no graph" vs "graph-to-text" vs "retrieve-then-prompt"
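The "recall 50, then re-rank to 10" pattern above is generic enough to sketch in a few lines. Here `score` is any callable standing in for the expensive stage (possibly the LLM itself) -- that substitution is an assumption of this sketch:

```python
def recall_then_rerank(candidates, score, recall_k=50, final_k=10):
    """Cheap recall first, then re-rank only the small pool.

    `candidates` should arrive pre-ordered by the cheap recall signal
    (e.g. BM25 or k-hop distance); `score` is the expensive scorer
    applied only to the top `recall_k` items.
    """
    pool = candidates[:recall_k]        # cheap recall stage
    pool.sort(key=score, reverse=True)  # expensive re-rank on the small pool
    return pool[:final_k]
```

The point of the two-stage split is cost: the expensive scorer touches at most `recall_k` items, regardless of graph size.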
Self-Check Rubric
- Did the model only use the given subgraph (no external knowledge/guessing)?
- Can each conclusion in the output point to a specific node_id/edge?
- Does the subgraph selection cover key relationships (any missing critical hops/paths)?
- Is the context manageable (no token explosion)? Any ambiguity in the structure representation?
Practice
Exercise: Use a "course prerequisite graph" for recommendations.
- Node: course, skill
- Edge: requires, teaches
- Task: Given a student's completed course list, recommend the next 3 courses and explain the graph evidence for each recommendation (which requires edges are satisfied, which skill nodes are missing).
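A deterministic baseline for this exercise is useful as a comparison point before bringing in the LLM. The sketch below walks only the requires edges; the toy graph, course names, and dict-based representation are made-up illustrations:

```python
def recommend_next(requires, completed, k=3):
    """Recommend up to k courses whose `requires` edges are all satisfied.

    `requires` maps course -> set of prerequisite courses; the graph
    evidence for each recommendation is exactly that its prerequisite
    set is a subset of `completed`.
    """
    done = set(completed)
    eligible = []
    for course, prereqs in sorted(requires.items()):
        if course in done:
            continue  # already taken
        if not (prereqs - done):  # all requires edges satisfied
            eligible.append(course)
    return eligible[:k]


# Toy prerequisite graph, made up for illustration.
REQUIRES = {
    "intro": set(),
    "algorithms": {"intro"},
    "ml": {"intro", "algorithms"},
    "graphs": {"algorithms"},
}
```

The LLM's role in the full exercise is then to explain the evidence (satisfied requires edges, missing skill nodes) in prose, which you can verify against this baseline.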