logo
07

Context Engineering & Memory

⏱️ 35分钟

Context engineering and memory keep LLM responses relevant without blowing token budgets.

1) Goals

  • Provide just enough context (instructions + facts) for accuracy.
  • Control cost/latency by trimming or structuring history.
  • Maintain conversational continuity where needed.

2) Instruction Hierarchy

  • System: non-negotiable rules (role, language, safety).
  • Task/User: current request and constraints.
  • History: only necessary turns; summarize older content.
  • Tools: function specs and expectations.

3) History Management

  • Sliding window: keep recent N turns.
  • Summarization: compress older history into bullets; include IDs/time.
  • Topical caches: store per-topic summaries; swap in/out as topic changes.
  • Reset triggers: new topic? re-send core instructions; drop stale history.

4) Context Packing for RAG/Chat

  • Strict budget: target ≤ 60-70% of context limit; reserve for output.
  • Ordering: instructions → constraints → retrieved snippets (with IDs) → question.
  • Deduplicate snippets; group by source; include citation IDs.
  • Dynamic selection: choose top-k by relevance + recency + source diversity.

5) Structured Facts

  • Provide facts as bullet lists or key-value blocks, not prose.
  • Use IDs for each fact for citation/traceback.
  • For numbers/dates, keep canonical units and formats.

6) Session Memory Patterns

  • Short-term: recent dialog + working set.
  • Long-term: vector or key-value store of facts/preferences; retrieve by query + tenant/user.
  • Ephemeral: auto-expire or rotate; respect privacy/PII limits.

7) Safety & Leakage Prevention

  • Drop user-provided prompt fragments from summaries to avoid prompt injection persistence.
  • Redact secrets/PII before storing/retrieving.
  • Tag data by tenant/user/region; filter on retrieval.

8) Testing & Validation

  • Token audits: measure context size under typical/peak conditions.
  • Regression checks: ensure core instructions remain present after packing.
  • Topic-switch tests: verify summaries and resets behave.

9) Minimal Checklist

  • Instruction hierarchy enforced; core rules always included.
  • History trimmed/summarized with IDs; budgeted context ≤ 70% of limit.
  • Retrieved snippets deduped, cited, and filtered by tenant.

📚 相关资源