
Claude Code Internals: Context & Memory Management

⏱️ 45 min


This chapter explains how Claude Code manages context during long tasks, and translates those practices into reusable engineering patterns. The goal: when you're building agents or debugging long tasks, you can reliably control context structure, cost, and signal density.

  • Context isn't "more is better" -- it's "minimum high-signal set."
  • Claude Code prefers just-in-time reads over bulk loading.
  • Compaction isn't losing information -- it's reorganizing it.
  • The file system is cheap external memory.
  • Tool outputs are the biggest token cost source and need throttling.

What You'll Learn

  • Claude Code's context management principles and techniques
  • How to set a context budget and compaction trigger
  • How to use the file system for progressive disclosure
  • How to control token cost from tool outputs

Core Model

Claude Code's context management breaks down into three layers:

  1. Fixed layer: Long-lived stable rules (e.g., CLAUDE.md, system constraints)
  2. Task layer: Current task goals, acceptance criteria, key facts
  3. Dynamic layer: Search results, tool outputs, execution logs

The core principle: keep the fixed and task layers lightweight and stable. The dynamic layer loads on demand and gets compressed as the task progresses.
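The three layers can be sketched as a small data structure. This is a minimal illustration of the model, not a Claude Code API; the names `ContextLayers`, `assemble`, and `compress_dynamic` are invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ContextLayers:
    fixed: list[str] = field(default_factory=list)    # CLAUDE.md, system rules
    task: list[str] = field(default_factory=list)     # goals, acceptance criteria
    dynamic: list[str] = field(default_factory=list)  # tool outputs, search results

    def assemble(self) -> str:
        # Fixed and task layers stay stable across the whole task;
        # dynamic is the only part that grows.
        return "\n\n".join(self.fixed + self.task + self.dynamic)

    def compress_dynamic(self, summarize) -> None:
        # Only the dynamic layer is eligible for compression: replace
        # its entries with a single summary produced by `summarize`.
        self.dynamic = [summarize(self.dynamic)]
```

The key design choice is that compression never touches the fixed or task layers, so rules and goals survive any number of compaction passes.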

Key Practices

1) Context Budgeting

  • Set a budget before starting the task
  • Break the task into small phases, each with a token cost limit
  • Trigger compaction when context usage exceeds a threshold (e.g., 70-80% of the budget)
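The budgeting steps above can be sketched as a small tracker. The 4-characters-per-token heuristic and the 0.8 default threshold are assumptions for illustration, not Claude Code's actual accounting.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): ~4 characters per token.
    return max(1, len(text) // 4)


class ContextBudget:
    def __init__(self, limit_tokens: int, trigger_ratio: float = 0.8):
        self.limit = limit_tokens
        self.trigger = trigger_ratio  # e.g., compact at 80% of budget
        self.used = 0

    def add(self, text: str) -> bool:
        """Record the cost of new context; return True when compaction should run."""
        self.used += estimate_tokens(text)
        return self.used >= self.limit * self.trigger
```

In practice each task phase would get its own `ContextBudget`, so a runaway phase triggers compaction before it starves the phases after it.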

2) Progressive Disclosure

Don't load big chunks all at once. Get file paths or headings first, then read on demand.

  • Use rg to find relevant files
  • Use head/tail or partial reads
  • Only keep information needed for the current decision
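A minimal sketch of this two-step pattern in pure stdlib Python: index paths first (cheap), then do a partial read on demand, like `head -n`. The function names are illustrative.

```python
from pathlib import Path


def list_markdown_files(root: str) -> list[str]:
    # Step 1: build a cheap index of paths only -- no file contents
    # enter the context yet.
    return sorted(str(p) for p in Path(root).rglob("*.md"))


def read_head(path: str, n_lines: int = 30) -> str:
    # Step 2: partial read of one chosen file, like `head -n 30`.
    with open(path, encoding="utf-8") as f:
        return "".join(line for _, line in zip(range(n_lines), f))
```

The point is the ordering: the agent decides which file matters from the path list alone, and only then pays the token cost of reading it.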

3) Tool Output Throttling

Most context cost comes from tool outputs. Control strategies include:

  • Summarize outputs instead of keeping full text
  • Paginate / filter / truncate
  • Prefer structured output (tables / JSON / lists)
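A truncation cap is the simplest of these strategies to sketch. The head-and-tail split and the 2000-character default are illustrative choices, not fixed limits from Claude Code.

```python
def truncate_output(text: str, max_chars: int = 2000) -> str:
    """Cap an oversized tool output: keep the head and tail, mark the cut."""
    if len(text) <= max_chars:
        return text
    half = max_chars // 2
    omitted = len(text) - max_chars
    return f"{text[:half]}\n... [{omitted} chars truncated] ...\n{text[-half:]}"
```

Keeping both ends matters for logs and stack traces, where the error usually sits at the tail but the command that caused it sits at the head.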

4) Compaction Strategy

Compaction's goal is "keep critical info + remove redundancy." Suggested structure:

  • Files touched
  • Decisions made
  • Open questions
  • Next actions
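The four-part structure above can be captured as a template so every compaction pass produces the same shape. The dataclass and its `render` format are assumptions for this sketch, not a Claude Code API.

```python
from dataclasses import dataclass, field


@dataclass
class CompactionSummary:
    files_touched: list[str] = field(default_factory=list)
    decisions: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)
    next_actions: list[str] = field(default_factory=list)

    def render(self) -> str:
        # Emit the four sections in a fixed order so successive
        # compactions stay comparable.
        sections = [
            ("Files touched", self.files_touched),
            ("Decisions made", self.decisions),
            ("Open questions", self.open_questions),
            ("Next actions", self.next_actions),
        ]
        lines = []
        for title, items in sections:
            lines.append(f"## {title}")
            lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)
```

A fixed template like this is what makes compaction "reorganizing, not losing": anything that fits one of the four slots survives; everything else is deliberately dropped.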

Example Workflow

  1. Load rules: CLAUDE.md + AGENTS.md
  2. Locate target files: rg --files -g "*.md" src/content/learn/ai-engineer
  3. Read only what's needed: Check similar file structures first
  4. Generate content and write
  5. Review context: Any redundant outputs?

Anti-Patterns

  • Loading an entire doc library at once
  • Letting huge tool outputs crowd out the rest of the context
  • Deferring compaction during long tasks until the context is unmanageable
  • Re-reading the same content without summarizing

Checklist

  • Is the context budget defined?
  • Is there progressive disclosure?
  • Are tool outputs capped?
  • Is there a compaction trigger?
  • Are rules centralized in CLAUDE.md/AGENTS.md?

Practice Task

  • Use this chapter's three-layer model to map context for a real project
  • Set an 80% compaction trigger threshold and design a summary template
  • Pick a tool output scenario and implement pagination or summarization

Skill Metadata

  • Created: 2025-12-26
  • Last Updated: 2025-12-26
  • Author: JR Academy
  • Version: 1.0.0