Claude Code Internals: Context & Memory Management
Claude Code Context Management
This chapter explains how Claude Code manages context during long tasks and translates those practices into reusable engineering patterns. The goal is that, when you build agents or debug long tasks, you can reliably control context structure, cost, and signal density.
- Context isn't "more is better" -- it's "minimum high-signal set."
- Claude Code prefers just-in-time reads over bulk loading.
- Compaction isn't losing information -- it's reorganizing it.
- The file system is cheap external memory.
- Tool outputs are the largest source of token cost and need throttling.
What You'll Learn
- Claude Code's context management principles and techniques
- How to set a context budget and compaction trigger
- How to use the file system for progressive disclosure
- How to control token cost from tool outputs
Core Model
Claude Code's context management breaks down into three layers:
- Fixed layer: Long-lived stable rules (e.g., CLAUDE.md, system constraints)
- Task layer: Current task goals, acceptance criteria, key facts
- Dynamic layer: Search results, tool outputs, execution logs
The core principle: keep the fixed and task layers lightweight and stable. The dynamic layer loads on demand and gets compressed as the task progresses.
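The three layers can be sketched as a small data structure. This is a minimal illustration, not Claude Code's actual implementation: the class name LayeredContext and its methods are hypothetical, chosen only to show that the dynamic layer is compressed while the fixed and task layers stay untouched.

```python
from dataclasses import dataclass, field


@dataclass
class LayeredContext:
    """Hypothetical container for the three layers described above."""

    fixed: list[str] = field(default_factory=list)    # long-lived rules, e.g. CLAUDE.md text
    task: list[str] = field(default_factory=list)     # goals, acceptance criteria, key facts
    dynamic: list[str] = field(default_factory=list)  # search results, tool outputs, logs

    def add_dynamic(self, item: str) -> None:
        """Load a dynamic item on demand."""
        self.dynamic.append(item)

    def compress_dynamic(self, summary: str) -> None:
        """Replace all dynamic items with one summary; other layers are untouched."""
        self.dynamic = [summary]

    def render(self) -> str:
        """Assemble the context text in fixed -> task -> dynamic order."""
        layers = (self.fixed, self.task, self.dynamic)
        return "\n\n".join("\n".join(layer) for layer in layers if layer)
```

Keeping the layers separate makes it cheap to compress only the part that grows, which is usually the dynamic layer.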
Key Practices
1) Context Budgeting
- Set a budget before starting the task
- Break the task into small phases, each with a token cost limit
- Trigger compaction when context usage exceeds a threshold (e.g., 70-80%)
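A budget with a compaction trigger could look roughly like the sketch below. The names ContextBudget and estimate_tokens are hypothetical, and the 4-characters-per-token estimate is a rough assumption rather than a real tokenizer; swap in an actual token counter if you have one.

```python
from dataclasses import dataclass


def estimate_tokens(text: str) -> int:
    """Crude estimate: assumes about 4 characters per token (not a real tokenizer)."""
    return max(1, len(text) // 4)


@dataclass
class ContextBudget:
    limit_tokens: int                # total budget, set before the task starts
    compact_threshold: float = 0.75  # inside the 70-80% range suggested above
    used_tokens: int = 0

    def add(self, text: str) -> None:
        """Record text that has entered the context."""
        self.used_tokens += estimate_tokens(text)

    @property
    def usage(self) -> float:
        """Fraction of the budget consumed so far."""
        return self.used_tokens / self.limit_tokens

    def should_compact(self) -> bool:
        """True once usage reaches the compaction threshold."""
        return self.usage >= self.compact_threshold
```

In use, the agent loop would call add() after each tool result and check should_compact() before starting the next phase.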
2) Progressive Disclosure
Don't load big chunks all at once. Get file paths or headings first, then read on demand.
- Use rg to find relevant files
- Use head/tail or partial reads
- Only keep information needed for the current decision
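The same two-step idea (paths first, partial reads second) can be sketched in Python using only the standard library. The function names list_candidates and read_head are hypothetical helpers for illustration.

```python
from itertools import islice
from pathlib import Path


def list_candidates(root: str, pattern: str = "*.md") -> list[str]:
    """Step 1: return only matching file paths, not their contents."""
    return sorted(str(p) for p in Path(root).rglob(pattern))


def read_head(path: str, max_lines: int = 20) -> str:
    """Step 2: read only the first max_lines lines of one chosen file."""
    with open(path, encoding="utf-8") as f:
        return "".join(islice(f, max_lines))
```

Only the paths enter the context at first; file contents are pulled in one bounded slice at a time, when a decision actually needs them.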
3) Tool Output Throttling
Most context cost comes from tool outputs. Control strategies include:
- Summarize outputs instead of keeping full text
- Paginate / filter / truncate
- Prefer structured output (tables / JSON / lists)
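One simple way to truncate is to keep the head and tail of a long output and record how much was dropped. The sketch below assumes line-based output; throttle_output and its parameters are hypothetical names, and the defaults are arbitrary.

```python
def throttle_output(text: str, max_lines: int = 40, keep_tail: int = 10) -> str:
    """Keep the head and tail of a long output and note how many lines were dropped."""
    if not 0 <= keep_tail <= max_lines:
        raise ValueError("keep_tail must be between 0 and max_lines")
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text  # short outputs pass through unchanged
    head_count = max_lines - keep_tail
    omitted = len(lines) - max_lines
    marker = f"... [{omitted} lines omitted] ..."
    # Slice from an explicit index so keep_tail == 0 yields an empty tail.
    tail = lines[len(lines) - keep_tail:]
    return "\n".join(lines[:head_count] + [marker] + tail)
```

Keeping the tail matters for logs and test runs, where the final lines often hold the error or the summary.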
4) Compaction Strategy
Compaction's goal is "keep critical info + remove redundancy." Suggested structure:
- Files touched
- Decisions made
- Open questions
- Next actions
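The four-part structure above can be turned into a reusable summary template. This is a minimal sketch; the CompactionSummary class and its plain-text rendering are assumptions for illustration, not a format Claude Code prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class CompactionSummary:
    files_touched: list[str] = field(default_factory=list)
    decisions_made: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)
    next_actions: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the four sections as plain text, marking empty ones explicitly."""
        sections = [
            ("Files touched", self.files_touched),
            ("Decisions made", self.decisions_made),
            ("Open questions", self.open_questions),
            ("Next actions", self.next_actions),
        ]
        out: list[str] = []
        for title, items in sections:
            out.append(title)
            out.extend(f"- {item}" for item in (items or ["(none)"]))
        return "\n".join(out)
```

Rendering empty sections as "(none)" keeps the template shape stable across compactions, so a missing section is visibly empty rather than silently dropped.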
Example Workflow
- Load rules: CLAUDE.md + AGENTS.md
- Locate target files: rg --files -g "*.md" src/content/learn/ai-engineer
- Read only what's needed: Check similar file structures first
- Generate content and write
- Review context: Any redundant outputs?
Anti-Patterns
- Loading an entire doc library at once
- Letting huge tool outputs crowd out the rest of the context
- Delaying compaction in long tasks until the context is unmanageable
- Re-reading the same content without summarizing
Checklist
- Is the context budget defined?
- Is there progressive disclosure?
- Are tool outputs capped?
- Is there a compaction trigger?
- Are rules centralized in CLAUDE.md/AGENTS.md?
Related Pages
- Context Engineering Fundamentals
- Context Compression Strategies
- Claude Code Examples
- Tool Design for Agents
Practice Task
- Use this chapter's three-layer model to map context for a real project
- Set an 80% compaction trigger threshold and design a summary template
- Pick a tool output scenario and implement pagination or summarization
Skill Metadata
- Created: 2025-12-26
- Last Updated: 2025-12-26
- Author: JR Academy
- Version: 1.0.0