3.5 Context Compaction: Keeping Conversation History from Exploding

Chapter goal: Understand Claude Code's multi-layer context-compaction mechanisms, including the trigger conditions and implementation details of each strategy.

The problem: limited context window

Claude's context window is around 200K tokens. A complex coding session can easily exceed this:

Each large file read: roughly 5K-50K tokens
Detailed tool output: roughly 1K-5K tokens per call
Long conversation history: 500-2000 tokens per round
System Prompt: around 5K-15K tokens
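To make these figures concrete, here is a back-of-the-envelope sketch. The per-item costs are simply the midpoints of the ranges above, and the session shape (5 file reads, 20 tool calls, 30 rounds) is an invented example, not a measurement.

```typescript
// Hypothetical session shape; only the token ranges come from the list above.
interface SessionShape {
  fileReads: number
  toolCalls: number
  rounds: number
}

const CONTEXT_WINDOW = 200000 // approximate window size quoted in the text

// Midpoints of the quoted ranges (assumed for illustration only).
const COST = {
  fileRead: 27500, // 5K-50K
  toolCall: 3000, // 1K-5K
  round: 1250, // 500-2000
  systemPrompt: 10000, // 5K-15K
}

function estimateTokens(s: SessionShape): number {
  return (
    COST.systemPrompt +
    s.fileReads * COST.fileRead +
    s.toolCalls * COST.toolCall +
    s.rounds * COST.round
  )
}

const total = estimateTokens({ fileReads: 5, toolCalls: 20, rounds: 30 })
console.log(total, total > CONTEXT_WINDOW) // 245000 true
```

Even this moderate session lands above the window, which is why some form of compaction is unavoidable.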

Claude Code implements several compaction mechanisms to keep the context within this limit.

Overview of compaction mechanisms

Context size
    │
    ▼
normalizeMessagesForAPI()        ← baseline filtering (always runs)
    │ filter UI-only messages
    │
    ▼
AutoCompact                      ← auto-trigger near limit (≥85%)
    │ summarize history
    │ optimized by feature('CACHED_MICROCOMPACT')
    │
    ▼
ReactiveCompact                  ← when prompt_too_long error is received
    │ feature('REACTIVE_COMPACT')
    │
    ▼
ContextCollapse                  ← collapse tool-call results
    │ feature('CONTEXT_COLLAPSE')
    │
    ▼
SnipCompact                      ← history truncation
      feature('HISTORY_SNIP')

normalizeMessagesForAPI()

This is baseline filtering that runs before every API call:

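// src/utils/messages.ts
// Signature reconstructed from context; the parameter type is assumed.
export function normalizeMessagesForAPI(messages: Message[]) {
  return messages
    .filter(m => !isUIOnlyMessage(m))    // drop ProgressMessage etc.
    .filter(m => !isTombstoneMessage(m)) // drop deleted-message placeholders
    .map(m => toAPIFormat(m))
    // ... more processing
}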

Filtered-out message types:

ProgressMessage: tool progress updates (UI-only)
TombstoneMessage: placeholder for deleted messages
SystemMessage (specific kinds): UI notification messages
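The filtering idea can be shown in a self-contained sketch. The message shapes and the function name `normalizeSketch` below are simplified stand-ins invented for this example; the real types in src/utils/messages.ts are richer and are not shown in this section.

```typescript
// Hypothetical, simplified message shapes (assumptions, not the real types).
type Message =
  | { type: 'user' | 'assistant'; text: string }
  | { type: 'progress'; text: string } // UI-only tool progress
  | { type: 'tombstone' } // placeholder for a deleted message

interface APIMessage {
  role: 'user' | 'assistant'
  content: string
}

function normalizeSketch(messages: Message[]): APIMessage[] {
  const out: APIMessage[] = []
  for (const m of messages) {
    // Drop anything that exists only for the UI or marks a deletion.
    if (m.type === 'progress' || m.type === 'tombstone') continue
    out.push({ role: m.type, content: m.text })
  }
  return out
}

const normalized = normalizeSketch([
  { type: 'user', text: 'read main.ts' },
  { type: 'progress', text: 'reading...' },
  { type: 'tombstone' },
  { type: 'assistant', text: 'done' },
])
console.log(normalized.length) // 2
```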

AutoCompact: proactive compaction

Trigger timing

// src/services/compact/autoCompact.ts
// The function header is missing from this excerpt; the name below is assumed.
export function getTokenWarningState(
  contextTokens: number,
  maxTokens: number,
): TokenWarningState {
  const ratio = contextTokens / maxTokens
  if (ratio < 0.7) return 'safe'      // safe zone
  if (ratio < 0.85) return 'warning'  // warning zone (user sees warning)
  return 'critical'                    // trigger AutoCompact
}

When the conversation enters the critical zone, queryLoop triggers compaction before the next iteration.
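The boundary cases are worth checking by hand. The sketch below repeats the same thresholds in a self-contained form; the function names `getTokenWarningState` and `shouldAutoCompact` are assumptions made for this example, since the excerpt does not show the real ones.

```typescript
type TokenWarningState = 'safe' | 'warning' | 'critical'

// Same thresholds as the excerpt above; the function name is assumed.
function getTokenWarningState(contextTokens: number, maxTokens: number): TokenWarningState {
  const ratio = contextTokens / maxTokens
  if (ratio < 0.7) return 'safe'
  if (ratio < 0.85) return 'warning'
  return 'critical'
}

// Hypothetical loop-side check: compact before the next iteration when critical.
function shouldAutoCompact(contextTokens: number, maxTokens: number): boolean {
  return getTokenWarningState(contextTokens, maxTokens) === 'critical'
}

console.log(getTokenWarningState(120000, 200000)) // ratio 0.60 -> safe
console.log(getTokenWarningState(150000, 200000)) // ratio 0.75 -> warning
console.log(getTokenWarningState(170000, 200000)) // ratio 0.85 -> critical
```

Because both comparisons are strict, a ratio of exactly 0.85 already counts as critical.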

Compaction flow

1. Call Claude (the small Haiku model) to generate a summary
   - Input: all history messages
   - Output: conversation summary text

2. Build compacted message array
   - Keep: System Prompt (unchanged)
   - Replace: history messages → summary UserMessage

3. Continue current Turn
   - Message array shrinks to ~5K tokens
   - Raw history is persisted to disk (not lost)
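The three steps above can be sketched as follows. This is a minimal illustration, not the actual implementation: the `ChatMessage` shape, the injected `summarize` and `persist` callbacks, and the wording of the summary message are all assumptions. The System Prompt is not part of the array here because it is kept unchanged and handled separately.

```typescript
// Hypothetical message shape for illustration.
interface ChatMessage {
  role: 'user' | 'assistant'
  content: string
}

// The summarizer is injected so the sketch does not depend on any SDK.
// In the flow above it would be a call to a small model.
type Summarizer = (history: ChatMessage[]) => Promise<string>

async function compactHistory(
  history: ChatMessage[],
  summarize: Summarizer,
  persist: (raw: ChatMessage[]) => void,
): Promise<ChatMessage[]> {
  persist(history) // the raw history is saved first, so nothing is lost
  const summary = await summarize(history) // step 1: generate the summary
  // Step 2: the whole history collapses into a single summary user message.
  return [{ role: 'user', content: `Summary of earlier conversation:\n${summary}` }]
}
```

The returned array is what the current Turn continues with (step 3); its size depends only on the summary length, not on how long the original history was.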

CACHED_MICROCOMPACT (compile-switch optimization)

const compactModule = feature('CACHED_MICROCOMPACT')
  ? require('./services/compact/microCompact.js')
  : require('./services/compact/compact.js')

MicroCompact builds on Anthropic's prompt caching:

mark compaction boundaries as cacheable
reuse cached summary tokens in consecutive turns
reduce per-call cost and latency
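As a rough sketch of what "marking a boundary as cacheable" can look like: the Anthropic Messages API lets a content block carry `cache_control: { type: 'ephemeral' }` to mark a cache breakpoint. How MicroCompact actually places its breakpoints is not shown in this section, so the helper `withCachedSummary` and the local types below are assumptions.

```typescript
// Minimal local types mirroring the Messages API text-block shape.
interface TextBlock {
  type: 'text'
  text: string
  cache_control?: { type: 'ephemeral' }
}

interface ApiMessage {
  role: 'user' | 'assistant'
  content: TextBlock[]
}

// Hypothetical helper: put the cache breakpoint on the summary block so the
// prefix up to and including the summary can be reused by later calls.
function withCachedSummary(summary: string, recent: ApiMessage[]): ApiMessage[] {
  const summaryMessage: ApiMessage = {
    role: 'user',
    content: [{ type: 'text', text: summary, cache_control: { type: 'ephemeral' } }],
  }
  return [summaryMessage, ...recent]
}
```

Only the summary block is marked; the recent messages change on every call, so caching them would bring little benefit.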

ReactiveCompact: reactive compaction

Trigger condition: API returns prompt_too_long (HTTP 400)

// in src/query.ts
if (feature('REACTIVE_COMPACT') && reactiveCompact) {
  if (reactiveCompact.isPromptTooLongMessage(lastMessage) && 
      !state.hasAttemptedReactiveCompact) {
    // Compact immediately, without waiting for the next loop iteration
    const compacted = await reactiveCompact.compact(state.messages, ...)
    state = {
      ...state,
      messages: compacted.messages,
      hasAttemptedReactiveCompact: true,  // try only once
    }
    continue  // retry with compacted messages
  }
}

This is a last-resort safeguard for cases where AutoCompact did not trigger in time. It runs at most once per Turn to avoid an infinite compaction loop.
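The "compact once, then retry" control flow can be isolated in a small self-contained sketch. The error shape (an object with `code: 'prompt_too_long'`), the helper names, and the injected `send` and `compact` functions are assumptions for illustration; only the single-attempt guard mirrors the excerpt above.

```typescript
// Hypothetical check: treats any object with code 'prompt_too_long' as the error.
function isPromptTooLong(err: unknown): boolean {
  return typeof err === 'object' && err !== null &&
    (err as { code?: unknown }).code === 'prompt_too_long'
}

async function sendWithReactiveCompact<M, R>(
  messages: M[],
  send: (msgs: M[]) => Promise<R>,
  compact: (msgs: M[]) => Promise<M[]>,
): Promise<R> {
  let current = messages
  let hasAttemptedReactiveCompact = false
  while (true) {
    try {
      return await send(current)
    } catch (err) {
      // Rethrow unrelated errors, and a second prompt-too-long error.
      if (!isPromptTooLong(err) || hasAttemptedReactiveCompact) throw err
      current = await compact(current)
      hasAttemptedReactiveCompact = true // try only once
    }
  }
}
```

If compaction fails to shrink the prompt enough, the second failure surfaces to the caller instead of triggering another round, which is what prevents the loop from spinning forever.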

ContextCollapse: tool-result collapsing

Function: collapse large tool results into summaries

// controlled by feature('CONTEXT_COLLAPSE')
const contextCollapse = feature('CONTEXT_COLLAPSE')
  ? require('./services/contextCollapse/index.js')
  : null

When a tool call (e.g., FileReadTool) returns very large content, ContextCollapse can:

keep full content for the current Turn
replace it with a summary in later Turns ("File X has been read; key content is ...")
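A minimal sketch of this behavior, under stated assumptions: the `ToolResult` shape, the character threshold, and the way the replacement text is produced (a short preview instead of a real summary) are all invented for illustration. The actual module behind feature('CONTEXT_COLLAPSE') is not shown in this section.

```typescript
// Hypothetical shape of a stored tool result.
interface ToolResult {
  turn: number // Turn in which the tool ran
  toolName: string
  target: string // e.g. the path of the file that was read
  content: string
}

const COLLAPSE_THRESHOLD_CHARS = 2000 // assumed cutoff for "very large"

function collapseOldResults(results: ToolResult[], currentTurn: number): ToolResult[] {
  return results.map(r => {
    const isOld = r.turn < currentTurn
    const isLarge = r.content.length > COLLAPSE_THRESHOLD_CHARS
    // Results from the current Turn, and small results, keep their full content.
    if (!isOld || !isLarge) return r
    const preview = r.content.slice(0, 80) // stand-in for a real summary
    return {
      ...r,
      content: `[${r.toolName}] ${r.target} has been read; content begins: ${preview}`,
    }
  })
}
```

The function returns new objects rather than mutating its input, so the full content remains available to whoever holds the original array.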

Manual compaction: `/compact` command

Users can trigger compaction anytime:

/compact       # use the default summary
/compact Focus on preserving context about the authentication system  # custom summary instruction

Source path: src/commands/compact/
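The two forms differ only in whether text follows the command name. A hypothetical parser (not the code under src/commands/compact/, which is not shown here) might separate them like this:

```typescript
// Hypothetical result type; null instruction means "use the default summary".
interface CompactCommand {
  customInstruction: string | null
}

function parseCompactCommand(input: string): CompactCommand | null {
  const trimmed = input.trim()
  // Accept exactly "/compact" or "/compact <instruction>"; reject "/compactx".
  if (trimmed !== '/compact' && trimmed.indexOf('/compact ') !== 0) return null
  const rest = trimmed.slice('/compact'.length).trim()
  return { customInstruction: rest.length > 0 ? rest : null }
}

console.log(parseCompactCommand('/compact'))
console.log(parseCompactCommand('/compact keep the auth context'))
```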

Compact boundary

After /compact, a special CompactBoundaryMessage is inserted into history:

// Mark compaction boundary
type CompactBoundaryMessage = {
  type: 'compact_boundary'
  summary: string  // compaction summary content
  uuid: string
}

This boundary is used to:

show "context compacted" in UI
let getMessagesAfterCompactBoundary() send only post-boundary messages to API
indicate compaction status when resuming sessions
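To illustrate the second use, here is a sketch of selecting only post-boundary messages. It is not the real getMessagesAfterCompactBoundary(): the `ChatMessage` shape, the function name `messagesAfterLastBoundary`, and the choice to return the boundary's summary alongside the messages are assumptions.

```typescript
// Type copied from the definition above.
type CompactBoundaryMessage = {
  type: 'compact_boundary'
  summary: string
  uuid: string
}

// Hypothetical shape for ordinary messages.
type ChatMessage = { type: 'user' | 'assistant'; text: string; uuid: string }
type HistoryMessage = CompactBoundaryMessage | ChatMessage

interface PostBoundaryView {
  summary: string | null // summary from the latest boundary, if any
  messages: ChatMessage[] // messages that come after that boundary
}

function messagesAfterLastBoundary(history: HistoryMessage[]): PostBoundaryView {
  let summary: string | null = null
  let messages: ChatMessage[] = []
  for (const m of history) {
    if (m.type === 'compact_boundary') {
      // A newer boundary supersedes everything collected so far.
      summary = m.summary
      messages = []
    } else {
      messages.push(m)
    }
  }
  return { summary, messages }
}
```

With no boundary in the history, the sketch returns a null summary and every message, so an uncompacted session behaves exactly as before.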

Chapter 4: Tool system — the design of 53 tools