What is a Context Window
A context window is the fixed-size buffer of tokens that an AI model can process at once. It includes everything the model sees — the system prompt, conversation history, tool results, and attached data. Anything outside the window does not exist to the model. There is no memory beyond this buffer.
How it works
Every input to an AI model is converted into tokens — each roughly 3/4 of an English word. The model has a maximum token count it can process in a single request, and the input and the model's generated output share this limit:
| Model | Context window |
|---|---|
| GPT-4o | 128K tokens |
| Claude 3.5 Sonnet | 200K tokens |
| Claude Opus | 200K tokens |
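A quick way to reason about these limits is to estimate token counts from text length. The sketch below uses the common ~4 characters per token heuristic; real counts come from the model's own tokenizer (e.g. `tiktoken` for OpenAI models), and the `window` and `reserve` values are illustrative.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters (about 3/4 of a word) per token.
    Only for quick budgeting; a real tokenizer gives exact counts."""
    return max(1, len(text) // 4)

def fits_in_window(text: str, window: int = 128_000, reserve: int = 4_000) -> bool:
    """Check whether text fits, leaving `reserve` tokens for the model's reply."""
    return estimate_tokens(text) <= window - reserve
```

Reserving headroom for the reply matters because the output tokens count against the same window as the input.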
The context window contains everything: your system prompt, the conversation so far, any documents you attached, tool call results, and the model's own previous responses. As a conversation grows, older content either stays (consuming space) or gets dropped. Once the window is full, something must go.
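The simplest "something must go" policy is to drop the oldest turns while always keeping the system prompt. A minimal sketch, assuming messages are dicts with a `content` string and the system prompt is first (the message shape and the ~4 chars/token estimate are assumptions, not any particular API):

```python
def tokens(msg: dict) -> int:
    # Crude estimate: ~4 characters per token.
    return max(1, len(msg["content"]) // 4)

def prune_history(messages: list[dict], budget: int) -> list[dict]:
    """Drop the oldest non-system messages until the history fits `budget`.
    Assumes messages[0] is the system prompt, which is always kept."""
    system, rest = messages[0], messages[1:]
    used = tokens(system)
    kept: list[dict] = []
    for msg in reversed(rest):          # walk newest-first
        cost = tokens(msg)
        if used + cost > budget:
            break                       # everything older gets dropped
        kept.append(msg)
        used += cost
    return [system] + kept[::-1]        # restore chronological order
```

Production frameworks often summarize dropped turns instead of discarding them outright, but the budget arithmetic is the same.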
For an AI agent running a multi-step task, context management is critical. Each tool call adds input and output to the window. A 20-step debugging session can consume the entire window, leaving no room for the actual fix.
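One common defense in agent loops is capping each tool result so no single step can eat the remaining budget. A hedged sketch (the cap values and truncation marker are illustrative, and token math again uses the ~4 chars/token heuristic):

```python
def truncate_tool_result(result: str, remaining_tokens: int,
                         per_call_cap: int = 2_000) -> str:
    """Cap a tool result so a multi-step agent keeps room for later steps.
    Each result gets at most `per_call_cap` tokens, and never more than
    what remains in the window."""
    cap = min(per_call_cap, remaining_tokens)
    max_chars = cap * 4                 # invert the ~4 chars/token estimate
    if len(result) <= max_chars:
        return result
    return result[:max_chars] + "\n[... output truncated ...]"
```

Without a cap like this, a single verbose tool call (a full test-suite log, a large file read) can crowd out the context needed for the actual fix.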
Why it matters
The context window is the fundamental constraint of AI engineering. It determines how much information the model can reason about simultaneously. Too little context and the model makes uninformed decisions. Too much irrelevant context and the model gets distracted or hits the limit.
This is why RAG exists — to select only the most relevant information and inject it into the limited window. It is why MCP resources matter — they give applications structured control over what goes into the context. And it is why agent frameworks need summarization and pruning strategies for long-running tasks.
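The RAG idea above can be sketched as a selection problem: score candidate documents against the query and pack the best ones into a token budget. Real systems score with embedding similarity; the word-overlap scoring here is a deliberately toy stand-in.

```python
def select_context(query: str, docs: list[str], budget: int) -> list[str]:
    """Toy RAG selection: rank docs by word overlap with the query, then
    greedily pack the highest-scoring ones into the token budget."""
    q = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: -len(q & set(d.lower().split())))
    chosen, used = [], 0
    for doc in ranked:
        cost = max(1, len(doc) // 4)    # ~4 characters per token
        if used + cost > budget:
            continue                    # skip docs that would overflow
        chosen.append(doc)
        used += cost
    return chosen
```

The key property is that selection happens *before* the model sees anything: only the winners spend window space.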
See How Context Management Works for strategies to work within the context window effectively.