🔮The Codex

Context Window

The maximum amount of text an AI can consider at once during a conversation.

📖 Apprentice Explanation

The context window is like an AI's short-term memory. A bigger context window means the AI can remember more of your conversation and process longer documents.
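Because the window is finite, applications typically keep only the newest messages that fit the budget. A minimal sketch of that idea, using a crude word count as a stand-in for a real tokenizer (the function name and estimate are illustrative assumptions, not a library API):

```python
# Sketch: keep only the most recent messages that fit a token budget.
# Tokens are approximated by whitespace-split words; real tokenizers differ.

def fit_to_context(messages, max_tokens):
    """Return the newest messages whose combined length fits max_tokens."""
    kept, used = [], 0
    for msg in reversed(messages):          # walk newest-first
        cost = len(msg.split())             # crude token estimate
        if used + cost > max_tokens:
            break                           # oldest messages get dropped
        kept.append(msg)
        used += cost
    return list(reversed(kept))             # restore chronological order

history = ["hi there", "tell me about context windows",
           "a context window is the model's working memory"]
print(fit_to_context(history, 12))
```

Production systems do the same thing with the model's actual tokenizer, and often summarize dropped turns instead of discarding them outright.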

🧙 Archmage Notes

Context windows range from 2K tokens (GPT-3) to 200K+ (Claude 3). Techniques like sliding window attention, sparse attention, and RoPE scaling extend effective context. With standard attention, compute and memory grow quadratically with context length, which is the main cost of going longer.