Series · Part 5 of 21
Foundation · Why AI Forgets
Mid-conversation, AI suddenly doesn't remember what you said earlier. This isn't a bug — it's the context window. Here's how it works and how to work around it.
You’re deep in a conversation. You’ve explained your problem, set up context, asked five follow-up questions. Then you ask something simple — and the AI acts like you never introduced yourself.
It’s not a glitch. It’s the context window.
What’s Actually Happening
Every time the AI responds, it reads the entire conversation from the beginning. Not a compressed summary — the actual raw text, token by token, from the first message to the last.
But there’s a limit to how much it can read in one go. That limit is the context window.
When your conversation exceeds it, something has to give. Older messages get cut. The model no longer has access to things you said early on. From its perspective, those messages never existed.
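Here is a minimal sketch of that cut-off, assuming the simplest possible policy: drop the oldest messages first until what remains fits. Real chat products may trim differently (some pin instructions or summarize old turns), and the names here are hypothetical. `count_tokens` is a crude word count standing in for a real tokenizer.

```python
def count_tokens(text: str) -> int:
    """Rough stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def fit_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages whose combined size fits in max_tokens."""
    kept: list[str] = []
    used = 0
    # Walk backwards from the newest message so the oldest are dropped first.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


conversation = [
    "Hi, my name is Sam and I run a bakery.",
    "I need help pricing a new sourdough loaf.",
    "Flour costs went up last month.",
    "What should I charge per loaf?",
]
# With a (tiny, illustrative) 20-token window, the first message no longer fits.
visible = fit_to_window(conversation, max_tokens=20)
```

In this toy run, `visible` holds only the last three messages. The introduction ("my name is Sam") is gone, which is exactly why the model can seem to forget who you are.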
It’s Not “Memory” in the Human Sense
Human memory is associative and persistent — you remember things for years and can connect distant events. The model’s “memory” is just: whatever fits in the current window.
This is why:
- Long research sessions should be broken into fresh chats
- Important context should be repeated or summarized at the start of each session
- “As we discussed earlier…” doesn’t help if earlier is out of the window
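The "repeat important context" advice can be sketched as follows, assuming you keep a short list of key facts yourself and always send them first, then fill whatever room is left with the most recent messages. The function names and the word-count tokenizer are hypothetical stand-ins, not any product's actual behavior.

```python
def count_tokens(text: str) -> int:
    """Rough stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def build_prompt(pinned_facts: list[str], history: list[str], max_tokens: int) -> list[str]:
    """Always include pinned facts, then as many recent messages as still fit."""
    budget = max_tokens - sum(count_tokens(fact) for fact in pinned_facts)
    if budget < 0:
        raise ValueError("Pinned facts alone exceed the window")
    recent: list[str] = []
    # Newest messages get priority for the remaining budget.
    for message in reversed(history):
        cost = count_tokens(message)
        if cost > budget:
            break
        recent.append(message)
        budget -= cost
    recent.reverse()  # restore chronological order
    return pinned_facts + recent


pinned = ["User is Sam, who runs a bakery."]
history = [
    "Hi, my name is Sam and I run a bakery.",
    "I need help pricing a new sourdough loaf.",
    "Flour costs went up last month.",
    "What should I charge per loaf?",
]
prompt = build_prompt(pinned, history, max_tokens=20)
```

The trade-off is that pinned facts use up part of the window on every turn, so fewer recent messages fit. In exchange, the facts you care about never fall out.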
Context Windows Are Getting Larger
The original GPT-3 had 2,048 tokens, and GPT-3.5 had 4,096. GPT-4 Turbo has 128,000. Claude has up to 200,000. Gemini 1.5 can handle 1,000,000.
Bigger windows are better — but they’re also slower and more expensive. Models still struggle with the “lost in the middle” problem: they’re better at remembering the beginning and end of a long context than the stuff in the middle.
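One mitigation sometimes used for "lost in the middle" is to arrange a long prompt so the most important pieces sit at the edges. The sketch below assumes you already have text chunks ranked from most to least relevant. The function name is hypothetical, and whether reordering actually helps depends on the model.

```python
def reorder_for_long_context(chunks_by_relevance: list[str]) -> list[str]:
    """Put the most relevant chunks at the start and end, the least relevant in the middle.

    Expects the input sorted from most relevant to least relevant.
    """
    front: list[str] = []
    back: list[str] = []
    # Alternate: 1st goes to the front, 2nd to the back, 3rd to the front, and so on.
    for index, chunk in enumerate(chunks_by_relevance):
        if index % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)
    # Reverse the back half so the second-most relevant chunk ends up last.
    return front + back[::-1]


ranked = ["A", "B", "C", "D", "E"]  # "A" is most relevant, "E" least
ordered = reorder_for_long_context(ranked)
# ordered == ["A", "C", "E", "D", "B"]
```

Here the least relevant chunk, "E", lands in the middle, where a missed detail costs the least.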
Next up: There’s another way AI misleads you — and it’s more subtle than forgetting. Sometimes it remembers confidently, but what it remembers is wrong. That’s hallucination.
AI Demystified · 16 of 21 published