AI / ML

How LLMs Work: A Step-by-Step Explainer

From tokenization to next-token prediction: follow the full pipeline of a large language model with interactive demos at every stage.

You type a question. The model types an answer. But what happens in between? This interactive explainer walks through every stage of the LLM pipeline, from raw text to probability distribution.

HOW IT WORKS

Large Language Models

From raw text to intelligent output

1. Tokenization
Text → Numbers

The input text is split into tokens, which are small chunks of characters. Each token is mapped to an integer ID from a fixed vocabulary. LLMs don't read words; they read numbers.

"The cat sat"
↓
"The" → ID 464
" cat" → ID 3797
" sat" → ID 3332
TOKENS → TOKEN IDs
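
To make the text-to-ID mapping concrete, here is a minimal sketch of a tokenizer. The three vocabulary entries are the ones shown in the example above; the greedy longest-match rule and the `tokenize` / `detokenize` helper names are assumptions made for illustration. Production tokenizers typically learn a much larger subword vocabulary from data rather than using a hand-written table.

```python
# Toy vocabulary: the three entries come from the example above.
# A real vocabulary typically holds tens of thousands of entries.
VOCAB = {"The": 464, " cat": 3797, " sat": 3332}


def tokenize(text: str, vocab: dict[str, int]) -> list[tuple[str, int]]:
    """Greedy longest-match tokenization (a simplification for illustration)."""
    tokens = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate substring first, then shrink it.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in vocab:
                tokens.append((piece, vocab[piece]))
                pos = end
                break
        else:
            raise ValueError(f"no token covers the text at position {pos}")
    return tokens


def detokenize(ids: list[int], vocab: dict[str, int]) -> str:
    """Map IDs back to their text pieces and join them."""
    id_to_piece = {token_id: piece for piece, token_id in vocab.items()}
    return "".join(id_to_piece[token_id] for token_id in ids)


pairs = tokenize("The cat sat", VOCAB)
ids = [token_id for _, token_id in pairs]
print(pairs)
print(ids)
print(repr(detokenize(ids, VOCAB)))
```

Note that the leading space is part of the tokens " cat" and " sat", which is why joining the pieces reproduces the original string exactly.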
βœ‚οΈ Text β†’ Numbers
🌌 Numbers β†’ Vectors
πŸ‘οΈ Which words matter?
πŸ” Deep processing
🎯 Probabilities β†’ Word

The Pipeline in 5 Steps

| Step | What Happens |
| --- | --- |
| 1. Tokenization | Text is split into tokens, each mapped to a unique ID |
| 2. Embeddings | Token IDs become high-dimensional vectors |
| 3. Attention | The model works out which tokens are relevant to which |
| 4. Transformer Layers | Deep stacks of layers refine the representation from syntax to meaning |
| 5. Next Token Prediction | The model outputs a probability distribution over the vocabulary, and the next token is chosen from it |
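
Steps 2 to 4 can be sketched in a few lines of plain Python. This is a toy under loud assumptions, not how any particular model is implemented: the embedding size `DIM`, the random (unlearned) embedding vectors, and the helper names are all hypothetical, and the "layer" below omits the learned query/key/value projections, feed-forward network, normalization, and positional information that real transformer layers include.

```python
import math
import random

random.seed(0)  # fixed seed so repeated runs give the same numbers

TOKEN_IDS = [464, 3797, 3332]  # "The", " cat", " sat" from the example above
DIM = 4  # hypothetical embedding size; real models use far larger vectors

# Step 2: embeddings. Each token ID looks up a vector.
# Real models learn these vectors; here they are random placeholders.
embedding_table = {
    token_id: [random.gauss(0, 1) for _ in range(DIM)] for token_id in TOKEN_IDS
}
x = [embedding_table[token_id] for token_id in TOKEN_IDS]


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def causal_self_attention(vectors):
    """Step 3: single-head attention with query = key = value = input."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for i, query in enumerate(vectors):
        # Causal mask: position i only looks at positions 0..i.
        scores = [dot(query, vectors[j]) / math.sqrt(dim) for j in range(i + 1)]
        weights = softmax(scores)
        # The output is a weighted average of the visible vectors.
        mixed = [
            sum(w * vectors[j][d] for j, w in enumerate(weights))
            for d in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Step 4: stack simplified layers (attention plus a residual connection).
hidden = x
for _ in range(2):
    attended, weights = causal_self_attention(hidden)
    hidden = [[h + a for h, a in zip(hv, av)] for hv, av in zip(hidden, attended)]

for row in weights:
    print([round(w, 2) for w in row])
```

Each printed row shows how much one position attends to itself and to earlier positions. The first row is always `[1.0]`, because the first token has nothing earlier to look at.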

Each step has an interactive demo: watch tokenization happen live, see attention weights in action, or observe the model choosing between candidate tokens.
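
The final step, choosing between candidate tokens, can be sketched as a softmax over raw scores followed by a selection rule. The candidate tokens and their scores below are made up for illustration and do not come from a real model; the `temperature` parameter and the two selection strategies (greedy and sampling) are common choices, shown here as assumptions about how a decoder might be configured.

```python
import math
import random

# Hypothetical candidates and raw scores (logits) following "The cat sat".
# These numbers are invented for illustration only.
logits = {" on": 3.0, " down": 2.0, " quietly": 0.5, " banana": -2.0}


def softmax(scores: dict[str, float], temperature: float = 1.0) -> dict[str, float]:
    """Convert logits to probabilities; a lower temperature sharpens them."""
    scaled = {tok: s / temperature for tok, s in scores.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


probs = softmax(logits)

# Greedy decoding: always take the most probable token.
greedy = max(probs, key=probs.get)

# Sampling: draw a token in proportion to its probability.
rng = random.Random(0)
sampled = rng.choices(list(probs), weights=list(probs.values()), k=1)[0]

print({tok: round(p, 3) for tok, p in probs.items()})
print("greedy:", repr(greedy))
print("sampled:", repr(sampled))
```

Greedy decoding always returns the highest-scoring candidate, while sampling can return any candidate with nonzero probability, which is one reason the same prompt can produce different answers.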

Why This Matters

LLMs are often treated as magic. They’re not. The pipeline is well understood, mathematically elegant, and surprisingly accessible. Once you see each stage individually, the “black box” dissolves into engineering.