A Large Language Model is a neural network trained on massive text datasets to predict and generate human-like text. Examples include GPT-4, Claude, and Gemini. Scale determines capability. An LLM with billions of parameters can perform tasks that smaller models cannot. It can reason through multi-step problems, write code, analyze documents, and engage in complex dialogue.
You show the model a word and the words before it, and it learns to predict the next word. Repeat this billions of times across a huge swath of internet text, and something emerges. The model develops an internal representation of language, logic, and concepts. It learns that certain word sequences correlate with other sequences. It learns patterns in how humans think and write.
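The next-word objective described above can be illustrated with a deliberately tiny toy: instead of a neural network, we just count which word follows which in a small corpus. The corpus and function names are invented for this sketch; a real LLM optimizes the same objective with a neural network over billions of tokens.

```python
from collections import Counter, defaultdict

# Toy corpus (hypothetical example, not real training data).
corpus = "the cat sat on the mat the cat ate the fish".split()

# "Training": count how often each word follows each preceding word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the continuation seen most often after `word`."""
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" — it follows "the" twice, more than any other word
```

The point of the sketch is the shape of the task, not the method: prediction quality comes entirely from statistics of the training text, which is why more data and more capacity yield better predictions.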
This emergent behavior, where capability arises from scale without explicit programming, is what makes LLMs different from previous AI systems. LLMs don't follow rules. They generate text token by token, each time choosing among statistically likely continuations. This is also why they can hallucinate plausible-sounding falsehoods: they are pattern-matching machines, not knowledge databases.
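Token-by-token generation can be sketched as a loop: at each step, sample the next token from a probability distribution over continuations and feed it back in. Here the distribution comes from simple bigram counts over an invented corpus; in a real LLM it comes from the network's output layer.

```python
import random
from collections import Counter, defaultdict

random.seed(0)

# Toy "model": bigram counts over a hypothetical corpus.
corpus = "the cat sat on the mat the cat ate the fish".split()
model = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    model[prev][nxt] += 1

def generate(start, length=5):
    """Autoregressive generation: sample, append, repeat."""
    out = [start]
    for _ in range(length):
        dist = model[out[-1]]
        if not dist:  # no continuation ever observed; stop
            break
        words = list(dist)
        weights = [dist[w] for w in words]
        out.append(random.choices(words, weights=weights)[0])
    return " ".join(out)

print(generate("the"))
```

Note that the loop never consults facts, only frequencies. A fluent but false continuation is just as reachable as a true one, which is the mechanism behind hallucination.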
[Interactive visualizer: Large Language Model (LLM). Shows how model scale affects capabilities and how the attention mechanism works during text processing. Click any word to see how the model attends to different parts of the text.]