Discoverability surfaces
Training a smaller 'student' model to mimic a larger 'teacher' model's outputs. The student learns the teacher's behavior at a fraction of the inference cost. Distillation is how Claude Haiku, GPT-4o-mini, and Gemini Flash are produced — small, fast, cheap models that capture much of the larger model's behavior on common tasks. Critical to making frontier capability economically viable at scale.
Programmatic SEO, AEO, GEO, AIO, and the schemas behind them.
A content strategy that generates hundreds or thousands of search-optimized pages from structured data using templates.
The practice of structuring website content so that AI assistants and large language models (LLMs) can accurately cite it when answering user questions.
An open protocol that allows websites to notify search engines (Bing, Yandex, Seznam, Naver, and others) about new or updated content in real time.
Structuring content so that answer engines — Google's People-Also-Ask, Reddit pull-quotes, Quora top answers, ChatGPT search results, Perplexity citations — can extract a complete, self-contained answer in 40–80 words.
The subset of GEO/AEO targeted specifically at Google's AI Overviews (formerly SGE).
A Schema.
JavaScript Object Notation for Linked Data — the W3C-standard syntax for embedding structured data in web pages.
A Schema.
This definition is published under CC BY 4.0. Cite as:
The Data Nerd. "Model Distillation." VC Deal Flow Signal Glossary, https://signals.gitdealflow.com/define/distillation.
The free Acceleration Watch turns terms like Model Distillation into five named, accelerating startups every Sunday — translated into plain English, 21 to 47 days before the deck circulates. No code-reading, no card.