How AI Overviews Changed On-Page SEO (and What We Optimize For Now)
AI Overviews didn't kill SEO, but it did rewrite how on-page content has to be structured to earn visibility. The old pattern — a chatty intro, a long exposition, an answer buried in paragraph six — is invisible to the retrieval-augmented systems behind Google's AI Overviews, Perplexity, ChatGPT Search, and Claude's web citations. Those systems chunk content, extract answer passages, and synthesize summaries. Your page either provides clean, citable passages or it gets skipped.
This piece covers the specific on-page patterns we've seen earn the most AI citations across 50+ client sites since AI Overviews rolled out, plus the ones we've abandoned because they stopped working.
What LLM-based search actually rewards
Retrieval-augmented generation (RAG) systems behind AI search work in three phases: retrieve candidate passages from an index, rank them by relevance to the query, and synthesize a response that cites the top passages. The unit of retrieval isn't the page; it's the passage, typically a 100–300 token chunk. Pages structured as a series of distinct, citable passages earn citations. Pages that are a continuous wall of prose with no clear answer boundaries don't.
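To make the passage-level mechanics concrete, here is a minimal sketch of that retrieve-and-rank loop. The chunk size and the overlap-based scoring are illustrative assumptions, not any engine's actual pipeline (real systems use embedding similarity), but the structural point holds: ranking happens per passage, so a page with no passage boundaries has nothing clean to retrieve.

```python
# Minimal sketch of passage-level retrieval. Chunk sizes and the scoring
# function are illustrative stand-ins, not any search engine's real pipeline.

def chunk_page(text, max_tokens=250):
    """Split a page into passage-sized chunks (word count as a rough token stand-in)."""
    chunks, current = [], []
    for paragraph in text.split("\n\n"):
        words = len(paragraph.split())
        current_len = sum(len(p.split()) for p in current)
        if current and current_len + words > max_tokens:
            chunks.append("\n\n".join(current))
            current = []
        current.append(paragraph)
    if current:
        chunks.append("\n\n".join(current))
    return chunks

def score(passage, query):
    """Stand-in for embedding similarity: fraction of query terms found in the passage."""
    passage_terms = set(passage.lower().split())
    query_terms = set(query.lower().split())
    return len(passage_terms & query_terms) / len(query_terms)

def retrieve(page_text, query, top_k=3):
    """Retrieve and rank: the synthesis step only ever sees the top-scoring passages."""
    passages = chunk_page(page_text)
    return sorted(passages, key=lambda p: score(p, query), reverse=True)[:top_k]
```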
The specific features we've seen correlate with citation frequency:

- Explicit question-and-answer framing: H2 phrased as a question, first paragraph as a direct answer.
- Structured lists for enumerable content.
- Numeric specificity: real numbers beat 'many' or 'most'.
- Named entities: product names, company names, people, places.
- Original data or definitions not available elsewhere.

Articles optimized for human scanning tend to work for LLM chunking too; the two visual patterns overlap more than they diverge.
The answer-first pattern that gets cited
Every section that might answer a query should follow the same shape: one-sentence direct answer, two-to-four-sentence context, one-to-two-sentence caveat or nuance. This order — answer, context, caveat — matches how LLMs chunk and synthesize. The first sentence is what gets extracted; the surrounding context is what the LLM uses to confirm the answer is correct and complete.
A concrete example. Question: 'How long does SEO take to work?' Wrong opening: 'SEO is a long game that requires patience and strategic investment across many channels…' Right opening: 'Most SEO programs show meaningful traffic growth in 4–6 months, with full payback on investment in 9–14 months depending on category competitiveness.' The right opening is citable; the wrong one is filler that AI summarizers skip.
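As a rough illustration of the difference between those two openings, here's a crude specificity check we might run on a first sentence. The heuristic (digits, a range or time unit, reasonable length) is our own invention for this example, not how any AI search engine actually scores passages:

```python
import re

def looks_citable(first_sentence):
    """Crude heuristic: a citable opening states something specific, not filler."""
    has_number = bool(re.search(r"\d", first_sentence))
    has_range_or_unit = bool(re.search(r"\d\s*[-–]\s*\d|\bmonths?\b|\bweeks?\b|%", first_sentence))
    short_enough = len(first_sentence.split()) <= 40
    return has_number and has_range_or_unit and short_enough

looks_citable("SEO is a long game that requires patience and strategic investment.")  # False
looks_citable("Most SEO programs show meaningful traffic growth in 4-6 months.")      # True
```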
Entity-rich writing is the new on-page gold standard
LLMs grade relevance partly on entity density. A page about email marketing that names Klaviyo, HubSpot, Customer.io, Iterable, Braze, Mailchimp, and Postmark by their full product names scores higher than a page that says 'the leading email platforms.' This isn't keyword stuffing — you're providing the specific entities that anchor the topic in the model's knowledge graph.
Write product names in full on first use. Include version or tier names where relevant. Name people with their role and organization. Cite specific tools, integrations, and data sources by name. If you're describing a process, use real step names from real documentation. All of this makes the page chunk-friendly because each passage carries enough entity context to stand alone when extracted.
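If you want a quick sanity check on a draft, a back-of-the-envelope script like the one below works. The hand-maintained entity list is a stand-in for a real named-entity recognizer (a production version would use an NER model such as spaCy's), but it's enough to show the gap between a vague passage and an entity-rich one:

```python
# Back-of-the-envelope entity-density check. The hard-coded entity list is a
# stand-in for a real named-entity recognizer; swap in an NER model for real use.
KNOWN_ENTITIES = {"Klaviyo", "HubSpot", "Customer.io", "Iterable", "Braze", "Mailchimp", "Postmark"}

def entity_density(passage):
    """Named entities per 100 words, counting only the entities we track."""
    words = passage.split()
    hits = sum(1 for entity in KNOWN_ENTITIES if entity.lower() in passage.lower())
    return 100 * hits / max(len(words), 1)

vague = "The leading email platforms all support segmentation and automation."
specific = "Klaviyo, Customer.io, and Braze all support event-triggered segmentation."
entity_density(vague)     # 0.0
entity_density(specific)  # 37.5: three named entities in eight words
```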
What earns AI citations — and what doesn't
Original research, original definitions, and unusual-but-defensible claims get cited. Me-too summaries of other people's content don't. The test is brutal but clarifying: if your article were deleted tomorrow, would the web lose anything? If the answer is no, don't expect citations. The article might still rank for branded queries, but it won't earn the generative-engine visibility that's becoming the main source of long-tail discovery.
The content types we've seen earn the most AI citations across client projects: original survey data with statistically meaningful sample sizes, definitional pieces that crystallize a concept the industry talks about but hasn't formalized (think Andrew Chen's 'Cold Start Problem'), case studies with specific numbers and context, and opinionated-but-defensible frameworks that simplify a messy space. Content that gets cited least: listicles, generic how-tos that restate platform documentation, and AI-generated summaries of AI-generated summaries.
Key takeaways
- Structure content as citable passages: question H2s, direct-answer first sentences, supporting context, caveats last.
- Entity density matters. Use full product names, real tool names, real versions — LLMs reward specificity.
- Original data and original definitions get cited. Me-too summaries don't. If your page could be deleted without the web losing anything, expect minimal AI visibility.
- The unit of retrieval is the passage, not the page. Optimize every H2 section to stand alone when extracted.