How AI decides what to cite, rank, and surface

We read the papers so you can ship the changes. Each article distills a recent arXiv paper on LLM retrieval, citation behavior, or AI-driven search into something you can act on.

Recent Digests (18 articles)
01
Jun 4, 2026 10 min read

ChatGPT referral spikes can overstate AEO unless you control for platform growth

Why “2x on ChatGPT” stories can be misleading without a tailwind control

arXiv:2606.04362
02
Jun 2, 2026 7 min read

LLMs Score 94% on Cultural Knowledge Tests and 40% When the Answer Choices Are Removed

When Cultural Knowledge Doesn't Transfer to Cultural Reasoning

arXiv:2606.01879
03
Jun 2, 2026 7 min read

English Prompts Suppress Bengali Cultural Knowledge Even When Local Evidence Is Provided

How Prompt Language Rewrites Cultural Knowledge Before the Model Even Answers

arXiv:2605.30481
04
Jun 2, 2026 7 min read

LLM Fact-Checkers Score Well But Retrieve the Wrong Sources

Where LLM Fact-Checkers Go Wrong on Sources

arXiv:2605.30241
05
Jun 2, 2026 7 min read

An LLM Agent That Learns From Its Own Mistakes Beats Human Fact-Checkers on Health Misinformation 89% of the Time

When the Agent Learns From Its Own Corrections

arXiv:2606.02215
06
Jun 1, 2026 7 min read

AI Overviews Sent Users to Reddit. AI Mode Is Taking Them Back.

Google AI Overviews drove a 12% rise in Reddit engagement, but AI Mode reversed those gains for experiential communities by substituting AI conversation for human discussion.

arXiv:2605.16428
07
Jun 1, 2026 7 min read

Ecosystem GEO Beats Page-Level Optimization by Up to 31 Points for Agent Search

Coordinating a multi-page evidence ecosystem raises LLM search agent recommendation rates by up to 31 percentage points over the best single-page GEO baseline.

arXiv:2605.12887
08
Jun 1, 2026 7 min read

When AI Cites AI: The Synthetic Source Problem in Generative Search

An audit of ChatGPT, Copilot, Gemini, and Perplexity finds ~16% of cited sources are AI-generated — with Copilot citing synthetic content in nearly 3 of every 10 citations.

arXiv:2605.23684
09
Jun 1, 2026 6 min read

Frontier LLMs Hallucinate Up to 38% of Scientific Citations

Six frontier LLMs hallucinate 12–38% of scientific citations; a new agentic retrieval system reaches zero hallucinations with 30% better F1 at $0.05 per query.

arXiv:2605.14306
10
Jun 1, 2026 6 min read

Rewording a Buying Question Changes the Brands AI Recommends More Than Switching Models Does

Cosmetic prompt rewording drops AI brand-recommendation overlap by 21–32 percentage points — more divergence than switching providers entirely, across 12,000 runs.

arXiv:2605.27440
11
Jun 1, 2026 6 min read

Query-Specific Expiry: Why 'Recent' Isn't the Same as 'Fresh'

Baidu's Aurora-Expiry uses RAG-augmented LLMs to infer query-specific expiration thresholds, cutting median document age 12.81% for time-sensitive queries in a 14-day live A/B test.

arXiv:2605.13052
12
Jun 1, 2026 8 min read

RAG Doesn't Flatten the Brand Hierarchy — It Just Moves Where You Lose

A 37,000-run audit of 533 brands finds RAG preserves the brand hierarchy: L4–L5 specialists face 48–52% invisibility while L1 leaders surface universally but convert at only 25–41%.

arXiv:2605.27439
13
May 31, 2026 7 min read

Semantic Metadata Makes Agents More Reliable, Not Smarter

Schema.org markup gives retrieval agents 65.7% higher FAIR-compliant precision — but cuts query coverage by 29% where publishers haven't adopted it.

arXiv:2605.28787
14
May 26, 2026 6 min read

One Bad Search Result Breaks Frontier AI Agents — Completely

A Microsoft study shows a single false top search result drops GPT-5 accuracy from 65% to 18% — while humans solve the same queries at 93% — exposing a critical gap in agentic RAG deployments.

arXiv:2603.00801
15
May 22, 2026 7 min read

Most Pages Get Zero AI Citations. Editing 5% Won 40% More.

A new paper from Virginia Tech maps four failure modes that prevent pages from being cited in AI-generated responses. 43% of relevant pages receive zero citations under baseline conditions.

arXiv:2603.09296
16
May 22, 2026 7 min read

AI Cites Sources It Never Checked — and Half Don't Hold Up

No LLM verifies even half its citations under any tested condition — and adding temporal cutoffs or other deployment constraints collapses verifiability to near zero.

arXiv:2603.07287
17
May 22, 2026 8 min read

Generative Search Citation Share Is a Noisy Estimator, Not a Score

A new statistical framework shows that single-run citation share metrics from Perplexity, SearchGPT, and Gemini carry confidence intervals wide enough to make most apparent SEO gains statistically indistinguishable from noise.

arXiv:2603.08924
18
May 22, 2026 7 min read

Schema Markup Barely Helps AI Find You. Restructuring Does.

Rewriting pages as entity documents lifted AI answer accuracy ~30%; adding JSON-LD did almost nothing — it gets cut before indexing.

arXiv:2603.10700