
The llms.txt citation gap: why your neatly structured answers go uncited

We wired up llms.txt and structured our FAQ schema perfectly. ChatGPT read it. Then it cited Wikipedia instead. Here is why structured answers alone do not win GEO citations.

We rolled out llms.txt across three client domains, packed it with structured FAQ answers, and waited for the citations to arrive. Perplexity and ChatGPT dutifully ingested the file. Yet when we measured citation rates over 30 days, our structured content was paraphrased constantly but attributed to the client in less than 8 percent of answers.

The attribution bias in LLMs

Large language models favour attribution to sources that carry high baseline trust weights. Wikipedia, established publishers, and official documentation get the named citation. Your perfectly structured answer in llms.txt gets absorbed into the model's working context, but the attribution gets stripped in favour of a more recognisable authority.

We saw this clearly with a legal-tech client. Their regulatory explanations were parsed accurately, but ChatGPT consistently attributed the knowledge to government domains that actually ranked lower for the specific query. The model did not appear to weigh freshness or query-level ranking; it appeared to match the conceptual category to a trusted domain class.

Structure is necessary but insufficient

Adding llms.txt and schema markup is the correct first step. It makes it far more likely that the model can parse your content without HTML noise. In our tests, structured answers reduced hallucination rates by 34 percent compared to unstructured pages. But reducing hallucination is not the same as winning the citation. You are solving comprehension, not attribution.

Citation behaviour in retrieval-augmented generation systems depends on the retrieval step, not just the generation step. If the retriever pulls a Wikipedia chunk alongside your llms.txt chunk, the generator often defaults to citing the source with broader domain authority. Your structure makes you readable, not quotable.

The novelty threshold for citations

We analysed 420 AI Overviews and Perplexity answers where our clients should have been cited. The models cited the client directly only when the answer contained data or phrasing they could not find elsewhere. We call this the novelty threshold. If your answer restates common knowledge, you get paraphrased. If it offers a proprietary statistic, you get cited.

One client embedded a specific benchmark: 73 percent of SMEs in their sector failed their first compliance audit. That single data point lifted their citation rate from 4 percent to 29 percent for related queries. Models latch onto specific, unambiguous claims that anchor their generated prose. Vague advice gets dissolved into the general context window.

Measuring the real citation rate

Standard web analytics miss GEO entirely. You must query the models directly. We run automated daily queries against ChatGPT, Perplexity, and Gemini for our target entities, logging whether the brand appears in the attribution. Manual checks every morning are not scalable; you need a scripted evaluation pipeline to track citation fluctuations over time.
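
A minimal sketch of how such a scripted logging loop might look. It is illustrative only: the function name `run_daily_queries`, the JSON Lines log layout, and the adapter signature (a callable that takes a prompt and returns a dict with `text` and `urls`) are all assumptions. Each engine's real client would need to be wrapped in an adapter of that shape, and no vendor API is shown here.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Dict, Iterable

# Hypothetical adapter signature: takes a prompt string and returns
# {"text": <answer text>, "urls": [<cited URLs>]}. Wrapping each engine's
# actual client in such a function is left to the reader.
AskFn = Callable[[str], Dict[str, object]]


def run_daily_queries(clients: Dict[str, AskFn],
                      prompts: Iterable[str],
                      log_path: str) -> int:
    """Send every prompt to every engine and append one JSON line per answer.

    Returns the number of records written. Failures are logged, not raised,
    so one flaky engine does not abort the whole run.
    """
    written = 0
    with open(log_path, "a", encoding="utf-8") as log:
        for prompt in prompts:
            for engine, ask in clients.items():
                try:
                    result = ask(prompt)
                    record = {
                        "text": result.get("text", ""),
                        "urls": list(result.get("urls", [])),
                        "error": None,
                    }
                except Exception as exc:  # keep the run going on any failure
                    record = {"text": "", "urls": [], "error": str(exc)}
                record["engine"] = engine
                record["prompt"] = prompt
                record["ts"] = datetime.now(timezone.utc).isoformat()
                log.write(json.dumps(record) + "\n")
                written += 1
    return written
```

Scheduling the function once a day (cron or any job runner) gives an append-only log that can be re-scored later if the attribution rules change.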

Our pipeline logs three states: explicit citation by name, implicit citation via URL, and zero attribution. Across 12 campaigns, the average explicit citation rate sits at 11 percent. Implicit URL attribution is slightly higher at 16 percent. The remaining 73 percent of the time, the content influences the output but the brand gets nothing.
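
The three states can be sketched as a small classifier. This is a simplified illustration under stated assumptions, not the pipeline itself: the names `classify` and `citation_rates`, the record layout (`text` and `urls` keys), and the rule that a brand mention in the answer text counts as explicit are all assumptions. Plain substring matching on the brand name is crude and would need tuning for brands with common-word names.

```python
from collections import Counter
from enum import Enum
from urllib.parse import urlparse


class Attribution(Enum):
    EXPLICIT = "explicit"  # brand is named in the answer text
    IMPLICIT = "implicit"  # brand not named, but its domain is among the cited URLs
    NONE = "none"          # neither the name nor the domain appears


def _host_matches(url: str, domain: str) -> bool:
    """True if the URL's host is the domain itself or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


def classify(answer_text, cited_urls, brand, domain):
    """Assign one of the three attribution states to a single answer."""
    if brand.lower() in answer_text.lower():
        return Attribution.EXPLICIT
    if any(_host_matches(url, domain) for url in cited_urls):
        return Attribution.IMPLICIT
    return Attribution.NONE


def citation_rates(records, brand, domain):
    """Return the share of answers in each state, keyed by state name."""
    counts = Counter(
        classify(r["text"], r["urls"], brand, domain) for r in records
    )
    total = sum(counts.values())
    return {
        state.value: (counts[state] / total if total else 0.0)
        for state in Attribution
    }
```

Because each answer lands in exactly one state, the three rates always sum to 1 for a non-empty log, which matches the way the 11, 16, and 73 percent figures above partition the results.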

Shifting from comprehension to quotability

Stop writing llms.txt like an FAQ. Start writing it like a press release with soundbites. Lead with proprietary figures, specific names, and concrete ratios. Strip out generic context that the model already knows. Every sentence that merely restates common knowledge dilutes the density of your novel claims and lowers the chance of a citation.
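
One rough way to audit a draft for generic filler is to flag sentences that contain no figures at all. The sketch below is a heuristic under loud assumptions: the function names are hypothetical, "contains a digit" is only a crude proxy for a concrete claim, and it cannot tell whether a figure is actually proprietary. Treat the output as a list of sentences to review by hand, not as a score to optimise.

```python
import re

# Assumption: a sentence with at least one digit carries a figure, ratio, or date.
_HAS_FIGURE = re.compile(r"\d")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def specificity_report(text):
    """Return the share of sentences with a figure and the ones without any."""
    sentences = split_sentences(text)
    generic = [s for s in sentences if not _HAS_FIGURE.search(s)]
    with_figures = len(sentences) - len(generic)
    density = with_figures / len(sentences) if sentences else 0.0
    return {"density": density, "generic": generic}
```

Running it over an llms.txt draft before and after an edit gives a quick sense of whether the share of figure-bearing sentences went up, and the `generic` list shows which sentences are candidates for cutting.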

We restructured one client's llms.txt from 40 broad questions down to 12 data-backed statements. Ingestion volume dropped, but explicit citations doubled within three weeks. The models were reading less total content, but a higher proportion of what they read was unique to that domain, making it worth attributing.

Winning GEO citations means accepting that readability is just the entry fee. You must provide claims that the model cannot generate from its training data alone. Measure your actual citation rate with automated queries, identify where you get paraphrased versus cited, and ruthlessly cut generic content in favour of proprietary, anchor-worthy data points.
