The brand mention variant problem in GEO measurement

AI engines cite you in ways you never planned. We tracked 12 unexpected mention variants across Perplexity and ChatGPT, revealing a massive blind spot in standard GEO tracking.

When we first measured generative engine optimisation (GEO) for a client, we counted exact brand mentions. The reports looked weak. The reality was different: language models cited the client constantly, just not by the exact brand string we had configured in our trackers.

The exact-match trap

Standard SEO monitoring conditions you to track precise strings. GEO breaks this habit. Large language models paraphrase, abbreviate, and restructure entity names based on context. If you only track 'Neurolinks GmbH', you miss every other way models refer to you.

We audited 340 citations across Perplexity, ChatGPT, and AI Overviews for three clients. Exact brand string matches accounted for just 38 percent of all positive citations. The remaining 62 percent were what we call mention variants: descriptive nouns, partial names, and contextual references.

Mapping the variant spectrum

We categorised the unexpected citations into four types. First, descriptive substitutes: 'the Munich AI studio' instead of the company name. Second, domain truncations: dropping legal entity suffixes or regional qualifiers. Third, acronym expansions we never officially used. Fourth, competitor adjacency mentions.

Descriptive substitutes alone made up 27 percent of citations for one B2B client. The model appears to have read their llms.txt file, picked up their value proposition, and cited them by function rather than name. If your tracker only watches the trademark, every one of those citations is reported as zero impact.

Building a variant-aware tracker

You need a detection pipeline that tolerates semantic distance. We run retrieved AI answers through a lightweight classifier that tags mention type. It flags exact matches, partial matches, and functional references. This requires an initial manual labelling effort of roughly 200 examples per brand.
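Before any labelled data exists, the three tags can be approximated with plain string rules. The sketch below is a rule-based stand-in, not the trained classifier described above: the BRAND dictionary, its field names, and the tag_mention function are all hypothetical, and the phrase lists would have to come from your own audit.

```python
import re

# Hypothetical brand configuration; names and phrases are illustrative only.
BRAND = {
    "exact": "Neurolinks GmbH",
    "partials": ["Neurolinks"],
    "functional": ["the Munich AI studio"],
}

def tag_mention(sentence: str, brand: dict = BRAND) -> str:
    """Return 'exact', 'partial', 'functional', or 'none' for one sentence."""
    text = sentence.lower()
    # Check the full brand string first so it is never downgraded to 'partial'.
    if brand["exact"].lower() in text:
        return "exact"
    # Word boundaries stop a partial name from firing inside a longer word.
    for partial in brand["partials"]:
        if re.search(r"\b" + re.escape(partial.lower()) + r"\b", text):
            return "partial"
    # Functional references are matched as literal phrases in this sketch.
    for phrase in brand["functional"]:
        if phrase.lower() in text:
            return "functional"
    return "none"
```

Rules like these only catch phrasings you already know about, so they are most useful for bootstrapping the manual labelling set rather than replacing a learned model.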

The classifier operates on sentence chunks, not full responses. We found that citation context windows rarely exceed 40 tokens. Scanning the entire answer degrades precision because background sections mention related industries without attributing anything to the brand.
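A minimal sketch of that chunking step, assuming whitespace-separated words are an acceptable proxy for model tokens. The function names and the naive sentence splitter are illustrative choices; abbreviations such as "GmbH." followed by a space would be split incorrectly and need a proper sentence tokenizer in practice.

```python
import re

MAX_TOKENS = 40  # window size from the observation above; tune per engine

def sentence_chunks(answer: str) -> list[str]:
    """Split an answer into sentences on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", answer.strip())
    return [p for p in parts if p]

def mention_windows(answer: str, needle: str, max_tokens: int = MAX_TOKENS) -> list[str]:
    """Return one token window per sentence that contains `needle`."""
    windows = []
    for sentence in sentence_chunks(answer):
        pos = sentence.lower().find(needle.lower())
        if pos == -1:
            continue  # background sentence without the mention: skipped
        tokens = sentence.split()
        # Approximate index of the token where the mention starts.
        hit = len(sentence[:pos].split())
        start = max(0, hit - max_tokens // 2)
        windows.append(" ".join(tokens[start:start + max_tokens]))
    return windows
```

Only the returned windows would be passed to the mention classifier, which keeps unrelated background sentences out of its input.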

The llms.txt illusion

Adding an llms.txt file felt like a quick fix for citation consistency. In practice, it shifted the variant distribution rather than eliminating it. Models that read the file still paraphrased the brand name 44 percent of the time, but they did adopt the preferred spelling when using the exact name.

The file is more useful for controlling factual accuracy than for enforcing naming. If your llms.txt states you were founded in 2019, models generally respect that. If it demands they call you 'NeoLinks', they often ignore it and default to the most natural-sounding variant.

Measurement cadence and drift

GEO measurement is not a one-off audit. Citation behaviour drifts as model weights update and retrieval corpora shift. We re-run variant tracking weekly. Over eight weeks, we observed a 14 percent swing in exact match rates for one client after a minor ChatGPT update.

This volatility makes monthly reporting dangerous. A single snapshot can flatter or panic you. Weekly sampling across 50 queries per topic cluster gives a stable baseline. Anything less and you are just reading noise from the model's sampling variance.
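One way to judge whether a week-to-week change is larger than sampling noise is to put a confidence interval around each weekly rate. The sketch below uses the normal approximation to the binomial and assumes each query behaves as an independent draw, which is a simplification; the function name and the 1.96 default (roughly a 95 percent interval) are illustrative choices, not part of the tracking setup described above.

```python
import math

def exact_match_interval(hits: int, n: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (rate, low, high) for `hits` exact matches in `n` sampled answers."""
    if n <= 0:
        raise ValueError("n must be positive")
    rate = hits / n
    # Standard error of a proportion, scaled by z for the interval half-width.
    margin = z * math.sqrt(rate * (1 - rate) / n)
    return rate, max(0.0, rate - margin), min(1.0, rate + margin)

# Example: 19 exact matches out of 50 sampled answers in one topic cluster.
rate, low, high = exact_match_interval(19, 50)
print(f"rate={rate:.2f}, interval=({low:.2f}, {high:.2f})")
```

If two weeks' intervals overlap heavily, the difference between them may be noise; the interval narrows as the number of sampled answers grows.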

Start by expanding your query set beyond branded terms. Track how models describe the solution category you operate in, because that is where descriptive substitutes appear. Build a simple classifier for mention variants, accept that exact matches will always be a minority, and report total citation presence instead.
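Reporting total citation presence can be as simple as counting tags per answer. This is a sketch under the assumption that each sampled answer has already been reduced to one tag ('exact', 'partial', 'functional', or 'none'); the function name and the output keys are hypothetical.

```python
from collections import Counter

def citation_summary(tags: list[str]) -> dict:
    """Summarise per-answer mention tags into presence and exact-match shares."""
    counts = Counter(tags)
    total = len(tags)
    cited = total - counts["none"]  # any tag other than 'none' counts as presence
    return {
        "answers": total,
        "cited": cited,
        "presence_rate": cited / total if total else 0.0,
        "exact_share_of_cited": counts["exact"] / cited if cited else 0.0,
        "by_type": dict(counts),
    }

# Example with eight sampled answers for one topic cluster.
week = ["exact", "partial", "functional", "none",
        "exact", "none", "functional", "functional"]
print(citation_summary(week))
```

Keeping presence_rate as the headline number and the exact share as a secondary figure matches the reporting shift described above.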
