
How ChatGPT decides what to cite

What OpenAI documents, and what is honestly unknown.

OpenAI does not publish ChatGPT's citation ranking algorithm. What it does document is three distinct bots and their broad functions: GPTBot and OAI-SearchBot are autonomous crawlers that honour robots.txt, while ChatGPT-User is a user-triggered fetcher for which, as OpenAI's docs note, robots.txt may not apply in the same way. This page describes the documented surface, separates it from inferred behaviour, and lists the structural signals operators can verifiably influence, without claiming knowledge of internal ranking we do not have.

By Martin Yarnold · Updated
Citation tracking
Sentinel measures observable ChatGPT Search citation behaviour against a fixed query set every Monday and surfaces the trend.

Structural signals operators control

These are the levers that demonstrably affect whether ChatGPT can reach your content. None of them guarantees citation; collectively they remove the most common blockers. Items labelled "OpenAI documented" come from OpenAI's published bot docs. Items labelled "industry-standard" are generic technical-SEO hygiene that no vendor disputes.

| Signal | Source | What it does |
| --- | --- | --- |
| GPTBot allowed in robots.txt | OpenAI documented | Required for training-corpus inclusion. Disallowing removes content from future model knowledge. |
| OAI-SearchBot allowed in robots.txt | OpenAI documented | Required for ChatGPT Search citation eligibility. Disallowing makes your site invisible to Search. |
| Server 200 response, fast | Practical (not vendor-specific) | Bots time out faster than Googlebot. Sub-300 ms response keeps you in the retrieval set. |
| Valid JSON-LD on each page | Industry-standard | Helps the engine disambiguate the page's entity. Indirect contributor to retrieval ranking. |
| Complete title + meta description + canonical | Industry-standard | Engines lift these into answer text and source-card displays. Missing any one degrades citation surface. |
| Substantive content | Industry-standard | Thin pages get retrieved less often. Word count alone is not the signal; depth and uniqueness are. |
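The two robots.txt rows can be checked locally before deploying a change. The sketch below uses Python's standard `urllib.robotparser` to report whether a given robots.txt would permit each of the three OpenAI user agents to fetch a URL. It is an approximation under stated assumptions: the standard-library parser is not OpenAI's own, the `EXAMPLE_ROBOTS` file and `example.com` URL are hypothetical, and a ChatGPT-User result is indicative only, since robots.txt may not apply to user-triggered fetches in the same way.

```python
from urllib.robotparser import RobotFileParser

# User agents named in OpenAI's bot documentation, as listed on this page.
OPENAI_AGENTS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User")


def openai_bot_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Return, per user agent, whether robots_txt permits fetching url.

    Uses Python's parser, which approximates (but is not) the logic a
    real crawler applies.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in OPENAI_AGENTS}


# Hypothetical robots.txt: Search retrieval allowed, training crawl blocked.
EXAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    report = openai_bot_access(EXAMPLE_ROBOTS, "https://example.com/guide")
    for agent, allowed in report.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

With this example file, GPTBot is blocked while OAI-SearchBot is allowed, which is the configuration an operator would pick to stay eligible for Search citations while opting out of training.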

Frequently asked questions

Does OpenAI publish ChatGPT's citation ranking algorithm?

No. OpenAI publicly documents three user agents: GPTBot and OAI-SearchBot are autonomous crawlers (training and ChatGPT Search retrieval respectively) that honour robots.txt; ChatGPT-User is a user-triggered fetcher, and OpenAI's bot docs call out that robots.txt rules may not apply to user-initiated requests in the same way. OpenAI does not publish a citation ranking or retrieval scoring algorithm. Anything claimed about "what ChatGPT prioritises" beyond the documented bot-level behaviour is inference from observable output, not vendor contract. Treat ranking-detail claims as best-effort interpretation, not fact.

When does ChatGPT cite sources at all?

Primarily in ChatGPT Search mode and when retrieval-augmented generation is triggered for a query. Pure model-knowledge answers (no web fetch at answer time) typically do not produce inline citations. The transition between "this is a search-triggered answer" and "this is a model-knowledge answer" happens silently in the ChatGPT UI, so operators cannot reliably tell which path produced a given answer without testing. The practical implication: optimise so your content is reachable and parseable, so that ChatGPT Search can cite it when a query triggers retrieval.

Which structural signals can operators verifiably influence?

The signals operators control directly are robots.txt access for GPTBot and OAI-SearchBot (allow them; this is the part OpenAI documents), server response health (a fast 200 response), valid JSON-LD structured data, complete metadata (title, description, canonical), and substantive content. ChatGPT-User is user-triggered, so robots.txt rules may not apply to it in the same way. These signals do not guarantee citation, but they remove the most common blockers. Anything beyond that, such as "ChatGPT prefers schema type X over Y" or "ChatGPT weights backlinks at Z%", is inference, not documented.
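The metadata and JSON-LD items in that list can be audited from a page's HTML alone. The following is a minimal sketch using only the Python standard library; `HeadAudit`, `audit_page`, and `EXAMPLE_HTML` are hypothetical names introduced for illustration. Note the limited scope: "valid" here means each JSON-LD block parses as JSON, not that it conforms to any schema vocabulary, and nothing in this check reflects how ChatGPT itself reads a page.

```python
import json
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collect title, meta description, canonical link, and JSON-LD blocks."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.description = ""
        self.canonical = ""
        self.jsonld_blocks: list[str] = []
        self._capture = None  # "title" or "jsonld" while inside that element

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self._capture = "title"
        elif tag == "meta" and (a.get("name") or "").lower() == "description":
            self.description = (a.get("content") or "").strip()
        elif tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonical = (a.get("href") or "").strip()
        elif tag == "script" and (a.get("type") or "").lower() == "application/ld+json":
            self._capture = "jsonld"
            self.jsonld_blocks.append("")

    def handle_endtag(self, tag):
        if tag in ("title", "script"):
            self._capture = None

    def handle_data(self, data):
        if self._capture == "title":
            self.title += data
        elif self._capture == "jsonld":
            self.jsonld_blocks[-1] += data


def _parses_as_json(block: str) -> bool:
    try:
        json.loads(block)
        return True
    except json.JSONDecodeError:
        return False


def audit_page(html: str) -> dict[str, bool]:
    """Report which structural signals are present in the given HTML."""
    parser = HeadAudit()
    parser.feed(html)
    parser.close()
    blocks = parser.jsonld_blocks
    return {
        "title": bool(parser.title.strip()),
        "description": bool(parser.description),
        "canonical": bool(parser.canonical),
        # True only if at least one block exists and every block parses.
        "jsonld_valid": bool(blocks) and all(_parses_as_json(b) for b in blocks),
    }


# Hypothetical page that is missing its canonical link.
EXAMPLE_HTML = """<html><head>
<title>How ChatGPT decides what to cite</title>
<meta name="description" content="What OpenAI documents about citations.">
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Article"}</script>
</head><body><p>Body text.</p></body></html>"""

if __name__ == "__main__":
    for signal, ok in audit_page(EXAMPLE_HTML).items():
        print(f"{signal}: {'ok' if ok else 'missing or invalid'}")
```

Running the audit across a sitemap and fixing any page that reports a missing field is one way to turn the checklist into a repeatable step.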

How many citations per answer does ChatGPT Search typically include?

Observed range is 0–4 citations on retrieval-augmented answers; pure model-knowledge answers show 0. OpenAI does not publish "citations per answer" as a documented contract. Density depends on query intent, retrieval-mode activation, and per-query factors that are not externally observable. Treat 0–4 as a working observation, not a guarantee.

Do paid ChatGPT tiers cite differently than free?

Citation behaviour appears to follow the model and feature set, not the tier directly. Paid tiers get access to different models and features (GPT-4-class, longer context, deeper Search) which can affect citation density. OpenAI does not publish a documented difference in citation algorithm by tier. The safest read: focus on whether your content reaches the retrieval surface at all; citation density between tiers is a secondary concern.

How do I measure ChatGPT citations against my queries?

Pick 10–30 queries your buyers ask, run them in ChatGPT Search weekly, and record per-query whether your domain appears in the cited sources. This is the only direct measurement; OpenAI does not provide a "did you cite me" API. Promagen Sentinel automates this on a fixed query set; the manual version is a structured spreadsheet plus a weekly habit. The output is a citation-rate-per-query time series you can correlate with content and structural changes.
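The manual spreadsheet version reduces to a small calculation. The sketch below assumes each row records the week, the query, and the domains shown as cited sources; the `Observation` type, the function names, and the sample queries and domains are all hypothetical, and this is not a description of how Sentinel works internally. Domain matching is exact, so subdomains would need normalising first.

```python
from collections import defaultdict
from datetime import date
from typing import NamedTuple


class Observation(NamedTuple):
    week: date                      # date the query was run
    query: str                      # query text as typed into ChatGPT Search
    cited_domains: tuple[str, ...]  # domains listed in the cited sources


def per_query_series(observations, domain):
    """Map each query to a date-ordered list of (week, was_cited) pairs."""
    series = defaultdict(list)
    for obs in sorted(observations, key=lambda o: o.week):
        series[obs.query].append((obs.week, domain in obs.cited_domains))
    return dict(series)


def weekly_citation_rate(observations, domain):
    """Share of queries per week whose cited sources include domain."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for obs in observations:
        totals[obs.week] += 1
        if domain in obs.cited_domains:
            hits[obs.week] += 1
    return {week: hits[week] / totals[week] for week in sorted(totals)}


# Hypothetical log: two queries observed over two weeks.
W1, W2 = date(2025, 1, 6), date(2025, 1, 13)
LOG = [
    Observation(W1, "best prompt builder", ("example.com", "other.org")),
    Observation(W1, "ai image prompt tools", ()),
    Observation(W2, "best prompt builder", ("example.com",)),
    Observation(W2, "ai image prompt tools", ("example.com", "third.net")),
]

if __name__ == "__main__":
    for week, rate in weekly_citation_rate(LOG, "example.com").items():
        print(week.isoformat(), f"{rate:.0%}")
```

Keeping the per-query series alongside the weekly rate makes it possible to see which specific queries moved after a content or structural change, rather than only the aggregate.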


OpenAI bot names and robots.txt compliance details are taken from platform.openai.com/docs/bots. ChatGPT, GPTBot, OAI-SearchBot, and ChatGPT-User are trademarks of OpenAI. This page does not claim knowledge of OpenAI's internal retrieval or ranking algorithms; structural-signal descriptions are either OpenAI-documented or industry-standard technical-SEO hygiene. Promagen Ltd is independent of OpenAI.
