{"pageUrl":"https://promagen.com/state-of-ai-citations-2026","lastModified":"2026-05-10","provenanceHash":"sha256:f5271aa5330194732afdbecfe01ef9814f0a44b6ee489abb9e4e81d906b04c8d","provenanceNote":"Vendor documentation references for ChatGPT, Claude, Perplexity, and Gemini are sourced from each vendor's published crawler and product docs as of 2026-05-10. Observed behaviour describes Sentinel's per-query measurement during 2026 against a fixed query set; observed numbers are operational patterns, not vendor-guaranteed contracts. No claim is made about any engine's internal retrieval or citation algorithm.","claims":[{"id":"claim-perplexity-citation-contract","statement":"Perplexity's answer interface displays numbered source citations on every answer as a product design contract; citations are visible by default in the standard Perplexity UI.","evidenceUrl":"https://www.perplexity.ai","lastVerified":"2026-05-10","hash":"sha256:68d5cd3feded44f2589876981ed8592031f3076e9b1c916a445fe1af76023deb"},{"id":"claim-openai-bot-bucket-distinction","statement":"OpenAI publicly documents two autonomous crawlers — GPTBot (model training) and OAI-SearchBot (ChatGPT Search retrieval) — that honour robots.txt allow/disallow rules. ChatGPT-User is a user-triggered fetcher invoked by user action in ChatGPT; OpenAI's bot documentation calls out that robots.txt rules may not apply to user-initiated requests in the same way as to autonomous crawling.","evidenceUrl":"https://platform.openai.com/docs/bots","lastVerified":"2026-05-10","hash":"sha256:d46865d51a9f1b0fe27d2287123f22080f03a688c8f2c181cc911299daacdcae"},{"id":"claim-perplexity-bot-bucket-distinction","statement":"Perplexity publicly documents PerplexityBot as its autonomous citation indexer, controlled by robots.txt allow/disallow directives. Perplexity-User is a user-triggered fetcher; Perplexity has stated that Perplexity-User generally does not treat robots.txt as binding because the request is user-initiated.","evidenceUrl":"https://docs.perplexity.ai/guides/bots","lastVerified":"2026-05-10","hash":"sha256:a9ee5328d89cdd2d1bfab490e1805b732832a893df108788038b85980f52e8d6"},{"id":"claim-google-extended-scope","statement":"Google documents Google-Extended as a robots.txt usage-control token (not a log-visible user agent) that controls whether crawled content may be used for Gemini Apps and Vertex AI Gemini training and grounding. The crawling itself is performed by Googlebot. Disallowing Google-Extended does not remove pages from Google Search; AI Overviews and AI Mode in Search are controlled separately via Googlebot plus preview controls such as nosnippet, data-nosnippet, max-snippet, and noindex.","evidenceUrl":"https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers","lastVerified":"2026-05-10","hash":"sha256:197f611e17427390f96ef2d51dfc8c532cfec85c4ef8ad0557952778edadcaeb"},{"id":"claim-no-vendor-publishes-ranking","statement":"None of the four major AI engines (OpenAI / ChatGPT, Anthropic / Claude, Perplexity, Google / Gemini) publishes a complete citation ranking algorithm. Specific claims about ranking weights, freshness scoring, or backlink influence on AI citation are inference from observable output, not vendor contract.","evidenceUrl":"https://promagen.com/sentinel/weekly","lastVerified":"2026-05-10","hash":"sha256:bfa9133823868167bca6d9123c68fde6bfc986b25cc95075a0052a1e487a8a4b"}]}