{"pageUrl":"https://promagen.com/can-ai-engines-find-my-website","lastModified":"2026-05-10","provenanceHash":"sha256:0019f22d4e115db13035941a9f9b72f0be97b53cfb7048766c380baf0455a01b","provenanceNote":"Bot user-agent names are sourced from each vendor's published crawler documentation. Robots.txt usage-control tokens (Google-Extended, Applebot-Extended) are distinguished from log-visible HTTP user agents per Apple and Google docs. Answer-surface test methods describe behaviour observable in each engine's product UI as of 2026-05-10. No claim is made about any engine's internal retrieval or citation algorithm.","claims":[{"id":"claim-log-visible-bot-list","statement":"Log-visible HTTP user agents from the major AI engines — GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-User, Claude-SearchBot, PerplexityBot, Perplexity-User, Applebot, Bingbot, CCBot, and Googlebot — are documented by their respective vendors and can be verified by server-log inspection.","evidenceUrl":"https://promagen.com/robots.txt","lastVerified":"2026-05-10","hash":"sha256:fc79ebdfb968777973adca21ce8dad1a26a743248d8d5b778269ed16ede804b7"},{"id":"claim-robots-txt-only-tokens","statement":"Google-Extended and Applebot-Extended are robots.txt usage-control tokens, not log-visible HTTP user agents. Apple documents Applebot-Extended as a control for how Apple may use data crawled by Applebot. Google documents Google-Extended as a standalone product token controlling whether crawled content may be used for Gemini Apps / Vertex AI Gemini training and grounding. Looking for either string in access logs returns zero hits even when the corresponding AI usage is allowed.","evidenceUrl":"https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers","lastVerified":"2026-05-10","hash":"sha256:03e7bbe302fd29ef72fcd0f0c07143a5d959b7540204c5418ef41c182bb8b374"},{"id":"claim-three-layer-test-approach","statement":"A complete AI engine reachability check has three layers: server-log inspection for log-visible bot user agents (proves crawl), in-product URL fetch with summary request (proves parse), and citation queries targeted at the domain (proves retrieval ranking). Each layer answers a different question; all three are needed for a full coverage check.","evidenceUrl":"https://promagen.com/sentinel/weekly","lastVerified":"2026-05-10","hash":"sha256:11eac0fc439c31c8ff5902ab69a9f7333c7b26d50dde391a8f245c06eae22ce2"}]}