How to audit your site for AI visibility
Six steps. Half a day. Repeat weekly.
A complete AI visibility audit covers six surfaces: robots.txt access, crawl reachability, metadata completeness, structured-data validity, internal-link topology, and per-engine citation testing. This page walks through each step with a practical method a single operator can run in half a day.
By Martin Yarnold · Updated
The six-step methodology
| Step | What to check | How |
|---|---|---|
| 1. robots.txt audit | Confirm the log-visible AI crawler UAs are allowed and the robots.txt usage-control tokens are set as intended. | Log-visible HTTP UAs: GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-User, Claude-SearchBot, PerplexityBot, Perplexity-User, Applebot, Bingbot, CCBot, Googlebot. Robots.txt usage-control tokens (no HTTP fetcher): Google-Extended (Gemini Apps/Vertex AI training + grounding), Applebot-Extended (Apple Intelligence usage of Applebot-crawled data). |
| 2. Crawl reachability | Every important page returns 200 within 5 seconds. | Test 50–100 key URLs with curl from clean network; check response code and time. Investigate any 5xx, timeout, or 4xx. |
| 3. Metadata audit | Title, meta description, canonical URL all present on each page. | Run a crawler or grep your build output. Score is binary per page — all three present, or not. |
| 4. Structured-data validation | Each page emits valid JSON-LD that validator.schema.org parses. | Run the validator against a sample of 10–20 representative pages. Fix parse errors before chasing schema-type coverage. |
| 5. Internal-link topology | No page has fewer than 3 inbound internal links. | Crawler produces the link graph. Identify orphan pages and link them from the navigation or related-content sections. |
| 6. Per-engine citation testing | Run a fixed 10–30 query set against ChatGPT, Claude, Perplexity, Gemini. | Record per-engine, per-query whether your domain appears in cited sources. Track the rate weekly. |
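The step-1 check can be sketched with Python's standard-library robots.txt parser. This is a minimal sketch, not a full audit: the sample robots.txt content and the `/pricing` path below are hypothetical, and only the log-visible HTTP UAs from the table are checked (the usage-control tokens like Google-Extended govern data use, not fetching, so they need a separate read of the file's directives).

```python
from urllib import robotparser

# Log-visible AI crawler user agents from step 1 of the table.
AI_USER_AGENTS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "ClaudeBot", "Claude-User", "Claude-SearchBot",
    "PerplexityBot", "Perplexity-User",
    "Applebot", "Bingbot", "CCBot", "Googlebot",
]

def audit_robots(robots_txt: str, path: str = "/") -> dict:
    """Return {user_agent: allowed} for each AI crawler UA on one path."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return {ua: rp.can_fetch(ua, path) for ua in AI_USER_AGENTS}

# Hypothetical robots.txt that blocks GPTBot but allows everything else.
example = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
result = audit_robots(example, "/pricing")
```

Any `False` in `result` that you did not intend is a step-1 finding: that engine's crawler cannot fetch the page at all, so nothing downstream (metadata, schema, citations) matters for it.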
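Step 2's pass/fail rule (HTTP 200 within 5 seconds) can be separated into a pure classifier plus a thin fetcher, which makes the rule itself testable without a network. The `check_url` fetcher is a standard-library sketch, not a replacement for curl from a clean network; the function names are my own.

```python
import time
import urllib.request

TIMEOUT_SECONDS = 5.0  # step-2 threshold from the table

def classify(status: int, elapsed: float) -> str:
    """Apply the step-2 rule: pass only on HTTP 200 within 5 seconds."""
    if status >= 500:
        return "fail: server error"
    if status >= 400:
        return "fail: client error"
    if elapsed > TIMEOUT_SECONDS:
        return "fail: too slow"
    return "pass" if status == 200 else "investigate: non-200"

def check_url(url: str) -> str:
    """Fetch one URL and classify it; network errors count as failures."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=TIMEOUT_SECONDS) as resp:
            return classify(resp.status, time.monotonic() - start)
    except Exception:
        return "fail: timeout or connection error"
```

Run `check_url` over your 50–100 key URLs and investigate anything that does not come back `"pass"`, matching the step-2 instruction to chase every 5xx, timeout, or 4xx.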
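Step 5's check reduces to counting inbound edges in the link graph your crawler exports: any page below the 3-inbound-link threshold gets flagged, with true orphans (zero inbound links) as the worst case. The five-page site below is invented for illustration.

```python
from collections import Counter

MIN_INBOUND = 3  # step-5 threshold from the table

def underlinked_pages(pages, edges):
    """edges: (source, target) internal links.
    Return {page: inbound_count} for pages below the threshold."""
    inbound = Counter(target for _, target in edges)
    return {p: inbound[p] for p in pages if inbound[p] < MIN_INBOUND}

# Hypothetical five-page site with one true orphan.
pages = ["/", "/a", "/b", "/c", "/orphan"]
edges = [("/", "/a"), ("/", "/b"), ("/a", "/b"), ("/b", "/a"),
         ("/c", "/a"), ("/c", "/b"), ("/a", "/c")]
flagged = underlinked_pages(pages, edges)
```

Pages in `flagged` with a count of zero are orphans to link from navigation or related-content sections; the rest need additional internal links to clear the threshold.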
Frequently asked questions
How long does a full AI visibility audit take?
Roughly half a day for a single competent operator on a 50–500 page site. The robots.txt and schema checks are mechanical and fast (1–2 hours). Crawl reachability and metadata diff take another 2–3 hours depending on tooling. Per-engine citation testing against a 10–30 query set is the longest item and typically takes 2–4 hours of patient manual checking, more if you want to capture screenshots. A repeat audit on the same site takes about half that — the methodology stays the same; you compare against the previous baseline.
What tools do I need?
Minimum viable: a structured-data validator (validator.schema.org), access to server logs, and accounts on each major AI engine (ChatGPT, Claude, Perplexity, Gemini). Recommended additions: a crawler (Screaming Frog, Sitebulb, or similar) for systematic metadata diffing, and a citation interrogator like Promagen Sentinel for automated weekly engine-side measurement. The audit can be done entirely manually; tools speed it up but are not strictly required.
What order should I fix issues in?
Availability first, metadata second, schema third, linking last. Reasoning: a page that returns 5xx is invisible regardless of schema. Missing meta description blocks the cleanest answer-card display. Missing schema reduces parse confidence. Orphan-page risk has the smallest weighted impact (10% per the Sentinel composite). Fixing in that order maximises score gain per hour of work, especially when the score weighting is composite (availability 40%, metadata 20%, schema 15%, regression burden 15%, orphan risk 10%).
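The fix order above follows directly from the composite weights: a point recovered on availability moves the score four times as far as a point on orphan risk. A minimal sketch of that arithmetic, assuming each surface is scored 0–100 (the per-surface example scores are hypothetical; only the weights come from the text):

```python
# Composite weights as described in the Sentinel scoring above.
WEIGHTS = {
    "availability": 0.40,
    "metadata": 0.20,
    "schema": 0.15,
    "regression_burden": 0.15,
    "orphan_risk": 0.10,
}

def composite_score(surface_scores: dict) -> float:
    """Weighted sum of per-surface scores, each on a 0-100 scale."""
    return sum(WEIGHTS[k] * surface_scores[k] for k in WEIGHTS)

# Hypothetical site: weak availability, perfect regression burden.
scores = {"availability": 60, "metadata": 80, "schema": 90,
          "regression_burden": 100, "orphan_risk": 50}
overall = composite_score(scores)
```

With these example numbers, raising availability from 60 to 100 adds 16 composite points, while fixing orphan risk entirely adds only 5, which is why availability work comes first per hour spent.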
How often should I re-audit?
Monthly minimum, weekly if AI citations are revenue-relevant. AI engines refresh retrieval indexes on their own cadence, change ranking quietly, and run training cuts that drop sources without warning. A page that passed all checks in January can fail several by April. The structural items (robots, schema, metadata) are reasonably stable week-to-week; the citation measurements drift faster. Promagen Sentinel runs the structural checks weekly automatically; citation measurement is harder to automate fully and benefits from weekly manual review on a fixed query set.
Should I run the audit in-house or hire it out?
First audit in-house — the team learns the methodology and discovers the worst issues. After that, the decision depends on cadence. Monthly: in-house is fine if you have a competent technical SEO. Weekly: in-house gets expensive in attention; a tool like Promagen Sentinel automates the structural side. The hybrid pattern most teams settle on: tool handles structural weekly, human runs deeper review monthly with a focus on content and per-engine citation testing.
When can I stop running the audit?
Never, but the cadence can stretch once the score stabilises above ~85 on the composite for three consecutive months. At that point the marginal returns from a weekly audit drop, and monthly is sufficient maintenance. The exceptions: after a major site redesign or platform migration, audit immediately and then weekly for two months; after a hosting or build-platform change (Vercel to Netlify, for example), audit immediately; and when a new AI engine launches a feature you want to track, add it to the step-6 query testing.