Why classic analytics fails for AI search
Traditional SEO dashboards — Google Analytics 4, Google Search Console, Ahrefs, Semrush — were built around click events. A user searched, saw a result, clicked, landed on your site, maybe converted. Every step left a trail.
AI search breaks that model in three places. First, the user often does not click. They get an answer in the AI response and act on it directly — phoning you, emailing you, or visiting your business. Second, when they do click, the referrer often does not identify the AI source reliably; Google Analytics sees "direct traffic" or a generic chat.openai.com referral. Third, and most critically, not being recommended produces no signal at all. You cannot measure a click that was never made.
So AI visibility measurement is fundamentally different. It is closer to PR measurement (share of voice, mention quality) than to SEO measurement (rankings, sessions, conversions). The metrics that matter are not the metrics in your existing dashboard.
The six metrics that matter
1 Prompt coverage
The percentage of prompt-engine tests (each target prompt run through each of the four major AI engines) whose responses name your company. This is the headline number.
Example: 12 prompts × 4 engines = 48 total tests. Company is named in 18. Coverage = 18 ÷ 48 = 37.5%
Build a list of 10–15 prompts that a prospect in your category would actually type. Mix commercial ("best [category] in [market]"), informational ("how do I choose a [product] for [use case]"), and comparative ("X vs Y"). Run each through ChatGPT, Perplexity, Gemini, and Copilot, and log which ones name you.
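A minimal sketch of the coverage arithmetic in Python, assuming you log one row per prompt-engine test; the field names and sample rows here are illustrative, not from any specific tool:

```python
# One row per prompt x engine test (12 prompts x 4 engines = 48 rows).
# Field names and sample values are illustrative.
tests = [
    {"engine": "ChatGPT", "prompt": "best [category] in [market]", "mentioned": True},
    {"engine": "Perplexity", "prompt": "best [category] in [market]", "mentioned": False},
    # ... the remaining 46 rows
]

coverage = sum(t["mentioned"] for t in tests) / len(tests)
print(f"Prompt coverage: {coverage:.1%}")  # 18 of 48 in the example above -> 37.5%
```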
2 Recommendation rank
Not just whether you are mentioned, but where in the list. First-named companies get roughly 3x the attention of third-named companies. Track the position of your brand in each prompt where you are mentioned.
Example: Mentioned 1st in 4 prompts, 2nd in 6, 3rd in 2.
Score = 4(1/1) + 6(1/2) + 2(1/3) = 4 + 3 + 0.67 = 7.67
The weighted-rank score discounts lower positions: a first-place mention counts at full weight, a third-place mention at one third. Moving from average position 2.4 to 1.8 is a meaningful improvement even if coverage stays flat.
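The same arithmetic as a short Python sketch, using the positions from the example above:

```python
# Position of your brand in each prompt where you are mentioned:
# 1st in 4 prompts, 2nd in 6, 3rd in 2.
positions = [1] * 4 + [2] * 6 + [3] * 2

weighted_score = sum(1 / p for p in positions)      # 4 + 3 + 0.67 = 7.67
average_position = sum(positions) / len(positions)  # 22 / 12 = 1.83
print(f"Weighted rank score: {weighted_score:.2f}, average position: {average_position:.2f}")
```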
3 Citation source diversity
When AI engines cite their sources for recommending you, how many distinct source domains appear? A company cited only from its own website is fragile — if its ranking changes, the AI loses its only source and drops the recommendation. A company cited from 5 different domains (own site, Trustpilot, industry press, Reddit, Wikipedia) has a robust moat.
Example: Across 12 prompts where you are mentioned, engines cite 20 URLs total.
If those come from 8 unique domains: Diversity = 8 ÷ 20 = 0.40
Anything under 0.20 means you are over-relying on one or two sources (usually your own site). Above 0.40 is solid. Above 0.60 is a defensible moat.
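A sketch of the diversity calculation, assuming you have logged every cited URL; the sample URLs are placeholders:

```python
from urllib.parse import urlparse

# Every URL the engines cited across prompts where you were mentioned.
# These three are placeholders; the example above assumes 20 URLs in total.
cited_urls = [
    "https://www.example.com/pricing",
    "https://www.trustpilot.com/review/example.com",
    "https://en.wikipedia.org/wiki/Example",
    # ... 17 more
]

unique_domains = {urlparse(u).netloc for u in cited_urls}
diversity = len(unique_domains) / len(cited_urls)
print(f"{len(unique_domains)} domains / {len(cited_urls)} citations = {diversity:.2f}")
```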
4 Share of voice (SoV)
In prompts where at least one company is mentioned, what percentage of mentions across all engines go to you vs. your competitors? Share of voice is the single clearest signal of category position.
Example: 48 total tests. You are mentioned 18 times. Top competitor 22 times. Other competitors 35 times combined.
SoV = 18 ÷ (18 + 22 + 35) = 18 ÷ 75 = 24%
Track SoV against your top 3 competitors specifically. If your total SoV is rising but competitor SoV is rising faster, you are losing ground even as your absolute numbers improve.
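The share-of-voice division, sketched with the counts from the example:

```python
from collections import Counter

# Mention counts across all 48 tests, from the example above.
mentions = Counter({"you": 18, "top competitor": 22, "others combined": 35})

total_mentions = sum(mentions.values())  # 75
sov = mentions["you"] / total_mentions   # 18 / 75
print(f"Share of voice: {sov:.0%}")      # 24%
```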
5 Attributed new business
The ground-truth metric. How many new inquiries, leads, demos, or customers arrived explicitly because AI search named you? Capture this via two mechanisms.
Intake form self-reporting: Add "How did you hear about us?" with options including "ChatGPT / OpenAI," "Perplexity," "Gemini / Google AI," "Copilot / Microsoft AI," "Claude," and a free-text "Other AI tool." Not perfect — people forget, misattribute, or check the first option — but directionally correct over enough volume.
CRM citation column: A column in your CRM called "AI-attributed" (boolean or source name). Require sales to fill it on every new inbound during the first qualifying call. Over 60–90 days you will build enough data to see the trend.
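A sketch of the monthly attribution pull, assuming your CRM exports new inbound leads as a CSV containing the "AI-attributed" column described above; the file name and column name are assumptions, so adapt them to your export:

```python
import csv

# Export of new inbound leads for the period. "ai_attributed" is the column
# sales fills in on the first qualifying call; file and column names are assumed.
with open("new_inbound.csv", newline="") as f:
    leads = list(csv.DictReader(f))

attributed = [row for row in leads
              if row["ai_attributed"].strip().lower() not in ("", "no", "false")]
rate = len(attributed) / len(leads) if leads else 0.0
print(f"AI-attributed: {len(attributed)} of {len(leads)} new inbound ({rate:.0%})")
```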
6 Brand mention consistency
When AI engines describe your company, do they describe you correctly? This is a qualitative metric but tracks a real problem: AI engines can hallucinate details, describe you with outdated positioning, or conflate you with a similarly named competitor. Check the accuracy of what engines say about your offering, pricing, markets, and team.
Example: 18 total mentions. 14 describe you accurately. 3 describe you with outdated positioning. 1 conflates you with a competitor.
Accuracy Rate = 14 ÷ 18 = 77.8%
Aim for 90%+ accuracy. Errors usually come from inconsistent positioning across your website, press, and directories. Fix the source of truth, and mentions align over 4–8 weeks as engines refresh their data.
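The accuracy tally as a sketch, using the labels from the example:

```python
from collections import Counter

# Label each mention after reading it: accurate, outdated, or conflated.
labels = ["accurate"] * 14 + ["outdated"] * 3 + ["conflated"] * 1

tally = Counter(labels)
accuracy = tally["accurate"] / len(labels)
print(f"Accuracy rate: {accuracy:.1%}")  # 14 / 18 = 77.8%
```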
The monthly measurement cadence (20 minutes)
Once you have baselined, running the monthly cadence takes about 20 minutes for a company with 12 prompts and 3 competitors tracked.
| Step | Action | Time |
|---|---|---|
| 1. Run prompts | 12 prompts × 4 engines. Log output in a sheet with columns: engine, prompt, companies mentioned (in order), cited URLs. | 10 min |
| 2. Calculate metrics | Prompt coverage, weighted rank score, SoV vs top 3 competitors, citation diversity. | 4 min |
| 3. Check accuracy | For 3 random prompts where you are mentioned, read the description carefully. Is it accurate? | 3 min |
| 4. Pull attribution | Export CRM new-inbound data, count AI-attributed and total. Calculate percent. | 2 min |
| 5. Log and compare | Paste the 4 quantitative metrics + 1 qualitative into a running sheet. Compare to last month. | 1 min |
Tools that speed this up
Several platforms now automate the prompt-running part: Profound, Otterly.AI, AthenaHQ, and Peec.AI run scheduled prompts across major AI engines and log mentions automatically. Costs range from $99 to $500 per month for SMB tiers. They typically cut the "run prompts" step from 10 minutes to 30 seconds of dashboard review.
That said, the interpretation is still manual. Tools show you that your prompt coverage dropped from 42% to 37%; they do not tell you why. The value of the monthly review is not the data collection — it is the 5 minutes you spend looking at the numbers and asking, "What changed in the last 30 days?"
What to do with the data
Once you have 3 months of data, the dashboard tells you exactly where to invest.
If prompt coverage is flat but citation diversity is rising: You are building a moat. Stay the course on third-party mentions. Coverage will follow in 60–90 days.
If prompt coverage is rising but citation diversity is low: You are winning on your own content — fragile. Prioritize earned mentions on authoritative sources.
If SoV vs top competitor is dropping even as your coverage rises: Your competitor is growing faster. Audit what they have shipped in the last 90 days and find gaps you can own.
If attribution is zero despite strong coverage: The prompts you are tracking do not reflect what your customers actually ask. Re-interview 5 recent customers about what they searched and rebuild the prompt list.
Frequently Asked Questions
How do I know if my AI visibility is actually improving?
Track six metrics monthly: prompt coverage, recommendation rank, citation source diversity, share of voice, attributed new business, and brand mention consistency. If prompt coverage and citation diversity are both trending up quarter over quarter, you are making real progress. A single-metric win (coverage up but diversity flat, or mentions up but rank slipping) is not enough — AI engines score you on the full pattern.
How often should I measure AI visibility?
Baseline once, then measure monthly for prompt coverage and citation sources. Review the full dashboard quarterly. AI engines update their underlying data every few weeks, so weekly measurement adds noise without adding signal. Monthly is the right cadence for most companies.
Can I automate AI visibility tracking?
Partially. Tools like Profound, Otterly, and AthenaHQ can auto-run prompts against ChatGPT, Perplexity, Gemini, and Claude and log which companies are mentioned. They automate the data collection part but not the interpretation — a human still needs to spot patterns and decide what to work on next.
What is a good prompt coverage percentage?
For a mid-market company in a competitive category, 40–60% is good, 60–80% is excellent, and 80%+ is dominant. For local businesses or niche B2B segments the benchmarks shift higher (50–75% is achievable) because there is less competition. Start with a baseline and track relative improvement rather than chasing an absolute number.
Does AI visibility correlate with revenue?
Yes, but not perfectly. The strongest correlation is between citation source diversity (how many different sources the engine pulls from when recommending you) and attributed new business. Prompt coverage without diversity is fragile — a single algorithm update can drop you out. Diverse citation sources are a stable moat.
Want us to baseline your six metrics for you?
Our free AI scan delivers prompt coverage, share of voice, and citation source diversity across all four major engines.
Request your free scan →