Why classic analytics fails for AI search
Traditional SEO dashboards — Google Analytics 4, Google Search Console, Ahrefs, Semrush — were built around click events. A user searched, saw a result, clicked, landed on your site, maybe converted. Every step left a trail.
AI search breaks that model in three places. First, the user often does not click. They get an answer in the AI response and act on it directly — phoning you, emailing you, or visiting your business. Second, when they do click, the referrer often does not identify the AI source reliably; Google Analytics sees "direct traffic" or a generic chat.openai.com referral. Third, and most critically, not being recommended produces no signal at all. You cannot measure a click that was never made.
So AI visibility measurement is fundamentally different. It is closer to PR measurement (share of voice, mention quality) than to SEO measurement (rankings, sessions, conversions). The metrics that matter are not the metrics in your existing dashboard.
The six metrics that matter
1 Prompt coverage
The percentage of prompt-engine tests (each target prompt run through each of the four major AI engines) whose responses name your company. This is the headline number.
Example: 12 prompts × 4 engines = 48 total tests. Company is named in 18. Coverage = 18 ÷ 48 = 37.5%
Build a list of 10–15 prompts that a prospect in your category would actually type. Mix commercial ("best [category] in [market]"), informational ("how do I choose a [product] for [use case]"), and comparative ("X vs Y"). Run each through ChatGPT, Perplexity, Gemini, and Copilot, and log which ones name you.
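A minimal sketch of the coverage arithmetic in Python, assuming you log one row per prompt-engine test; the field names and sample rows here are illustrative, not from any specific tool:

```python
# One row per prompt x engine test (12 prompts x 4 engines = 48 rows).
# Field names and sample values are illustrative.
tests = [
    {"engine": "ChatGPT", "prompt": "best [category] in [market]", "mentioned": True},
    {"engine": "Perplexity", "prompt": "best [category] in [market]", "mentioned": False},
    # ... the remaining 46 rows
]

coverage = sum(t["mentioned"] for t in tests) / len(tests)
print(f"Prompt coverage: {coverage:.1%}")  # 18 of 48 in the example above -> 37.5%
```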
2 Recommendation rank
Not just whether you are mentioned, but where in the list. First-named companies get roughly 3x the attention of third-named companies. Track the position of your brand in each prompt where you are mentioned.
Example: Mentioned 1st in 4 prompts, 2nd in 6, 3rd in 2.
Score = 4(1/1) + 6(1/2) + 2(1/3) = 4 + 3 + 0.67 = 7.67
The weighted-rank score discounts lower positions: a first-place mention counts at full weight, a third-place mention at one third. Moving from average position 2.4 to 1.8 is a meaningful improvement even if coverage stays flat.
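The same arithmetic as a short Python sketch, using the positions from the example above:

```python
# Position of your brand in each prompt where you are mentioned:
# 1st in 4 prompts, 2nd in 6, 3rd in 2.
positions = [1] * 4 + [2] * 6 + [3] * 2

weighted_score = sum(1 / p for p in positions)      # 4 + 3 + 0.67 = 7.67
average_position = sum(positions) / len(positions)  # 22 / 12 = 1.83
print(f"Weighted rank score: {weighted_score:.2f}, average position: {average_position:.2f}")
```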
3 Citation source diversity
When AI engines cite their sources for recommending you, how many distinct source domains appear? A company cited only from its own website is fragile — if its ranking changes, the AI loses its only source and drops the recommendation. A company cited from 5 different domains (own site, Trustpilot, industry press, Reddit, Wikipedia) has a robust moat.
Example: Across 12 prompts where you are mentioned, engines cite 20 URLs total.
If those come from 8 unique domains: Diversity = 8 ÷ 20 = 0.40
Anything under 0.20 means you are over-relying on one or two sources (usually your own site). Above 0.40 is solid. Above 0.60 is a defensible moat.
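A sketch of the diversity calculation, assuming you have logged every cited URL; the sample URLs are placeholders:

```python
from urllib.parse import urlparse

# Every URL the engines cited across prompts where you were mentioned.
# These three are placeholders; the example above assumes 20 URLs in total.
cited_urls = [
    "https://www.example.com/pricing",
    "https://www.trustpilot.com/review/example.com",
    "https://en.wikipedia.org/wiki/Example",
    # ... 17 more
]

unique_domains = {urlparse(u).netloc for u in cited_urls}
diversity = len(unique_domains) / len(cited_urls)
print(f"{len(unique_domains)} domains / {len(cited_urls)} citations = {diversity:.2f}")
```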
4 Share of voice (SoV)
In prompts where at least one company is mentioned, what percentage of mentions across all engines go to you vs. your competitors? Share of voice is the single clearest signal of category position.
Example: 48 total tests. You are mentioned 18 times. Top competitor 22 times. Other competitors 35 times combined.
SoV = 18 ÷ (18 + 22 + 35) = 18 ÷ 75 = 24%
Track SoV against your top 3 competitors specifically. If your total SoV is rising but competitor SoV is rising faster, you are losing ground even as your absolute numbers improve.
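The share-of-voice division, sketched with the counts from the example:

```python
from collections import Counter

# Mention counts across all 48 tests, from the example above.
mentions = Counter({"you": 18, "top competitor": 22, "others combined": 35})

total_mentions = sum(mentions.values())  # 75
sov = mentions["you"] / total_mentions   # 18 / 75
print(f"Share of voice: {sov:.0%}")      # 24%
```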
5 Attributed new business
The ground-truth metric. How many new inquiries, leads, demos, or customers arrived explicitly because AI search named you? Capture this via two mechanisms.
Intake form self-reporting: Add "How did you hear about us?" with options including "ChatGPT / OpenAI," "Perplexity," "Gemini / Google AI," "Copilot / Microsoft AI," "Claude," and a free-text "Other AI tool." Not perfect — people forget, misattribute, or check the first option — but directionally correct over enough volume.
CRM citation column: A column in your CRM called "AI-attributed" (boolean or source name). Require sales to fill it on every new inbound during the first qualifying call. Over 60–90 days you will build enough data to see the trend.
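A sketch of the monthly attribution pull, assuming your CRM exports new inbound leads as a CSV containing the "AI-attributed" column described above; the file name and column name are assumptions, so adapt them to your export:

```python
import csv

# Export of new inbound leads for the period. "ai_attributed" is the column
# sales fills in on the first qualifying call; file and column names are assumed.
with open("new_inbound.csv", newline="") as f:
    leads = list(csv.DictReader(f))

attributed = [row for row in leads
              if row["ai_attributed"].strip().lower() not in ("", "no", "false")]
rate = len(attributed) / len(leads) if leads else 0.0
print(f"AI-attributed: {len(attributed)} of {len(leads)} new inbound ({rate:.0%})")
```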
6 Brand mention consistency
When AI engines describe your company, do they describe you correctly? This is a qualitative metric but tracks a real problem: AI engines can hallucinate details, describe you with outdated positioning, or conflate you with a similarly named competitor. Check the accuracy of what engines say about your offering, pricing, markets, and team.
Example: 18 total mentions. 14 describe you accurately. 3 describe you with outdated positioning. 1 conflates you with a competitor.
Accuracy Rate = 14 ÷ 18 = 77.8%
Aim for 90%+ accuracy. Errors usually come from inconsistent positioning across your website, press, and directories. Fix the source of truth, and mentions align over 4–8 weeks as engines refresh their data.
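The accuracy tally as a sketch, using the labels from the example:

```python
from collections import Counter

# Label each mention after reading it: accurate, outdated, or conflated.
labels = ["accurate"] * 14 + ["outdated"] * 3 + ["conflated"] * 1

tally = Counter(labels)
accuracy = tally["accurate"] / len(labels)
print(f"Accuracy rate: {accuracy:.1%}")  # 14 / 18 = 77.8%
```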
The monthly measurement cadence (20 minutes)
Once you have baselined, running the monthly cadence takes about 20 minutes for a company with 12 prompts and 3 competitors tracked.
| Step | Action | Time |
|---|---|---|
| 1. Run prompts | 12 prompts × 4 engines. Log output in a sheet with columns: engine, prompt, companies mentioned (in order), cited URLs. | 10 min |
| 2. Calculate metrics | Prompt coverage, weighted rank score, SoV vs top 3 competitors, citation diversity. | 4 min |
| 3. Check accuracy | For 3 random prompts where you are mentioned, read the description carefully. Is it accurate? | 3 min |
| 4. Pull attribution | Export CRM new-inbound data, count AI-attributed and total. Calculate percent. | 2 min |
| 5. Log and compare | Paste the 4 quantitative metrics + 1 qualitative into a running sheet. Compare to last month. | 1 min |
Tools that speed this up
Several platforms now automate the prompt-running part: Profound, Otterly.AI, AthenaHQ, and Peec.AI run scheduled prompts across major AI engines and log mentions automatically. Costs range from $99 to $500 per month for SMB tiers. They typically cut the "run prompts" step from 10 minutes to 30 seconds of dashboard review.
That said, the interpretation is still manual. Tools show you that your prompt coverage dropped from 42% to 37%; they do not tell you why. The value of the monthly review is not the data collection — it is the 5 minutes you spend looking at the numbers and asking, "What changed in the last 30 days?"
What to do with the data
Once you have 3 months of data, the dashboard tells you exactly where to invest.
If prompt coverage is flat but citation diversity is rising: You are building a moat. Stay the course on third-party mentions. Coverage will follow in 60–90 days.
If prompt coverage is rising but citation diversity is low: You are winning on your own content — fragile. Prioritize earned mentions on authoritative sources.
If SoV vs top competitor is dropping even as your coverage rises: Your competitor is growing faster. Audit what they have shipped in the last 90 days and find gaps you can own.
If attribution is zero despite strong coverage: The prompts you are tracking do not reflect what your customers actually ask. Re-interview 5 recent customers about what they searched and rebuild the prompt list.
Frequently Asked Questions
How do I know if my AI visibility is actually improving?
Track six metrics monthly: prompt coverage, recommendation rank, citation source diversity, share of voice, attributed new business, and brand mention consistency. If prompt coverage and citation diversity are both trending up quarter over quarter, you are making real progress. A single-metric win (coverage up but diversity flat, or mentions up but rank slipping) is not enough — AI engines score you on the full pattern.
How often should I measure AI visibility?
Baseline once, then measure monthly for prompt coverage and citation sources. Review the full dashboard quarterly. AI engines update their underlying data every few weeks, so weekly measurement adds noise without adding signal. Monthly is the right cadence for most companies.
Can I automate AI visibility tracking?
Partially. Tools like Profound, Otterly, and AthenaHQ can auto-run prompts against ChatGPT, Perplexity, Gemini, and Claude and log which companies are mentioned. They automate the data collection part but not the interpretation — a human still needs to spot patterns and decide what to work on next.
What is a good prompt coverage percentage?
For a mid-market company in a competitive category, 40–60% is good, 60–80% is excellent, and 80%+ is dominant. For local businesses or niche B2B segments the benchmarks shift higher (50–75% is achievable) because there is less competition. Start with a baseline and track relative improvement rather than chasing an absolute number.
Does AI visibility correlate with revenue?
Yes, but not perfectly. The strongest correlation is between citation source diversity (how many different sources the engine pulls from when recommending you) and attributed new business. Prompt coverage without diversity is fragile — a single algorithm update can drop you out. Diverse citation sources are a stable moat.
Want us to baseline your six metrics for you?
Our free AI scan delivers prompt coverage, share of voice, and citation source diversity across all four major engines.
Request your free scan →