How to Measure AI Share of Voice (+ 3 Tools)

Last Updated on March 6, 2026 by Alex Birkett

When someone asks ChatGPT, Gemini, or Perplexity for the best tool in your category, you’re either in the answer or you’re invisible.

AI share of voice is the metric that quantifies this over time, against your competitors, and across a large set of prompts related to your category.

And while most marketers I talk to have heard of it, very few are measuring it well. The ones who do gain a clear mirror of their digital visibility (and usually a good sense of which actions will improve it).

I run Omniscient, an organic growth agency, and this metric has become central to how we think about visibility for our clients. It hasn’t replaced search rankings or classic SEO metrics, but it captures something rankings and traffic can’t – your gestalt web presence within a category, along with your sentiment and messaging accuracy. It encompasses your SEO work, but it also reflects signals beyond your own SEO.

The AI share of voice formula is simple.

The hard part is twofold:

  1. Building a prompt set that reflects how your customers think and speak
  2. Identifying and prioritizing actions to improve AI share of voice.

This piece will cover how to calculate share of voice in AI engines, and most importantly, how to establish a set of prompts that will give you useful information.

What Is AI Share of Voice?

AI share of voice measures the percentage of AI-generated responses that mention, recommend, or cite your brand across a defined set of prompts, relative to your competitors.

Traditional share of voice tracked ad spend, media mentions, or organic search visibility (share of search being a specific variant of SEO share of voice).

AI share of voice tracks something different: how often AI systems include your brand when answering the questions your buyers are asking.

The distinction matters because AI search is zero-sum in a way traditional search never was.

In Google’s organic results, ten brands could rank on page one (more if you include ads). In an AI-generated answer, maybe 3-5 get mentioned. Often just one gets the primary recommendation. When your competitor gains share, you lose it: there’s no expanding the page (for now at least).

And the shift is happening faster than most teams realize.

Zero-click searches now account for a significant portion of Google queries. AI Overviews are appearing on more and more SERPs. ChatGPT, Perplexity, and Gemini are becoming the first stop for product research, especially in B2B.

If your content strategy is still optimized exclusively for ten blue links, you’re optimizing for a shrinking surface area.

AI Share of Voice Formula

The basic share of voice calculation:

AI Share of Voice = (Number of AI responses mentioning your brand ÷ Total AI responses for your prompt set) × 100

If you track 1500 prompt outputs across ChatGPT, Gemini, and Perplexity, and your brand appears in 300 of those responses, your AI share of voice is 20%.
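The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline; the response list is a stand-in for whatever outputs you actually collect:

```python
# Minimal sketch of the AI share of voice calculation.
# `responses` is a hypothetical list of raw AI answers collected
# from your prompt set across ChatGPT, Gemini, and Perplexity.

def ai_share_of_voice(responses: list[str], brand: str) -> float:
    """Percentage of responses that mention the brand at least once."""
    if not responses:
        return 0.0
    mentions = sum(1 for r in responses if brand.lower() in r.lower())
    return mentions / len(responses) * 100

# The worked example from the text: 300 mentions out of 1,500 outputs.
responses = ["...Omniscient Digital..."] * 300 + ["...a competitor..."] * 1200
print(ai_share_of_voice(responses, "Omniscient Digital"))  # → 20.0
```

A real implementation would also normalize brand aliases ("Omniscient" vs. "Omniscient Digital") before matching, which a naive substring check glosses over.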

Technically, you could measure this a few different ways (and different GEO tools have different ways of calculating SOV).

Entity-based SOV counts how often your brand appears as a named recommendation (e.g., “I’d recommend Omniscient Digital for enterprise SEO”). Citation-based SOV counts how often your content is cited as a source in the AI’s response. We call the latter “citation share,” not “share of voice,” because the citation is often invisible to the user (though it does apparently influence the answer).

Both matter, but entity-based SOV tends to be the more actionable metric for most B2B teams because it maps directly to whether you’re being recommended.
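The two counts come from different parts of a collected answer, which is easy to see in code. A hedged sketch, assuming each answer is stored as a dict with its text and cited source URLs (both field names are hypothetical):

```python
# Separating entity-based SOV from citation share for one answer.
# `text` and `sources` are assumed fields from your own data collection.

def classify(answer: dict, brand: str, domain: str) -> dict:
    """Flag whether a single answer mentions the brand and/or cites its site."""
    return {
        "entity_mention": brand.lower() in answer["text"].lower(),
        "citation": any(domain in url for url in answer["sources"]),
    }

answer = {
    "text": "For enterprise SEO, I'd recommend Omniscient Digital.",
    "sources": ["https://beomniscient.com/blog/ai-search"],
}
print(classify(answer, "Omniscient Digital", "beomniscient.com"))
# → {'entity_mention': True, 'citation': True}
```

Aggregating the first flag across your prompt set gives entity-based SOV; aggregating the second gives citation share.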

The Formula is Easy. The Prompt Set is Everything.

Now, here’s where most teams go wrong: they nail the formula and completely botch the prompt set.

The prompt set IS the measurement. It determines what you’re measuring, who you’re measuring it for, and whether the results actually correlate with business outcomes. I wrote an entire essay about this titled “how measurement changes behavior.”

You could feasibly run a set of 100 prompts, all of which include your brand name – such as “tell me why Omniscient is the best agency in the world.” And lo and behold, your share of voice will be nearly perfect.

But that’s likely not representative of your target audience or what they are actually searching for.

You could go the other direction and take search keywords related to your industry – things like “SEO,” “content marketing,” or “what is AI search.”

In these, you may have a high citation share, but nearly zero brand mentions in the outputs. So this is also not the most useful representation of your AI share of voice. Your brand’s visibility in the outputs is what drives the most value.

What you track matters.

I’ve seen teams build prompt sets by brainstorming in a conference room, or worse, by dumping their top keywords into a prompt template. Both approaches produce a metric that looks rigorous but tells you almost nothing about your actual visibility to buyers.

The right approach starts with your customers’ words, maps to your product and category entry points, and is calibrated through search data.

How to Build a Prompt Set That Actually Means Something

Think of prompt set construction like building a survey instrument. The questions you include determine the validity of your results. Bias the questions, and you bias the measurement.

Your prompt set should be sourced from three layers, each one expanding and validating the last:

  1. Baseline Brand and Positioning
  2. Voice of Customer
  3. Search or Channel Data

1. Start with Baseline Brand and Positioning Details

First and foremost, you should outline which categories you compete within, what your core features and differentiators are, who you compete with, and generally how you want to be known or associated.

These are foundational brand and product marketing activities that define the arena or the terrain in which you hope to win.

For instance, as an agency, I can compete in a few categories of different hierarchical value:

  • Professional services
  • Marketing agencies
  • Digital marketing agencies
  • Organic marketing agencies
  • SEO / content / digital PR agencies
  • Technical SEO / thought leadership content / media relations agencies
  • Agencies in NYC, Chicago, Boston, Austin
  • Technical SEO agencies based in NYC that work with global enterprise B2B software companies and have localization and content translation capabilities

You get the point. How you define your market dictates which terms, queries, keywords, or prompts will matter to you. And relevance is everything.

In other words, even if a potential customer is looking for “thought leadership content,” if I neither serve that market nor want to compete in that category, it’s foolish and a waste of my time to track it. So before you layer on voice of customer research, you first have to ground everything in your own business context to calibrate for relevance.

2.1 Layer on First Party Voice of Customer

Your best source of prompts is the language your buyers already use. Not what you think they’re asking, but what they’re actually asking.

Pull from sales call transcripts, demo recordings, support tickets, onboarding calls, and win/loss interviews. Look for the questions prospects ask before they buy. These tend to follow predictable patterns:

  • “What’s the best way to [solve problem]?”
  • “What tools do people use for [use case]?”
  • “[Your brand] vs. [competitor] – which is better for [specific need]?”
  • “How does [category] work for [industry/company size]?”

These are the types of queries people are now typing into ChatGPT. Much more personalized and long tail.

This is really the step in which we move from keywords or short-tail phrases into optimizing for AI visibility. In traditional SEO, you’d likely target a strong short-tail phrase like “best CRM,” and while that is still important, AI-driven platforms have trained people to add a lot more color and depth to their queries.

Additionally, AI-powered search has a large element of personalization and memory, so even if a user doesn’t explicitly state “I am a startup founder looking for X, Y, and Z,” ChatGPT and other tools will infer it. Using voice of customer data to select and prioritize your prompts makes this type of intent explicit, which allows you to gauge your brand’s visibility against real questions, not just keywords.

You can collect this data in so many ways. If you have a product-led motion, mine your in-app feedback, feature requests, and trial-to-paid conversion surveys. If you’re sales-led, your CRM notes and call recordings are gold.

The closer you get to the words your buyers use when they’re actively evaluating solutions, the more valid your prompt set becomes.

2.2 Mine Communities and Forums

Your first-party data has a blind spot: it only captures people who already found you.

Communities and forums surface the questions people ask *before* they know you exist.

Reddit is the most valuable source here, and it’s doubly important because AI models draw heavily from Reddit in their training data and retrieval pipelines. The questions people ask on r/marketing, r/SaaS, r/startups, or whatever niche subreddits serve your category are often the same questions they’ll ask ChatGPT, sometimes verbatim.

Beyond Reddit: G2 and Capterra review threads (especially the “what do you wish this product did?” answers), industry Slack communities, Discord servers, Quora, and niche forums. Look for three types of queries:

  • Comparison queries: “Has anyone used X vs. Y? Which is better for [use case]?”
  • Best-for queries: “What’s the best [tool/service] for [specific constraint]?”
  • Problem-solution queries: “We’re struggling with [problem]. What are teams doing to fix this?”

Don’t just grab the questions as is. Rewrite them as natural AI prompts. “Has anyone used Semrush vs. Ahrefs for content audits?” becomes “What’s the best SEO tool for running content audits?” or “Semrush vs Ahrefs for content audits.”

The intent transfers; the format adapts to how people query AI.

3. Triangulate Against Search Data

VOC and community research give you highly precise detail on the questions your customers and prospects are actually asking. Search data (and, more recently, prompt data) helps you quantify, segment, and prioritize your brand presence and share of voice tracking.

Use keyword research to validate and expand your prompt set (not to replace it). Pull from three sources:

  1. Keywords: Your existing keyword tracking data, plus competitor keyword analysis. Filter for informational and commercial-investigation intent. These map most directly to the types of queries AI engines answer.
  2. People Also Ask (PAA): Google’s PAA boxes are essentially a map of related questions around a topic. They reveal adjacent queries you didn’t surface in VOC research. If your first-party data shows buyers asking “what’s the best project management tool for agencies?”, PAA might surface “how do agencies manage client projects?” or “what’s the difference between project management and client management tools?”
  3. Query fan-outs and autocomplete: Google’s related searches and autocomplete suggestions show how queries branch and narrow. They’re useful for finding long-tail variations and use-case-specific prompts that your VOC data might miss.

The search data layer is a calibration step. It ensures your prompt set isn’t just accurate to your existing customers but also covers the broader landscape of questions buyers ask across the entire journey.

It’s also important because AI search platforms are still fundamentally rooted in search engines, and there’s a high degree of tactical overlap between SEO and AEO. It wouldn’t make sense at this point to look at AI search platforms in isolation without also looking at classic search engines and keywords.

Structure Your Final Prompt Set

Once you’ve sourced prompts from all three layers, organize them:

  1. By product or category entry point. We segment our prompts into different categories for our agency and for most clients. Unless you are a very specialized dedicated point solution, you likely have a few different categories or features that people could search for. We split, for example, into “digital PR,” “SEO,” “Content,” “AI search,” “Analytics,” and “CRO.” This is the most important classification because we can gauge our brand versus industry competitors in a specific category, the closest approximation to true share of voice analysis.
  2. By funnel stage. Awareness prompts (“what is AI search?”), consideration prompts (“best AI search agencies for B2B SaaS”), and decision prompts (“Omniscient Digital pricing” or “Omniscient Digital vs. [competitor]”). Your share of voice will vary a ton by stage, with more brand mentions and brand presence in the bottom of the funnel. But that doesn’t mean you should ignore earlier stages of the funnel, as there are often distinct opportunities to educate the market around a problem.
  3. By persona, if relevant. A CTO and a VP of Marketing ask different questions about the same product category. If your buyers span multiple personas, your prompt set should reflect that. Some GEO tools (Scrunch, in particular) let you segment visibility by persona, which makes this segmentation actionable.

Aim for 100-200 prompts to start. For massive global brands, we’re tracking thousands. For startups, we may struggle to hit 100. But you want enough prompts to get trustworthy directional data and signal.

Also, audit and refresh quarterly. Align with your VOC collection cycles. Buyer language evolves, new competitors emerge, and AI models update their training data. A stale prompt set measures a market that no longer exists.

Measuring AI Share of Voice Across Platforms

Once your prompt set is dialed in, run it across the AI platforms that matter for your audience: ChatGPT, Gemini, Perplexity, Claude, and Google AI Overviews at a minimum.

Though brand presence and mention frequency are your closest proxy for AI share of voice, there are other metrics that can add more depth to your brand’s AI share:

  • Position in answer. Being the first brand mentioned in a recommendation list is meaningfully different from being the fifth. Some tools track this. I actually don’t love this metric because the variance is stupidly high, so there’s not much signal here or actionable takeaways.
  • Sentiment. “Industry leader in X” and “popular but controversial option” are both AI platform mentions, but they’re qualitatively different. Sentiment tracking tells you whether AI is recommending you enthusiastically, neutrally, or with caveats.
  • Citation vs. entity mention. Is the AI citing your blog post as a source, or recommending your product as a solution? Both contribute to visibility, but entity mentions (being recommended) tend to be higher value than citations (being referenced). Again, we categorize these metrics differently, the latter being “citation share,” which is a good leading indicator for your influence on the conversation.
  • Platform-level variance. This is the one that surprises most teams. You might hold 35% share of voice in ChatGPT and 8% in Perplexity. The models draw from different data sources, weight signals differently, and update on different cycles. Measuring an aggregate number without platform breakdowns hides actionable insight.

For example, here’s Omniscient’s brand visibility in Gemini:

And here it is in ChatGPT:

Our share of voice drops quite a bit within ChatGPT, and given its prominence as an AI-powered search platform, we may want to dedicate more resources to those citation sources.

Consider weighting platforms by your audience’s actual usage. If 70% of your buyers use ChatGPT and 5% use Claude, a weighted SOV gives you a more accurate picture than treating every platform equally.
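The weighting itself is just a dot product of per-platform SOV and usage shares. A sketch, where the weights are assumptions you supply from your own audience research (they should sum to 1):

```python
# Usage-weighted share of voice across platforms.
# SOV figures and weights below are illustrative, not real data.

def weighted_sov(sov_by_platform: dict[str, float],
                 weights: dict[str, float]) -> float:
    """Blend per-platform SOV (%) by audience usage share."""
    return sum(sov_by_platform[p] * weights[p] for p in weights)

sov = {"chatgpt": 35.0, "perplexity": 8.0, "claude": 20.0}
weights = {"chatgpt": 0.70, "perplexity": 0.25, "claude": 0.05}
print(weighted_sov(sov, weights))  # → 27.5
```

Note how the blended 27.5% sits far closer to the ChatGPT figure than a naive three-way average (21%) would, which is the whole point of weighting.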

AI SOV Benchmarks: What Good Looks Like

Caveat: I hate benchmarks. They are nearly useless as they strip away context and aggregate metrics across different business models. This is true for conversion rate (a good conversion rate is one that is better than last month’s), but with AI share of voice, you’re completely in control of the prompts you track, so merely by choosing different prompts, you completely alter your visibility baseline.

Generally, however, if I am exclusively looking at bottom-of-funnel, solution-aware prompts related to a category (i.e. “best CRM software for small businesses”), I’m bucketing AI search visibility as such:

  • Below 20%: You have a visibility problem. AI doesn’t know you exist for most queries in your category.
  • 20-50%: Established player. You’re present but not dominant.
  • 50-90%: Category leader territory. AI consistently includes you in answers.

The absolute number matters less than the trajectory. Are you gaining or losing share month-over-month? And against which specific competitors? Priorities also change at different stages: if you’re under 20%, your efforts should almost certainly go toward raising visibility. At 85%, the marginal gains from raising awareness are small, so sentiment quality (or category expansion) becomes the higher priority.

Track against 6-8 direct competitors, not the entire market. Compare against the brands your buyers actually evaluate alongside you. Your win/loss data tells you who those are.

3 Tools for Measuring AI Share of Voice

You can technically measure this manually: run your prompts through each AI platform, log the results in a spreadsheet, and calculate the percentages. But I would not do this. Fidelity in AI search data comes from a large set of prompts and a time-series analysis, as it is probabilistic and directional (very different from “rankings”).

Because AI responses aren’t deterministic (the same prompt can produce different results at different times), manual measurement introduces noise that tools can smooth out through repeated sampling.
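Repeated sampling is straightforward to express in code. A sketch, where `run_prompt` is a hypothetical stand-in for whatever API call or tool export returns one answer for one prompt:

```python
# Smoothing non-deterministic answers by sampling each prompt n times.
# `run_prompt` is an assumed callable: prompt string in, answer string out.

def sampled_mention_rate(run_prompt, prompt: str, brand: str, n: int = 10) -> float:
    """Mention rate for one prompt across n independent runs (0.0-1.0)."""
    hits = [brand.lower() in run_prompt(prompt).lower() for _ in range(n)]
    return sum(hits) / n

# Fake runner for illustration: mentions the brand in half the runs.
answers = iter(["Try Omniscient Digital.", "Try another agency."] * 5)
rate = sampled_mention_rate(lambda p: next(answers), "best SEO agency?", "Omniscient Digital")
print(rate)  # → 0.5
```

A per-prompt rate like this, tracked over weeks, is the time series that turns noisy single answers into a trustworthy directional signal.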

Here are the three platforms I’d evaluate:

1. Scrunch

Scrunch occupies a unique position because it’s not just a monitoring tool; it also has the Agent Experience Platform (AXP), a technical delivery layer that serves an optimized version of your site to AI crawlers.

On the monitoring side, Scrunch tracks prompt-level analytics across multiple AI platforms: ChatGPT, Claude, Meta AI, Perplexity, Gemini, Google AI Mode, and AI Overviews. The persona and funnel-stage segmentation is particularly relevant to the prompt set methodology I outlined above. You can see your share of voice broken down by buyer persona and funnel stage, which makes the measurement directly actionable.

The AI bot crawl monitor and page audit tool add a diagnostic layer. If your SOV is low, Scrunch can help you understand whether it’s a content problem (you’re not being cited because your content isn’t good enough) or a technical problem (AI bots can’t crawl your site properly). That distinction matters a lot for knowing where to invest.

Core starts at $250/month. Agency plans from $500/month.

Best for teams that want to measure AND diagnose technical issues blocking AI visibility.

2. Profound

Profound is the category leader, backed by a recent $96 million raise and used by roughly 10% of the Fortune 500. They track across all major AI engines and offer the deepest feature set in terms of analytics, competitive intelligence, and historical trending.

The Starter plan at $99/month is limited to ChatGPT only, which makes it more of a toe-in-the-water option than a real measurement tool. The Growth plan at $399/month gets you multi-engine tracking, which is where Profound actually delivers on its promise.

Best for enterprise teams with budget who want the most comprehensive data set and don’t mind paying for it.

3. Peec AI

Peec AI is what I’d recommend for mid-market teams and agencies that want clean measurement without the complexity. The interface is simpler than Profound’s, setup is fast, and the pricing is transparent: €89/month for Starter, €199/month for Pro, with a 7-day free trial.

Unlimited team seats on all plans is a notable differentiator. Most competitors charge per seat, which gets expensive fast when you want to give visibility to your content team, SEO team, and leadership. Peec tracks across ChatGPT, Perplexity, Google AI Overviews, and other major platforms, with daily updates so you can catch visibility shifts quickly.

The tradeoff is depth. Peec won’t give you the granular analytics or historical trending that Profound offers. But for most teams starting to measure AI SOV, the simpler tool you actually use consistently beats the complex tool that collects dust.

Best for mid-market teams that want fast setup, transparent pricing, and a low learning curve.

Conclusion

AI share of voice is a leading indicator and a very effective mirror for your gestalt digital presence (often you’ll find it closely resembles your market share), not a vanity metric. But only if your prompt set reflects real buying behavior.

Most teams over-index on volume-based prompts or exclusively track BOFU keywords and under-index on intent-rich prompts that actually drive pipeline. They end up with a metric that tells them how visible they are for questions nobody asks when they’re actually making a purchase decision.

Start with your brand and positioning scaffolding and then add in your customers’ words. Expand into the communities where your future customers ask questions. Calibrate against search data so you’re not missing coverage gaps. Then measure consistently, across platforms, and track AI SOV over time.

The teams doing this well right now have a meaningful head start. AI search is still early enough that share of voice is up for grabs in most categories. That window won’t stay open long.