The SaaS Marketer's Guide to Measuring AEO Progress with ChatGPT
If you can't measure your AEO progress, you can't improve it. Here's a practical framework for tracking your AI search visibility in ChatGPT over time.

Most SaaS marketing teams investing in AEO have no idea whether it is working. They publish content, build reviews, and chase editorial placements, then check ChatGPT six months later and feel like nothing changed. The problem is not the tactics. It is the measurement. Without a structured framework for tracking AI visibility, you are flying blind.
Why Standard Marketing Metrics Miss AEO Progress
Your current analytics stack was built for a world where you can track impressions, clicks, and conversions from every channel. AEO does not work that way. When ChatGPT mentions your brand in an answer and a buyer later searches for you directly, that journey does not appear in your attribution model. The brand awareness happened in an AI conversation, not in a trackable click.
This does not mean AEO is unmeasurable. It means you need different metrics: proxy metrics that track the inputs and intermediate outputs of AI visibility rather than direct attribution. The right framework makes progress visible well before it shows up in revenue.
The Three Layers of AEO Measurement
A complete AEO measurement framework for ChatGPT has three layers. Each layer measures something different, and each requires different data collection methods.
Layer 1: Visibility metrics. These measure whether your brand appears in ChatGPT responses for your target queries. They are the closest thing to a "ranking" in AI search.
Layer 2: Accuracy metrics. These measure whether ChatGPT describes your brand correctly and in alignment with your intended positioning. Appearing in ChatGPT with the wrong description can be worse than not appearing at all.
Layer 3: Signal metrics. These measure the upstream inputs that drive AI visibility: review volume, editorial coverage, community mentions, and entity data completeness. Signal metrics are your leading indicators because they change faster than visibility metrics.
Building Your Benchmark Query Set
The core tool for AEO measurement is a benchmark query set: a fixed list of 40-60 queries you run against ChatGPT on a consistent schedule. The query set does not change over time, so you can track your performance on the same questions month after month.
- 1. Define Your Query Categories
Your benchmark set should include four types of queries: category queries ("what are the best tools for X"), use-case queries ("how do SaaS companies solve Y"), comparison queries ("X vs Y for Z type of company"), and brand-direct queries ("tell me about [your brand]"). Aim for 10-15 queries per category.
- 2. Write Realistic Query Phrasings
Write each query the way a real buyer would type it into ChatGPT, not the way you would write an SEO keyword. Long, natural, conversational phrasing performs differently in AI search than keyword-style queries. Test both phrasings for important topics.
- 3. Validate the Query Set Before Locking It
Run all queries once before finalizing. Remove any queries that produce wildly inconsistent or unrelated answers. The goal is a set of queries that reliably test your category and use-case visibility.
- 4. Document the Baseline
On your first run, record three things for each query: whether your brand appeared, how it was described (if it appeared), and which competitors appeared. This is your baseline. Every subsequent run is compared against it.
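One way to keep baseline records consistent is to give every query result the same shape. The sketch below is only an illustration: the `QueryResult` class, its field names, and the placeholder brand and competitor names are assumptions, not part of any tool mentioned in this guide.

```python
from dataclasses import dataclass, field

# The four query types described in step 1 (labels are illustrative).
CATEGORIES = ("category", "use_case", "comparison", "brand_direct")


@dataclass
class QueryResult:
    """One benchmark query and what a single run recorded for it."""
    query: str
    category: str                 # one of CATEGORIES
    brand_appeared: bool
    description: str = ""         # how the brand was described, if it appeared
    competitors: list[str] = field(default_factory=list)

    def __post_init__(self):
        # Reject typos so every query lands in a known category.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")


# Hypothetical baseline entries; "ExampleBrand" and competitors are placeholders.
baseline = [
    QueryResult(
        query="what are the best customer success platforms for B2B SaaS companies",
        category="category",
        brand_appeared=False,
        competitors=["CompetitorA", "CompetitorB"],
    ),
    QueryResult(
        query="what does ExampleBrand do and what kind of companies use it",
        category="brand_direct",
        brand_appeared=True,
        description="Customer health scoring tool for mid-market SaaS",
    ),
]
```

The same three recorded items (appearance, description, competitors) map directly onto spreadsheet columns if you prefer to stay out of code.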
Calculating Your Mention Rate
Mention rate is the percentage of benchmark queries in which your brand appears. It is the single most useful metric for tracking AEO progress over time.
Calculate it by dividing the number of queries where your brand appeared by the total number of queries in your benchmark set. Run your benchmark set once a month and track the trend.
Most SaaS brands start below 15% mention rate on a well-constructed benchmark set. A brand with strong AEO investment should be targeting 35-50% mention rate within 12-18 months.
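The mention rate calculation can be sketched in a few lines. The monthly data below is invented for illustration, and the function name `mention_rate` is an assumption rather than an established API.

```python
def mention_rate(appeared: list[bool]) -> float:
    """Share of benchmark queries (0.0 to 1.0) in which the brand appeared."""
    if not appeared:
        raise ValueError("benchmark set is empty")
    # True counts as 1, False as 0, so sum() is the number of appearances.
    return sum(appeared) / len(appeared)


# Hypothetical results for a 40-query set: True means the brand appeared.
runs = {
    "month 1": [True] * 5 + [False] * 35,
    "month 2": [True] * 7 + [False] * 33,
}

for month, appeared in runs.items():
    print(f"{month}: {mention_rate(appeared):.1%}")
```

Because the query set is fixed, the denominator stays the same every month, which is what makes the trend comparable over time.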
Accuracy Scoring: Are You Being Described Correctly
Appearing in ChatGPT answers is necessary but not sufficient. You also need to be described accurately. An accuracy score gives you a second dimension of quality beyond raw visibility.
For each query where your brand appears, score the description on three factors:
- Category accuracy: does ChatGPT place you in the correct product category?
- Feature accuracy: are the described capabilities current and correct?
- Positioning accuracy: is the described buyer profile and use case the one you target?
Score each factor 0-1 and average across all appearances in the month. A 0.8 or higher accuracy score is a healthy target. Below 0.6 means your entity data has significant gaps or inaccuracies to address.
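A minimal sketch of the accuracy score follows. It assumes each factor may take any value from 0 to 1 (partial credit), averages the three factors per appearance, then averages across appearances. The "borderline" label for scores between 0.6 and 0.8 is an assumption; the guide only names the two thresholds.

```python
from statistics import mean

FACTORS = ("category", "feature", "positioning")


def accuracy_score(appearances: list[dict[str, float]]) -> float:
    """Average the three factor scores per appearance, then across appearances."""
    if not appearances:
        raise ValueError("no appearances to score this month")
    per_appearance = [mean(a[f] for f in FACTORS) for a in appearances]
    return mean(per_appearance)


def accuracy_status(score: float) -> str:
    """Map a score onto the thresholds above (middle band label is assumed)."""
    if score >= 0.8:
        return "healthy"
    if score < 0.6:
        return "significant gaps"
    return "borderline"


# Hypothetical month with two appearances.
month = [
    {"category": 1.0, "feature": 0.5, "positioning": 1.0},
    {"category": 1.0, "feature": 1.0, "positioning": 0.5},
]
score = accuracy_score(month)
print(round(score, 2), accuracy_status(score))
```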
Signal Metrics: Leading Indicators That Predict Future Visibility
Because AI visibility lags behind the signals that drive it by 6-18 months, tracking your signal metrics gives you a forward-looking view of where your visibility is headed.
Track these monthly:
| Signal Metric | Target |
|---|---|
| G2 reviews (cumulative) | Growing by 3-5 per month |
| G2 review recency | At least 50% from last 6 months |
| Editorial roundup features | 2-4 new per quarter |
| Community mentions (Reddit, forums) | 10+ per month in relevant threads |
| Crunchbase/LinkedIn profile completeness | 100% on key fields |
These metrics are controllable right now. You can launch a review campaign this week. You can pitch an editorial roundup this month. Watching signal metrics grow while visibility metrics are still lagging gives you confidence that the AEO investment is compounding correctly.
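Most of the signal targets are simple counts, but review recency is a ratio and easy to get wrong by hand. The sketch below shows one way to compute it, assuming you can export review dates; the 182-day window is an approximation of "6 months", and the dates are made up.

```python
from datetime import date, timedelta


def recent_review_share(review_dates: list[date], today: date,
                        window_days: int = 182) -> float:
    """Fraction of all reviews posted within the last `window_days` days."""
    if not review_dates:
        return 0.0
    cutoff = today - timedelta(days=window_days)
    recent = sum(1 for d in review_dates if d >= cutoff)
    return recent / len(review_dates)


# Hypothetical export of review dates.
reviews = [date(2025, 5, 1), date(2025, 3, 15), date(2024, 11, 2), date(2024, 6, 20)]
share = recent_review_share(reviews, today=date(2025, 6, 30))
print(f"{share:.0%} of reviews are recent; target met: {share >= 0.5}")
```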
Example Queries for Your ChatGPT Benchmark Set
- what are the best customer success platforms for B2B SaaS companies in 2025
- how do SaaS companies automate their customer health scoring
- compare [your brand] with [top competitor] for a 50-person SaaS company
- what does [your brand] do and what kind of companies use it
Run each of these (adapted to your actual category and brand) monthly and record the results in your tracking spreadsheet.
Building Your AEO Dashboard
You do not need sophisticated tooling to measure AEO. A well-structured spreadsheet is enough to start.
- One tab for your benchmark query set with columns for each month's results
- A mention rate trend line by month
- An accuracy score column for each month
- A competitor mention rate tracker (run the same queries for 2-3 competitors)
- A signal metrics tab tracking reviews, editorial features, and community mentions
- A notes column for changes in ChatGPT behavior or description quality
Review this dashboard monthly. Share it with leadership quarterly. It is the only way to prove that your AEO investment is working before it shows up in revenue metrics.
Frequently Asked Questions
How many queries should be in my benchmark set?
40-60 queries is the practical sweet spot. Below 30, your mention rate percentage is too sensitive to individual query variability. Above 80, the monthly testing process becomes too time-consuming without proportionally better data. Start with 40 and expand if you want finer granularity in specific areas.
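The sensitivity point is simple arithmetic: a single query changing from "absent" to "mentioned" moves the mention rate by 100 divided by the set size, in percentage points. A quick sketch (the function name is illustrative):

```python
def points_per_query(set_size: int) -> float:
    """Percentage points the mention rate moves when one query's result flips."""
    return 100 / set_size


for n in (30, 40, 60, 80):
    print(f"{n} queries: one flipped result moves the rate by {points_per_query(n):.2f} points")
```

With 30 queries, one noisy response shifts the rate by more than three points; at 60, the same noise costs under two.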
Should I use the same ChatGPT account for all benchmark queries?
Ideally, yes, with a fresh conversation (not a continuation of an old thread) for each query. Different ChatGPT accounts can have different conversation histories, and those histories may influence responses. Using a dedicated benchmark account with fresh conversations each month gives you the most consistent data.
How do I track competitor mention rate without spending all day in ChatGPT?
Choose 2-3 priority competitors and add their names to your mention rate tracker. When you run your benchmark set for your own brand, log competitor appearances at the same time, since the responses are already in front of you. Expect this to add roughly 30% to your tracking time in exchange for significant competitive insight.
What counts as a "mention" in my mention rate calculation?
Your brand name appearing anywhere in the ChatGPT response. Whether it is the top recommendation, a secondary mention, or an item in a list, it counts. You can also track position separately: how often you appear as position 1, 2, 3, etc. That gives you a more nuanced view of your ranking quality beyond raw mention rate.
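If you save response text, position can be approximated automatically. The sketch below makes a simplifying assumption: "position" is the order in which tracked brand names first appear in the response, found with a case-insensitive substring match. That is a rough proxy (it will miss misspellings and can match partial words), and all names are placeholders.

```python
from collections import Counter
from typing import Optional


def brand_position(response: str, brand: str, competitors: list[str]) -> Optional[int]:
    """1-based rank of `brand` among tracked names by first appearance, or None."""
    text = response.lower()
    first_seen = {name: text.find(name.lower()) for name in [brand, *competitors]}
    # Keep only names that occur, ordered by where they first show up.
    present = sorted((idx, name) for name, idx in first_seen.items() if idx >= 0)
    for rank, (_, name) in enumerate(present, start=1):
        if name == brand:
            return rank
    return None


# Hypothetical responses from one benchmark run.
responses = [
    "Top picks: CompetitorA, then ExampleBrand, and CompetitorB.",
    "ExampleBrand is a common choice for this use case.",
    "Most teams use CompetitorB.",
]
positions = [brand_position(r, "ExampleBrand", ["CompetitorA", "CompetitorB"])
             for r in responses]
counts = Counter(p for p in positions if p is not None)
print(positions, dict(counts))
```

A `None` position is a non-mention, so the same list also yields the mention rate.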
Is it worth paying for an AEO tracking tool vs doing this manually?
Manual tracking is viable for a single brand with a 40-60 query set. If you are tracking multiple brands, multiple markets, or multiple AI tools simultaneously, a purpose-built AEO tool like Aeotics saves significant time and gives you more consistent data across a larger query set.


