The AI Brand Audit: What We Learned Scoring 5,829 Brands
We scored 5,829 brands on how accurately AI models — ChatGPT, Claude, Gemini, Perplexity — describe them. The results are worse than most companies think.
- 60% of brands were misrepresented by at least one AI model
- 7.5% are functionally invisible to AI
- 8,008 score changes were tracked this week
This page presents our findings — the data, the patterns, the frameworks, and the methodology. It's designed to be useful whether you're a CMO trying to understand your own situation, an analyst mapping an emerging category, or a researcher studying how AI models represent commercial entities.
The BAI Distribution — Where 5,829 Brands Fall
| BAI Score | % of Brands | Count | Interpretation |
|---|---|---|---|
| 80–100 | 58.4% | ~3,405 | Strong AI presence |
| 60–79 | 25.3% | ~1,475 | Present but inconsistent |
| 40–59 | 4.9% | ~286 | Significant gaps |
| 20–39 | 3.9% | ~227 | Mostly invisible or wrong |
| 0–19 | 7.5% | ~439 | Functionally invisible |
The distribution isn't normal — it's bimodal. There's a large cluster of brands with strong AI presence (80–100) and a notable tail of brands that are invisible or misrepresented (0–39). The middle is thin. This suggests that AI brand perception tends to be binary: AI either knows you well, or it barely knows you at all. The brands in the middle are typically in transition — either improving from a fix or declining from neglect.
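If you want to reproduce the banding yourself, the mapping is mechanical. Here is a minimal sketch in Python, assuming the band boundaries in the table above; the function name and error handling are ours, not part of any published BAI spec.

```python
def bai_band(score: float) -> str:
    """Map a 0-100 BAI score to the interpretation bands in the table above."""
    if not 0 <= score <= 100:
        raise ValueError("BAI is defined on a 0-100 scale")
    if score >= 80:
        return "Strong AI presence"
    if score >= 60:
        return "Present but inconsistent"
    if score >= 40:
        return "Significant gaps"
    if score >= 20:
        return "Mostly invisible or wrong"
    return "Functionally invisible"
```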
The Four Archetypes — How AI Misunderstands Brands
From 5,829 brand audits, we identified four distinct patterns in how AI models represent — or fail to represent — brands. This framework is designed to be diagnostic: identify which archetype describes your brand, and the remediation path becomes clear.
Incumbent — 379 brands (6.5% of directory)
AI describes them accurately and recommends them in relevant buying contexts. Strong structured data, consistent messaging across authoritative sources, and enough content depth for high-confidence representation.
The risk: Incumbency isn't permanent. Of the 8,008 score changes tracked this week, Incumbent brands weren't immune — model updates, competitor improvements, and source changes can erode even strong positions.
Challenger — 463 brands (7.9% of directory)
AI knows they exist and mostly gets the basics right, but they're not the first recommendation in their category. They appear in follow-up questions and comparisons but not in initial "best tool for X" queries.
The risk: The gap is usually source authority — their differentiators appear on their own website but haven't propagated to the third-party sources AI models weight most heavily.
Phantom — 111 brands (1.9% of directory)
AI doesn't mention them. At all. Across all query types and all models, they're absent. These brands have a discoverability problem — their web presence is either too thin, technically inaccessible to AI crawlers, or insufficiently differentiated.
The risk: This is the scariest archetype because these companies don't know they have a problem. Their Google rankings might be fine. Their paid campaigns drive leads. But every AI-assisted buying decision in their category happens without them on the consideration list.
Misread — 47 brands (0.8% of directory)
The highest-urgency archetype. AI mentions these brands but gets them wrong — wrong category, wrong product description, confused with a competitor. A Misread is worse than a Phantom because AI is actively sending the wrong signal to potential buyers.
The risk: Misread brands are actively losing deals they don't know about. When a buyer asks AI about their category, AI confidently directs them to the wrong company — or describes the brand so inaccurately that the buyer disqualifies them before ever visiting the website.
Self-diagnosis: Which archetype are you?
- Ask ChatGPT and Claude: "What is [your brand]?" → If neither answers → Phantom
- If they answer, compare to your actual positioning → If wrong → Misread
- Ask: "What are the best [your category] tools?" → If you appear → Incumbent. If not → Challenger
What AI Gets Wrong, By Category
| Category | Brands | Dominant Archetype | Most Common Issue |
|---|---|---|---|
| SaaS/Cloud Software | 429 | Challenger | AI places you in a generic 'SaaS' bucket rather than your specific subcategory |
| Fintech/Financial Services | 89 | Misread | Outdated descriptions — financial services companies rebrand frequently, AI training data lags |
| Retail/E-commerce | 63 | Incumbent | Category confusion between 'retailer' and 'e-commerce platform' |
| Professional Services | 50 | Challenger | Highest source disagreement rate — AI flattens nuanced positioning into generic categories |
| Healthcare/Life Sciences | 32 | Misread | Highest Misread rate — confuses medical devices with pharma, health tech with telehealth |
SaaS/Cloud Software is the most crowded category with 429 brands. The dominant archetype is Challenger — AI knows these brands exist but doesn't recommend them first. Healthcare/Life Sciences has the highest Misread rate, where AI frequently confuses medical device companies with pharma, and health tech with telehealth. The regulatory vocabulary that differentiates these businesses is lost in AI's categorization.
The Velocity of Change — 8,008 Score Deltas in One Week
AI brand perception isn't a snapshot — it's a moving target.
- 8,008 total changes
- 74 improved
- 464 declined
- 462 neutral shifts
The negative skew: for every brand that improved this week, roughly six declined. This isn't because brands are getting worse; it's because AI models constantly update their representations, competitors keep improving their presence, and information entropy favors decay.
Weekly volatility comparison: Week 12 produced 724 deltas and Week 13 produced 276, a 2.6× swing. Our working hypothesis is model update cycles: when a major model incorporates new training data, hundreds of brand representations shift simultaneously.
The practical implication: a single audit gives you a snapshot. Continuous monitoring tells you whether you're gaining or losing ground. The brands that maintain Incumbent status aren't the ones with the best initial score — they're the ones that track and respond to changes.
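A minimal version of that monitoring loop: given last week's and this week's scores, classify each brand's movement. The 1-point noise threshold below is our assumption, not a published cutoff.

```python
from collections import Counter

def weekly_deltas(prev: dict[str, float], curr: dict[str, float],
                  threshold: float = 1.0) -> Counter:
    """Classify week-over-week BAI movement per brand.

    `prev` and `curr` map brand -> BAI score; `threshold` (our assumption)
    separates real movement from noise.
    """
    counts = Counter()
    for brand, score in curr.items():
        if brand not in prev:
            continue  # new brands have no delta to classify
        delta = score - prev[brand]
        if delta > threshold:
            counts["improved"] += 1
        elif delta < -threshold:
            counts["declined"] += 1
        else:
            counts["neutral"] += 1
    return counts
```

For example, `weekly_deltas({"acme": 72.0}, {"acme": 70.5})` returns `Counter({'declined': 1})`.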
Methodology — How BAI Is Calculated
Answer Presence
We test each brand across a minimum of 15 queries spanning 5 categories (identity, category, buyer intent, sentiment, competitor displacement) on 4 major AI models. Answer Presence measures: in what percentage of relevant queries does AI mention the brand?
Message Pull-Through
For each mention, we compare AI's description to the brand's "ground truth" — their actual positioning, category, products, and key messages. Message Pull-Through measures: what percentage of the ground truth does AI accurately convey?
Owned Citations
We track whether AI's response cites the brand's own authoritative sources (website, documentation, official content) vs. third-party sources. Higher owned citation rates correlate with more accurate and more stable representations over time.
The BAI score combines these three dimensions on a 0–100 scale. The weighting accounts for practical business impact: a brand that's completely absent (low Answer Presence) is scored differently than a brand that's present but inaccurate (low Message Pull-Through).
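We don't publish the exact weighting, so treat the sketch below as illustrative: the three dimensions are computed as described above, and the weights are an assumption chosen so that absence (low Answer Presence) hurts more than inaccuracy.

```python
def answer_presence(mentions: int, relevant_queries: int) -> float:
    """Share of relevant queries in which AI mentions the brand (0-1)."""
    return mentions / relevant_queries

def message_pull_through(conveyed_claims: int, ground_truth_claims: int) -> float:
    """Share of ground-truth claims that AI conveys accurately (0-1)."""
    return conveyed_claims / ground_truth_claims

def owned_citation_rate(owned: int, total_citations: int) -> float:
    """Share of AI citations pointing at brand-owned sources (0-1)."""
    return owned / total_citations if total_citations else 0.0

def bai(presence: float, pull_through: float, owned_rate: float,
        weights: tuple[float, float, float] = (0.5, 0.35, 0.15)) -> float:
    """Combine the three dimensions into a 0-100 score.

    The weights are illustrative assumptions, not the production weighting;
    presence dominates because an absent brand can't pull through any message.
    """
    wp, wm, wo = weights
    return 100 * (wp * presence + wm * pull_through + wo * owned_rate)
```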
What to Do About It
1. Define your ground truth. What should AI say about you? If you can't articulate this in three sentences, start here.
2. Audit your current state. Run the methodology above, manually or automated via our free tool.
3. Identify the gap. Compare ground truth to AI output and classify by archetype.
4. Fix the sources, not the model. What AI learns from is structured data, authoritative third-party profiles, website content, and technical discoverability. (A minimal structured-data example follows this list.)
5. Monitor the delta. Track whether your BAI score moves after fixes; the change data is your evidence that the intervention worked.
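As a concrete starting point for step 4, here is what "structured data" means in practice: schema.org Organization markup embedded as JSON-LD on your site. The snippet below generates a minimal example in Python; every field value is a placeholder you would replace with your own ground truth.

```python
import json

# Minimal schema.org Organization markup, emitted as JSON-LD for embedding
# in a <script type="application/ld+json"> tag. Values are placeholders;
# schema.org/Organization documents the full vocabulary.
org = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Corp",
    "url": "https://www.example.com",
    "description": "One-sentence ground-truth positioning goes here.",
    "sameAs": [
        "https://www.linkedin.com/company/example-corp",
        "https://en.wikipedia.org/wiki/Example_Corp",
    ],
}

print(json.dumps(org, indent=2))
```

The `sameAs` links matter most here: they tie your owned content to the authoritative third-party profiles that AI models weight heavily.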
Data source & limitations: All data is from Optimly's AI Brand Directory of 5,829 brands as of March 2026. BAI scores are based on queries across ChatGPT, Claude, Gemini, and Perplexity. This dataset represents the brands in our directory — the full market may behave differently. Weekly delta data covers a 7-day period; longer-term patterns may differ.
