Walk through these 5 query categories on at least 3 models (ChatGPT, Claude, Gemini). Each probes a different dimension of how AI understands your brand.
"What is [your brand]?"
"Describe [your brand]'s main products and services."
What you're testing: Does AI have basic parametric knowledge of you?
Look for: Correct industry classification, accurate product descriptions, current information.
Red flags: Wrong category entirely, outdated descriptions, confusion with similarly-named companies.
"What are the best [your category] companies?"
"Compare [your brand] to competitors in [your space]."
What you're testing: Whether AI includes you in the competitive set where buyers are looking.
Look for: Your brand appearing in the list. Position relative to competitors.
Red flags: AI lists 10 competitors and you're not among them. You're invisible in the exact moment a buyer is deciding.
"I need a [your solution type] for [common use case]. What should I consider?"
What you're testing: Whether AI recommends you when a real buyer asks. This is the money query: it simulates an actual buyer asking AI for recommendations.
Look for: Your brand appearing as a recommendation with accurate positioning.
Red flags: If you don't appear, you're losing pipeline to whoever does.
"What are the strengths and weaknesses of [your brand]?"
What you're testing: What narrative AI has internalized about you.
Look for: Strengths that match your actual positioning. Weaknesses that are fair.
Red flags: AI thinks your strength is something you deprioritized two years ago, a sign that its parametric knowledge of you is stale.
"Is [your brand] better than [top competitor] for [your core use case]?"
What you're testing: Whether AI positions you correctly relative to alternatives.
Look for: Accurate comparison of differentiators.
Red flags: AI recommending the competitor for your strongest use case, or describing your differentiators as shared features.
Important: run every query on at least 3 models (ChatGPT, Claude, Gemini). They each have different crawling behavior and different "versions" of your brand.
For buyer intent queries specifically, a good response names you in the top 3 recommendations with your actual differentiator. A bad response lists 5 competitors and doesn't mention you, or mentions you but describes your old product.

For competitor displacement queries, pay attention to how AI frames the comparison. Does it say "[competitor] is the market leader and [you] is an alternative," or does it present you as equals? The framing reveals AI's confidence hierarchy — and that hierarchy is what determines who gets the recommendation when a buyer asks a neutral question.
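If you prefer to script the queries rather than paste them by hand, a minimal sketch in Python might look like the following. Everything in it is an assumption for illustration: the model names, the placeholder brand, category, competitor, and use-case values, and the exact prompt wording are not part of the methodology above, and Gemini is omitted because it uses a separate SDK.

```python
# Illustrative sketch only. Model names, placeholders, and prompt wording are
# assumptions, not prescriptions. Requires the `openai` and `anthropic` packages
# and OPENAI_API_KEY / ANTHROPIC_API_KEY environment variables.
from openai import OpenAI
import anthropic

BRAND = "YourBrand"                        # hypothetical placeholder
CATEGORY = "endpoint security"             # hypothetical placeholder
COMPETITOR = "TopCompetitor"               # hypothetical placeholder
USE_CASE = "securing a remote workforce"   # hypothetical placeholder

QUERIES = {
    "Direct identity": f"What is {BRAND}?",
    "Category placement": f"What are the best {CATEGORY} companies?",
    "Buyer intent": f"I need a {CATEGORY} solution for {USE_CASE}. What should I consider?",
    "Sentiment probe": f"What are the strengths and weaknesses of {BRAND}?",
    "Competitor displacement": f"Is {BRAND} better than {COMPETITOR} for {USE_CASE}?",
}

openai_client = OpenAI()
anthropic_client = anthropic.Anthropic()

def ask_chatgpt(prompt: str) -> str:
    resp = openai_client.chat.completions.create(
        model="gpt-4o",  # assumed model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def ask_claude(prompt: str) -> str:
    resp = anthropic_client.messages.create(
        model="claude-3-5-sonnet-latest",  # assumed model name
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

for category, query in QUERIES.items():
    print(f"\n=== {category} ===")
    for model_label, ask in (("ChatGPT", ask_chatgpt), ("Claude", ask_claude)):
        answer = ask(query)
        flag = "mentioned" if BRAND.lower() in answer.lower() else "NOT mentioned"
        print(f"--- {model_label} ({flag}) ---\n{answer}")
```

The automated "mentioned / NOT mentioned" flag is only a rough first pass; you still need to read the answers to judge positioning, sentiment, and framing.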
We audited a mid-market cybersecurity company (anonymized). Here's what we asked and what each model said:
| Query Category | ChatGPT | Claude | Gemini |
|---|---|---|---|
| Direct identity | ✓ Correct — "cybersecurity platform" | ✗ Called them "IT staffing agency" | ✓ Correct but outdated product list |
| Category placement | ✗ Not listed in "best cybersecurity tools" | ✗ Not listed | ✗ Not listed |
| Buyer intent | ✗ Recommended 3 competitors | ✗ Recommended 2 different competitors | ✗ Didn't mention category |
| Sentiment probe | Mixed — cited outdated weaknesses | ✗ Described IT staffing strengths | Neutral, generic description |
| Competitor displacement | Competitor framed as "leader" | Couldn't compare — wrong category | Fair comparison, lacked depth |
Diagnosis: Misread on Claude (wrong category entirely), Phantom on category/intent queries across all models (present when asked directly, invisible when buyers are searching). Total time for manual audit: 45 minutes. This brand had no idea AI was misrepresenting them.
The root cause was clear: their Crunchbase profile still listed "IT staffing" from their pre-pivot days. Claude weighted this heavily. ChatGPT's retrieval pulled from their updated website and got it right, but the buyer intent and category queries relied on parametric knowledge — which was wrong everywhere.
Use the archetype framework as the diagnostic tool. After running the queries, the self-diagnosis takes 60 seconds:
| Signal | Urgency |
|---|---|
| AI doesn't mention you at all across most queries | High — every AI-assisted buying decision happens without you |
| AI mentions you but gets significant facts wrong | Highest — AI is actively steering buyers away with incorrect information |
| AI mentions you correctly but names competitors first | Medium — you're present but not preferred |
| AI describes you accurately and recommends you | Monitor — maintain your position |
Each AI model has two types of knowledge: parametric (baked in during training) and retrieved (fetched in real-time). The balance differs by model.
- ChatGPT relies more on retrieved data — OpenAI sent 10,816 crawler requests to our directory this week.
- Claude leans more on parametric knowledge — Anthropic sent 4,669 requests.
- Perplexity is retrieval-first — 1,699 requests, each answering a live user query.
A brand might score well on ChatGPT (which just retrieved your updated page) but poorly on Claude (working from older training data). This is why single-model testing gives a false sense of security.
| Model | Crawl Volume (our directory) | Knowledge Type | Update Speed | Implication |
|---|---|---|---|---|
| ChatGPT | 10,816 req/wk | Heavy parametric + retrieval | Fast (active crawling) | Most likely to have current data, but parametric memory can conflict with retrieved data |
| Claude | 4,669 req/wk | Parametric-heavy | Moderate | May lag behind on recent changes; strong on established brands |
| Perplexity | 1,699 req/wk | Retrieval-first | Real-time | Most accurate for current state, least affected by stale training data |
| Gemini | Varies | Mixed, Google-indexed | Tied to Google's index freshness | Leverages Google's index; strong where Google is strong |
The practical takeaway: if you only test one model and it looks good, you'll miss that another model has you completely wrong. ChatGPT's high crawl volume means it often has the most current data — but that also means a recent error on your site propagates to ChatGPT faster than to other models. Test at least 3 models for every query category. Read more about crawler behavior in our crawler data report.
The remediation framework, based on our data from 5,829 brands:
**1. Fix your structured data**
Add Organization schema with correct industry, product descriptions, and founding date. Guide →
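For illustration, a minimal Organization schema embedded as JSON-LD in your homepage's `<head>` might look like the snippet below. Every value is a placeholder; substitute your real name, description, founding date, and profile URLs.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "ExampleSecure",
  "url": "https://www.example.com",
  "description": "Cloud-native cybersecurity platform for mid-market companies.",
  "foundingDate": "2016",
  "sameAs": [
    "https://www.linkedin.com/company/examplesecure",
    "https://www.crunchbase.com/organization/examplesecure"
  ]
}
</script>
```

The sameAs links matter because they tie your site to the third-party profiles AI models also read, which feeds directly into the next step.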
**2. Align your authoritative sources**
Make sure your website, Crunchbase, LinkedIn, Wikipedia, and G2 all tell the same story. Guide →
**3. Create an llms.txt file**
The AI-specific equivalent of robots.txt — tells AI models what your brand is and does. Guide →
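The format is still an emerging convention: a short markdown file served at your domain root (e.g. /llms.txt) with a title, a one-paragraph summary, and links to your key pages. Treat the example below as a rough sketch with placeholder content, not a spec.

```
# ExampleSecure

> ExampleSecure is a cloud-native cybersecurity platform for mid-market companies.
> We build threat detection and compliance automation software; we are not an IT staffing agency.

## Products

- [Threat Detection](https://www.example.com/products/threat-detection): managed detection and response
- [Compliance Hub](https://www.example.com/products/compliance): SOC 2 and ISO 27001 automation

## Company

- [About](https://www.example.com/about): founding date, leadership, industry classification
```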
**4. Configure your robots.txt**
Ensure AI crawlers (GPTBot, ClaudeBot, PerplexityBot) are welcomed, not blocked. Guide →
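A minimal robots.txt stanza that explicitly allows the major AI crawlers might look like this. The user-agent strings are the ones OpenAI, Anthropic, and Perplexity publish for their crawlers; the disallowed path is just an example.

```
# Explicitly allow AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Everyone else: default rules, with an example private path
User-agent: *
Disallow: /internal/
```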
**5. Create content that corrects misclassifications**
If AI thinks you're in the wrong category, publish a definitive page that corrects the record. Guide →
If you have 1 brand, the manual method works. Run the 5 query categories across 3 models quarterly, track the changes in a spreadsheet, and you'll have a reasonable picture.
If you're managing multiple brands, need to track changes over time, or need to audit across 4+ models simultaneously — that's where tooling helps. Our free AI Visibility Checker runs the full methodology and gives you a BAI score in seconds.