The AI-readability checklist: 12 things to fix this week

A short, opinionated list of fixes that meaningfully improve how AI crawlers see your site. Most are boring. All matter.

By Gaurav Henry · Jul 3, 2026 · 10 min read

TL;DR. A 12-item AI-readability checklist for B2B sites: allow the AI bots in robots.txt, server-render high-intent pages, ship clean Organization / Product / FAQPage schema, build canonical competitor comparison pages, use semantic HTML for tables, tighten your entity description across the open web, build citation hooks, write the answer in the first paragraph, earn third-party citations, and monitor AI visibility weekly. Most are boring. All are doable in a sprint. None of them is a clever trick. The engines have started punishing clever tricks.

Most “AI optimization” advice is either too vague to be useful (be helpful, write good content) or too clever to be true (write Python prompts in your robots.txt to bribe ChatGPT). The actual work is a list of small unsexy fixes that nobody is rewarded for shipping. Those fixes are also the ones that, when Zaraftis runs them across a customer base of roughly 1,200 B2B brands, lift citation share more reliably than anything else.

Here are the twelve fixes Zaraftis keeps top of mind. None are theoretical. Each one has fixed a problem Zaraftis has seen in a real audit, often a problem the engineering team did not know existed. Work top to bottom. Most teams can clear the list in a week if they make it a priority.

Allow the AI bots in your `robots.txt`

The number one mistake, in roughly 18% of B2B sites Zaraftis audits, is that someone in 2023 added a blanket disallow for “AI scrapers” without realizing the disallow now blocks the bots that produce most of their AI visibility. The bots you actually want to allow are GPTBot (OpenAI), Google-Extended (Gemini), ClaudeBot and anthropic-ai (Claude), PerplexityBot, OAI-SearchBot, and Applebot-Extended. Open your robots.txt right now. If any of those are disallowed, that is your first action item.
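One way to run this check without reading the file by eye is Python's standard-library `urllib.robotparser`. The sketch below is a minimal illustration under stated assumptions: the sample robots.txt and the `example.com` URL are made up, the bot list is simply the one named above, and the standard-library matching rules may not be identical to how each crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in this article.
AI_BOTS = [
    "GPTBot", "Google-Extended", "ClaudeBot", "anthropic-ai",
    "PerplexityBot", "OAI-SearchBot", "Applebot-Extended",
]

def blocked_ai_bots(robots_txt: str, url: str) -> list[str]:
    """Return the AI bots that this robots.txt would stop from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, url)]

# Hypothetical file: a leftover 2023-style block on one bot, everything else allowed.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_ai_bots(SAMPLE, "https://example.com/pricing"))  # ['GPTBot']
```

To check a live site instead of a pasted string, `RobotFileParser.set_url()` followed by `read()` fetches and parses the file.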

Server-render your high-intent pages

If your pricing, comparison, and product pages are rendered client-side and only assembled after a JS bundle runs, the AI crawler is probably seeing a near-empty page. Some bots execute JS now. Most still prefer rendered HTML, and “prefer” in this context means “downrank everything else.” Use the View Source trick: load your highest-intent page, view source, and search for the most important sentence on the page. If the sentence is not in the HTML, fix that page. The other pages can wait.
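The View Source trick can be scripted. The sketch below is one possible version, not a crawler simulation: it takes raw HTML (what a non-JS client receives), drops `<script>` and `<style>` contents, and checks whether a key sentence is present. The two sample pages and the pricing sentence are invented for illustration.

```python
import re
from html.parser import HTMLParser

class _VisibleText(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)

def _norm(text: str) -> str:
    # Collapse whitespace and lowercase so formatting differences do not matter.
    return re.sub(r"\s+", " ", text).strip().lower()

def sentence_in_source(html: str, sentence: str) -> bool:
    """True if `sentence` appears in the text of the unrendered HTML."""
    extractor = _VisibleText()
    extractor.feed(html)
    extractor.close()
    return _norm(sentence) in _norm(" ".join(extractor.parts))

# Hypothetical pages: one server-rendered, one assembled by a JS bundle.
SERVER_RENDERED = "<html><body><h1>Pricing</h1><p>Plans start at $49 per seat per month.</p></body></html>"
CLIENT_ONLY = '<html><body><div id="root"></div><script src="/bundle.js"></script></body></html>'

KEY_SENTENCE = "Plans start at $49 per seat per month."
print(sentence_in_source(SERVER_RENDERED, KEY_SENTENCE))  # True
print(sentence_in_source(CLIENT_ONLY, KEY_SENTENCE))      # False
```

For a real page, the HTML string can come from `urllib.request.urlopen(url).read().decode()`, which does not execute JavaScript.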

Add a clean `Organization` schema with `sameAs`

Organization schema is what tells AI engines that the various names you go by are the same entity. The sameAs array should include your LinkedIn page, your Crunchbase profile, your Wikipedia entry if you have one, and any major directory listing. Zaraftis sees a lot of sites with no Organization schema at all, and a lot of sites with Organization schema that contradicts the visible page content. Either is bad. The fix is one block of JSON-LD in your site-wide head. Validate it.
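As a rough sketch of what that one block can look like, the Python below builds an `Organization` JSON-LD snippet. The property names (`name`, `url`, `description`, `sameAs`) are standard schema.org vocabulary; the company name, description, and profile URLs are placeholders, not real listings.

```python
import json

def organization_jsonld(name: str, url: str, description: str, same_as: list[str]) -> str:
    """Return a <script> tag containing Organization JSON-LD for the site-wide head."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,  # should match the visible page copy
        "sameAs": list(same_as),     # profiles that describe the same entity
    }
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'

# Placeholder values for illustration only.
ORG_SNIPPET = organization_jsonld(
    name="Example Co",
    url="https://example.com",
    description="Example Co is a workflow automation platform for B2B finance teams.",
    same_as=[
        "https://www.linkedin.com/company/example-co",
        "https://www.crunchbase.com/organization/example-co",
    ],
)
print(ORG_SNIPPET)
```

Generating the block from one source of truth makes it harder for the schema and the visible page content to drift apart.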

Put `Product` or `SoftwareApplication` schema on your product pages

If you sell software, Product / SoftwareApplication schema is where you put your name, description, category, pricing, and aggregate rating in machine-readable form. AI engines pull pricing from these blocks directly. The “starting at $X per seat per month” number that shows up in answers about your category is almost always coming from someone’s offers field. If your competitor has the field and you do not, your competitor is the one whose price gets quoted.
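A minimal sketch of such a block, again generated from Python: `SoftwareApplication`, `Offer`, and `AggregateRating` are real schema.org types, but the product name, price, and category here are invented. The rating is optional on purpose, so the field is only emitted when there are real reviews behind it.

```python
import json
from typing import Optional

def software_application_jsonld(
    name: str,
    description: str,
    category: str,
    price: str,
    currency: str,
    rating: Optional[tuple[float, int]] = None,  # (average rating, number of ratings)
) -> str:
    """Return SoftwareApplication JSON-LD with a machine-readable offers field."""
    data = {
        "@context": "https://schema.org",
        "@type": "SoftwareApplication",
        "name": name,
        "description": description,
        "applicationCategory": category,
        "offers": {"@type": "Offer", "price": price, "priceCurrency": currency},
    }
    if rating is not None:
        value, count = rating
        data["aggregateRating"] = {
            "@type": "AggregateRating",
            "ratingValue": value,
            "ratingCount": count,
        }
    return json.dumps(data, indent=2)

# Hypothetical product and price, for illustration only.
PRODUCT_JSONLD = software_application_jsonld(
    name="Example Sync",
    description="Workflow automation for B2B finance teams.",
    category="BusinessApplication",
    price="49.00",
    currency="USD",
)
print(PRODUCT_JSONLD)
```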

Use real `FAQPage` schema, sparingly

FAQPage schema is one of the most overused tools in SEO and one of the easiest ways to get punished by AI engines, which appear to be tuning trust away from generated FAQ blocks. Use FAQPage schema only on pages where you have real, buyer-asked questions, with answers a buyer would actually want. Three real FAQs beat fifteen synthetic ones. Zaraftis has watched citation share drop on brands who put FAQPage schema on every page indiscriminately.
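For the pages that do qualify, the markup itself is small. The sketch below builds `FAQPage` JSON-LD from a list of question and answer pairs and returns nothing when the list is empty, so a page without real questions gets no block at all. The helper name is hypothetical; the sample pair is taken from this article's own FAQ.

```python
import json
from typing import Optional

def faq_page_jsonld(pairs: list[tuple[str, str]]) -> Optional[str]:
    """Return FAQPage JSON-LD, or None when there are no real questions to mark up."""
    if not pairs:
        return None
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2)

REAL_FAQS = [
    (
        "Do AI engines actually execute JavaScript?",
        "Some do, some do not, and the ones that do still prefer server-rendered HTML.",
    ),
]
print(faq_page_jsonld(REAL_FAQS))
```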

Build one canonical comparison page per major competitor

“X vs Y” pages are still one of the most cited content formats inside Perplexity and ChatGPT, when they are well done. Well done means: a real table at the top, semantic HTML headers (use <th>, not <div> styled to look like one), explicit pricing, an honest assessment of where the competitor is better, and a clear summary at the bottom. The mediocre version of this page (a marketing-speak listicle that says you are best at everything) does not get cited. The honest version does.

Use semantic HTML on data-heavy pages

This sounds like 2008 advice and is not. AI engines pull comparison tables, pricing tables, and feature matrices into their answers very directly. They pull cleanly when the table is a real <table> with <thead> and <th>, and badly when the table is a stack of divs styled into a grid. The fix is one engineering ticket. The benefit shows up across every engine.
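A quick way to find the pages that need that ticket is to count real tables in the HTML. The sketch below uses the standard-library `html.parser` to report, for each `<table>`, whether it has a `<thead>` and how many `<th>` cells it contains. A div-based grid produces an empty report. The sample markup is invented, and the check says nothing about whether the headers are meaningful.

```python
from html.parser import HTMLParser

class TableAudit(HTMLParser):
    """Records one entry per <table>: whether it has <thead> and how many <th> cells."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one dict per <table> found
        self._open = []    # indexes of tables that are currently open (handles nesting)

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append({"has_thead": False, "th_count": 0})
            self._open.append(len(self.tables) - 1)
        elif self._open and tag == "thead":
            self.tables[self._open[-1]]["has_thead"] = True
        elif self._open and tag == "th":
            self.tables[self._open[-1]]["th_count"] += 1

    def handle_endtag(self, tag):
        if tag == "table" and self._open:
            self._open.pop()

def audit_tables(html: str) -> list[dict]:
    audit = TableAudit()
    audit.feed(html)
    audit.close()
    return audit.tables

# Hypothetical markup: a real table versus a grid of styled divs.
REAL_TABLE = (
    "<table><thead><tr><th>Plan</th><th>Price</th></tr></thead>"
    "<tbody><tr><td>Team</td><td>49</td></tr></tbody></table>"
)
DIV_GRID = '<div class="grid"><div class="row"><div>Plan</div><div>Price</div></div></div>'

print(audit_tables(REAL_TABLE))  # [{'has_thead': True, 'th_count': 2}]
print(audit_tables(DIV_GRID))    # []
```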

Tighten your entity description across the open web

Look up how your brand is described in five places: your own about page, your LinkedIn company page, your Crunchbase entry, your G2 / Capterra profile if you have one, and the first sentence of your most-cited press piece. If those five descriptions disagree, the AI engine is reading a fragmented entity, and the fragmentation is costing you citations. Pick one canonical sentence (10 to 25 words) and propagate it. Boring. Worth a sprint.
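The five-place comparison is easy to script once the descriptions are pasted in by hand. The sketch below checks that the canonical sentence is 10 to 25 words and lists the sources whose description does not contain it. Exact containment after normalization is a deliberately strict assumption, and the company and descriptions are made up.

```python
import re

def normalize(text: str) -> str:
    """Lowercase and drop punctuation so trivial differences do not count."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))

def word_count_ok(sentence: str, low: int = 10, high: int = 25) -> bool:
    return low <= len(sentence.split()) <= high

def inconsistent_sources(canonical: str, descriptions: dict[str, str]) -> list[str]:
    """Return the sources whose description does not include the canonical sentence."""
    target = normalize(canonical)
    return [src for src, text in descriptions.items() if target not in normalize(text)]

# Hypothetical brand and descriptions, pasted in manually from each profile.
CANONICAL = (
    "Example Co is a workflow automation platform that helps "
    "B2B finance teams close their books faster."
)
DESCRIPTIONS = {
    "about_page": CANONICAL,
    "linkedin": "Example Co is a workflow automation platform that helps B2B finance teams close their books faster!",
    "crunchbase": "Example Co builds AI tools for accountants.",
}

print(word_count_ok(CANONICAL))                       # True
print(inconsistent_sources(CANONICAL, DESCRIPTIONS))  # ['crunchbase']
```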

Build citation hooks: definitions, named frameworks, statistics

The single biggest content pattern Zaraftis sees AI engines latch onto is “this thing has a name and the name is on this page.” Definitions of category-specific terms, frameworks you have named (a “[Brand] [Method]” or similar), original statistics from your own data: each one gets cited because each one is easy to attribute. If you have a number, give the number a sentence of its own. If you have a framework, give the framework a name and a page. Each becomes a hook the model can latch onto. (More on what makes a passage attributable in How AI engines decide which brands to cite.)

Write the answer in the first paragraph

Old SEO content was structured to keep readers on the page (delay the answer, build anticipation). AI synthesis works differently. The model is going to chunk your page, embed those chunks, and retrieve the chunk that most directly answers the prompt. If your answer is in paragraph nine, paragraph nine has to do all the work. If your answer is in paragraph one, every chunk in the page benefits from the context. Lead with the answer. Add the nuance after.
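A toy illustration of the chunk-and-retrieve step, with loud caveats: real engines use learned embeddings, not word overlap, and the product, copy, and query below are invented. The sketch treats each paragraph as a chunk and scores it by how many query words it shares. It shows why a buried answer that leans on "it" scores worse than an answer-first paragraph that names the product.

```python
import re

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def chunk_scores(paragraphs: list[str], query: str) -> list[int]:
    """Score each paragraph (chunk) by the number of words it shares with the query.

    Word overlap is a crude stand-in for embedding similarity.
    """
    q = tokens(query)
    return [len(q & tokens(p)) for p in paragraphs]

QUERY = "what does acme sync cost per seat"

# Hypothetical page with the answer delayed to the last paragraph.
BURIED = [
    "Acme Sync was founded to fix slow data pipelines.",
    "Teams love the onboarding experience.",
    "It costs 49 dollars per seat per month.",
]

# Same facts, answer first, with the product named in the answer itself.
ANSWER_FIRST = [
    "Acme Sync costs 49 dollars per seat per month.",
    "It was founded to fix slow data pipelines.",
    "Teams love the onboarding experience.",
]

print(chunk_scores(BURIED, QUERY))        # [2, 0, 2]
print(chunk_scores(ANSWER_FIRST, QUERY))  # [4, 0, 0]
```

In the buried version the answer chunk ties with the company-history chunk, because on its own it never says what "it" is. In the answer-first version one chunk clearly wins.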

Get three to five real third-party reference mentions per quarter

Perplexity in particular strongly prefers sources that are not from the brand itself. If your only citation surface is your own domain, you are leaving Perplexity visibility on the floor. The fix is not “do PR.” The fix is to be in the structured listicles, comparison articles, and category roundups that buyers (and AI engines) reference. One well-placed listicle on a high-authority third-party site moves the needle more than ten guest posts.

Set up monitoring so you know when something breaks

The thing nobody warns you about: AI visibility can drop overnight. A schema change goes live and stops validating. A bot gets accidentally re-blocked in a robots.txt update. A retrained model decides it likes your competitor more this week. If you are not measuring, you find out three months later when pipeline has already softened. Track AI visibility, share of voice, citation share, and sentiment on a weekly cadence at minimum. (For the case that AI visibility is the number to track above all, see AI visibility is the new market share.)
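The technical half of that monitoring can be a small scheduled script. As one hedged example, the sketch below pulls every `application/ld+json` block out of a page and reports which `@type` values parse and how many blocks are broken. It only checks JSON syntax, not schema.org validity, and the sample page (including its deliberately broken block) is invented.

```python
import json
from html.parser import HTMLParser

class _JsonLdCollector(HTMLParser):
    """Collects the raw contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buf = None  # list of text pieces while inside a JSON-LD script

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._buf = []

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buf is not None:
            self.blocks.append("".join(self._buf))
            self._buf = None

def jsonld_report(html: str) -> dict:
    """Return the @type of each parseable JSON-LD block and a count of broken ones."""
    collector = _JsonLdCollector()
    collector.feed(html)
    collector.close()
    types, invalid = [], 0
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            invalid += 1
            continue
        items = data if isinstance(data, list) else [data]
        types.extend(item.get("@type") for item in items if isinstance(item, dict))
    return {"types": types, "invalid_blocks": invalid}

# Hypothetical page: one good block, one broken by a trailing comma.
PAGE = """
<html><head>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Organization", "name": "Example Co"}</script>
<script type="application/ld+json">{"@type": "Product",}</script>
</head><body></body></html>
"""

print(jsonld_report(PAGE))  # {'types': ['Organization'], 'invalid_blocks': 1}
```

Comparing this report against the previous run is enough to catch a schema block that silently stopped parsing after a deploy.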

Should you still write FAQ-style content for AI?

The conventional advice you will see in most “GEO checklist” articles is “write FAQ-style content for AI.” The advice made sense in 2024. In 2026, FAQ-block farming is one of the lowest-return things on the list, and overdoing it can actively hurt you. AI engines have started discounting heavily templated FAQ content, especially where the questions feel synthetic. Zaraftis has watched brands drop their FAQ block density and see citation share rise. The lesson is that the engines are getting better at telling the difference between content written for buyers and content written to game them. Anything that feels written-for-the-bot is going to age badly. Write for buyers, mark up the structure cleanly, and the bots will follow. (More on the shift from rank-thinking to answer-thinking in The end of keyword rankings.)

What is missing from this 12-item list?

This is a 12-item list because the goal is something a team can actually finish. The full Zaraftis audit is 50 points. The extra 38 points are mostly variations on these themes (deeper structured data validation, per-page accessibility checks, more granular schema choices, citation graph mapping, prompt-level coverage testing) plus a few that are highly specific to a vertical or stack. If you do these twelve well, you are ahead of roughly 80% of the B2B brands Zaraftis has audited. Catching the remaining 20% requires a sustained program, which is what generative engine optimization (GEO) actually is.

The honest punchline is that there is no clever trick. There is a list. The list is doable in a sprint. The brands that do the list look prescient in two quarters. The brands that wait for a more sophisticated playbook will get one, but the playbook will be an expensive one, and the playbook will start with these twelve items anyway.

Frequently asked questions about AI-readability

Q: Which AI bot is most important to allow in robots.txt?

A: There is no single most important bot, because different engines use different crawlers. Allow GPTBot (OpenAI / ChatGPT), Google-Extended (Gemini), ClaudeBot and anthropic-ai (Claude), PerplexityBot, OAI-SearchBot, and Applebot-Extended. Note that Google-Extended is a robots.txt control token for Gemini, not a separate crawler, and it does not govern inclusion in Google Search. Blocking any one of these can lock you out of the answers that engine produces.

Q: Do AI engines actually execute JavaScript?

A: Some do, some do not, and the ones that do still prefer server-rendered HTML. Treat client-only rendering on a pricing or product page as a citation-loss bug. The simplest check is to open View Source on your highest-intent page and search for the page’s most important sentence. If it is not there, the AI crawler is not seeing it.

Q: Should every page have FAQPage schema?

A: No. AI engines have started discounting templated FAQ blocks, and brands that strip indiscriminate FAQPage schema often see citation share rise. Use FAQPage only on pages with real buyer-asked questions and real answers. Three real FAQs beat fifteen synthetic ones.

Q: How do I know if my entity description is fragmented?

A: Look up how your brand is described in five places: your about page, your LinkedIn company page, your Crunchbase entry, your G2 / Capterra profile, and the first sentence of your most-cited press piece. If those five descriptions disagree, the AI engine is reading a fragmented entity. Pick one canonical 10-to-25-word sentence and propagate it.

Q: How often should I re-check this list?

A: Run the technical checks (robots.txt, server-render, schema validation) any time you ship a site-wide change to those layers. Re-audit the off-site entity layer (LinkedIn, Crunchbase, G2) once a quarter. Track AI visibility, share of voice, citation share, and sentiment on at least a weekly cadence. A regression can show up overnight.

Q: How long until these fixes show up in AI visibility numbers?

A: AI engines refresh their grounding more lazily than Google’s organic index. Expect visible movement 14 to 28 days after meaningful changes ship. robots.txt unblocks tend to surface faster (often inside the first week); schema and entity-description changes are slower.

How we know

The figures in this article come from the Zaraftis platform: a prompt-tracking and audit system that runs buyer-intent prompts against ChatGPT, Gemini, Perplexity, Claude, Google AI Overviews, AI Mode, Copilot, and Grok on a weekly cadence, alongside a 50-point AI-readability audit on each tracked site. Specific figures used above include the roughly 18% of B2B sites with misconfigured robots.txt, the citation-share lift observed when brands strip indiscriminate FAQPage schema, and the “80% of audited brands” benchmark for completing the 12-item list. Figures cover approximately 1,200 B2B brands tracked between November 2025 and May 2026.

Want the other 38 points?

Zaraftis runs a 50-point GEO audit on your site and produces a prioritized fix list, not a 90-page PDF. Your 7-day free trial includes a full scan and the first audit, so you see the gaps before you pay.
