Web crawlers operated by AI companies (OpenAI's GPTBot, Anthropic's ClaudeBot, Perplexity's PerplexityBot, etc.) that index web content for AI model training and for real-time retrieval.
AI crawler bots are user agents operated by AI companies to index web content for two purposes: (1) training future model versions, and (2) real-time retrieval to answer RAG-enabled queries. Major AI crawlers include OpenAI's GPTBot, Anthropic's ClaudeBot, Perplexity's PerplexityBot, Google-Extended (a robots.txt token governing Google's AI products), and Bytespider (ByteDance/TikTok). Some sites block AI crawlers via robots.txt; others allow them to preserve citation eligibility.
Blocking AI crawlers excludes your content from AI engine citations. Allowing them keeps your content indexed and citation-eligible. Most brands should explicitly allow AI crawlers in robots.txt to maximize AEO impact.
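A minimal sketch of what an allow-listing robots.txt might look like (the crawler names are the ones listed above; the exact policy depends on your site):

```
# Explicitly allow the major AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

# All other crawlers fall back to the default rules
User-agent: *
Allow: /
```

Placing the file at the site root (`/robots.txt`) is required; crawlers do not look anywhere else for it.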
A site's robots.txt explicitly allows GPTBot, ClaudeBot, PerplexityBot, and Google-Extended. Within weeks, the site's pages start appearing as citations in Perplexity, ChatGPT (with web browsing), and Google AI Overviews (note that AI Overviews eligibility follows standard Googlebot indexing rather than the Google-Extended token). A competitor blocks the same crawlers, so their content isn't cited even when relevant.
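To check how a given robots.txt treats these user agents before deploying it, Python's standard `urllib.robotparser` can parse the rules directly. This is a sketch with a hypothetical policy (GPTBot allowed, Bytespider blocked), not any particular site's real file:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow GPTBot, block Bytespider
robots_txt = """\
User-agent: GPTBot
Allow: /

User-agent: Bytespider
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Ask whether each crawler may fetch a given URL
print(parser.can_fetch("GPTBot", "https://example.com/pricing"))      # True
print(parser.can_fetch("Bytespider", "https://example.com/pricing"))  # False
```

Running this kind of check against your live `https://yourdomain.com/robots.txt` (via `parser.set_url(...)` and `parser.read()`) is a quick way to confirm a deploy didn't accidentally block a crawler you meant to allow.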
The terms in this glossary aren't theoretical — they're what Lantern's product calculates and reports every month for B2B SaaS teams. See yours in 7 days. 14-day free trial.