How AI systems access your website
Artificial intelligence tools such as ChatGPT, Gemini, Claude and Perplexity can access and analyse publicly available websites in different ways.
As AI search and AI-generated answers become more common, some businesses want to ensure their website is accessible to AI systems and can potentially appear within AI-generated responses.
This guide explains how AI systems access websites, how robots.txt works, what llms.txt is, and common misconceptions about AI website visibility.
How AI systems access websites
AI systems can access websites in several different ways.
Live website browsing
Some AI tools can actively browse websites in real time using their own crawlers or search integrations.
For example:
- OpenAI uses the `OAI-SearchBot` crawler
- Anthropic uses the `ClaudeBot` crawler
- Perplexity uses the `PerplexityBot` crawler
These crawlers work similarly to traditional search engine crawlers such as Googlebot or Bingbot.
Search engine integrations
Some AI tools rely partly on traditional search engines.
For example:
- AI systems may retrieve results from Bing or Google
- AI systems may summarise search engine results rather than directly crawling the website themselves
Previously indexed or trained information
Some AI models may answer questions using information previously indexed, cached or used during model training.
This means:
- A website may still appear in AI responses even if it is not currently being crawled
- A newly launched website may not yet appear in AI-generated responses
Understanding `robots.txt`
The robots.txt file is a standard file used to provide instructions to website crawlers.
It is available on your ShopWired website at https://yourwebsite.com/robots.txt
The file allows website owners to:
- Block crawlers from certain pages or directories
- Allow crawlers to access certain content
- Specify sitemap locations
Example:
User-agent: *
Disallow: /checkout/
Disallow: /account/
This example tells crawlers not to access the /checkout/ or /account/ areas of the website.
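If you want to confirm how a set of rules is likely to be interpreted, one option is to test them with Python's standard-library `urllib.robotparser` module. The sketch below is a minimal illustration rather than an official ShopWired tool: the `is_allowed` helper and the example URLs are assumptions made for this guide, and Python's parser may not match every crawler's own interpretation of the rules.

```python
from urllib.robotparser import RobotFileParser

# The example rules from this guide.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Disallow: /account/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules permit `user_agent` to fetch `url`.

    `is_allowed` is a hypothetical helper written for this example.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # yourwebsite.com is a placeholder domain.
    for path in ("/products/example", "/checkout/basket", "/account/orders"):
        url = "https://yourwebsite.com" + path
        print(path, is_allowed(ROBOTS_TXT, "ClaudeBot", url))
```

With these rules, product pages remain available to crawlers while anything under /checkout/ or /account/ is reported as disallowed.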
Does ShopWired block AI crawlers?
ShopWired does not block major AI crawlers from accessing storefront websites unless:
- You have manually configured your `robots.txt` file to block them
- Your website is password protected
- Your website or server configuration prevents crawler access
If your website is publicly accessible then AI crawlers can generally access it unless specifically blocked.
Why an AI tool may say it "cannot read" your website
Customers sometimes ask AI tools whether they can access a website and receive responses such as:
- "I cannot access this website"
- "I cannot read the robots.txt file"
- "This website is blocked"
This does not necessarily mean the website is actually blocked from AI crawlers.
Possible reasons include:
- The AI model/session does not currently have browsing enabled
- The AI tool failed to retrieve the website temporarily
- The AI misunderstood the prompt
- The AI hallucinated a technical explanation
- The crawler has not indexed the website recently
- The website returned a temporary error
For example, asking:
Can you read my website?
may produce a different response from:
Is my website publicly accessible to AI crawlers?
What is `llms.txt`?
llms.txt is an emerging proposal intended to help AI systems understand website content.
The proposed file may contain:
- Information about the website
- Important pages
- Documentation links
- Structured summaries
Example location: https://yourwebsite.com/llms.txt
However:
- `llms.txt` is not currently an official internet standard
- Most AI systems do not require it
- Having an `llms.txt` file does not guarantee visibility in AI responses
At present, llms.txt should be considered experimental rather than essential. ShopWired does not currently create an llms.txt file for your website.
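For illustration only, the sketch below builds a small llms.txt-style file in Python. It assumes the layout described in the public llms.txt proposal (a title, a short summary, then sections of links). The `render_llms_txt` function, the store name and the URLs are all hypothetical, and the file remains optional.

```python
def render_llms_txt(
    title: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render an llms.txt-style markdown document.

    Hypothetical helper: `sections` maps a heading to a list of
    (link text, URL, short note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# All names and URLs below are placeholders.
EXAMPLE = render_llms_txt(
    title="Example Store",
    summary="An online shop selling handmade candles.",
    sections={
        "Key pages": [
            ("Delivery information", "https://yourwebsite.com/delivery",
             "Shipping options and costs"),
            ("Returns policy", "https://yourwebsite.com/returns",
             "How to return an order"),
        ],
    },
)

if __name__ == "__main__":
    print(EXAMPLE)
```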
What improves AI visibility?
There is currently no guaranteed way to make a website appear in AI-generated answers.
However, the following factors are likely to help.
Public accessibility
Your website should:
- Be publicly accessible
- Not require logins for important content
- Not block crawlers unnecessarily
Strong technical SEO
Many AI systems rely partly on search engines and indexed content.
Good SEO practices remain important, including:
- Clear page titles
- Structured headings
- Fast page loading
- Mobile-friendly design
- Internal linking
- Sitemap generation
High quality content
AI systems are more likely to reference:
- Clear informational content
- Helpful articles
- Detailed product descriptions
- Frequently updated content
Brand mentions and authority
AI systems may place greater confidence in:
- Well-known brands
- Frequently referenced websites
- Websites with strong backlinks and authority
Should you manually edit your `robots.txt` file?
ShopWired advises against manually editing your robots.txt file unless you fully understand the impact of the changes. You should also be cautious about using AI tools to generate a robots.txt file for you, because the generated rules may not do what you intend.
Incorrect rules can accidentally:
- Block search engines
- Block AI crawlers
- Reduce SEO visibility
- Prevent pages from appearing in search results
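One way to reduce this risk is to test a draft file before publishing it. The sketch below assumes Python's standard-library `urllib.robotparser` and a handful of hypothetical example paths. It compares an intended rule with a common mistake, where `Disallow: /` blocks the whole website rather than a single directory.

```python
from urllib.robotparser import RobotFileParser

# Intended: keep crawlers out of the checkout only.
INTENDED_RULES = "User-agent: *\nDisallow: /checkout/\n"
# Mistake: a bare "/" matches every path on the website.
MISTAKEN_RULES = "User-agent: *\nDisallow: /\n"

# Placeholder paths chosen for this example.
PATHS = ["/", "/products/example", "/checkout/"]


def blocked_paths(robots_txt: str, paths: list[str],
                  user_agent: str = "Googlebot") -> list[str]:
    """Return the paths that `user_agent` may not fetch (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    base = "https://yourwebsite.com"  # placeholder domain
    return [p for p in paths if not parser.can_fetch(user_agent, base + p)]


if __name__ == "__main__":
    print("intended:", blocked_paths(INTENDED_RULES, PATHS))
    print("mistaken:", blocked_paths(MISTAKEN_RULES, PATHS))
```

If the second list contains pages you expect to appear in search results, the draft rules are broader than intended.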
How to check if your website is publicly accessible
You can usually check by:
- Opening the website in a private/incognito browser window
- Checking whether pages load without logging in
- Visiting `/robots.txt`
- Checking whether important pages are accessible publicly
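The same checks can be roughly automated by requesting a page without any login and looking at the HTTP status code. The sketch below uses only Python's standard library. The `fetch_status` and `describe_status` helpers, the user-agent string and the status groupings are assumptions made for this guide. A 200 response is a good sign but not proof, since a password page can also return 200.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Request `url` anonymously and return the final HTTP status code.

    Network failures (for example DNS errors) raise urllib.error.URLError.
    """
    request = urllib.request.Request(
        url, headers={"User-Agent": "accessibility-check/0.1"}  # made-up name
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses arrive as exceptions carrying the code.
        return error.code


def describe_status(status: int) -> str:
    """Map a status code to a rough, illustrative label."""
    if 200 <= status < 300:
        return "accessible"
    if status in (401, 403):
        return "restricted"
    if status == 404:
        return "not found"
    if status >= 500:
        return "server error"
    return "needs manual review"


if __name__ == "__main__":
    # Replace the placeholder domain with your own website.
    for path in ("/", "/robots.txt"):
        code = fetch_status("https://yourwebsite.com" + path)
        print(path, code, describe_status(code))
```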
Example AI crawler user-agents
Some known AI-related crawlers include:
| AI platform | Example crawler |
|---|---|
| ChatGPT/OpenAI | OAI-SearchBot |
| Anthropic Claude | ClaudeBot |
| Perplexity | PerplexityBot |
| Google Gemini | Google-Extended (a robots.txt control token rather than a separate crawler) |
These may change over time.
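To see which of these crawlers a given robots.txt file would allow, the rules can be checked once per user-agent. The sketch below is illustrative only: the sample rules, the `audit_ai_crawlers` helper and the URL are assumptions, and it relies on Python's standard-library `urllib.robotparser`, whose matching may differ from each crawler's own implementation.

```python
from urllib.robotparser import RobotFileParser

# Tokens taken from the table in this guide; they may change over time.
AI_CRAWLER_TOKENS = ["OAI-SearchBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]

# Sample rules for this example: one AI crawler is blocked entirely,
# every other crawler is only kept out of the checkout.
SAMPLE_ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /checkout/
"""


def audit_ai_crawlers(robots_txt: str, url: str,
                      tokens: list[str] = AI_CRAWLER_TOKENS) -> dict[str, bool]:
    """Return {token: allowed} for `url` (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {token: parser.can_fetch(token, url) for token in tokens}


if __name__ == "__main__":
    # yourwebsite.com is a placeholder domain.
    results = audit_ai_crawlers(SAMPLE_ROBOTS_TXT, "https://yourwebsite.com/")
    for token, allowed in results.items():
        print(f"{token}: {'allowed' if allowed else 'blocked'}")
```

In this sample only `ClaudeBot` is reported as blocked for the home page, because the other tokens fall back to the general `User-agent: *` rules.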
Frequently asked questions
Does allowing AI crawlers guarantee my business appears in ChatGPT?
No.
Allowing crawler access does not guarantee:
- AI visibility
- AI recommendations
- Inclusion in AI-generated answers
AI systems decide independently which sources to use.
Does ShopWired automatically support AI crawlers?
Yes. ShopWired storefront websites are publicly accessible by default, so AI crawlers can generally access them unless access has been restricted through configuration or custom rules.
Do I need an `llms.txt` file?
No, your website does not need an llms.txt file.
It is not required for normal AI crawler access.
Can AI systems ignore `robots.txt`?
Well-behaved crawlers generally respect robots.txt.
However, robots.txt is a voluntary standard rather than a technical enforcement system.
Does AI visibility replace SEO?
No.
Traditional SEO remains extremely important because many AI systems rely on:
- Search engine indexes
- Structured content
- Website authority signals
AI visibility should currently be considered an extension of SEO rather than a replacement for it.