Troubleshooting bot access issues
Overview
This guide helps you troubleshoot issues with search engine bots (like Googlebot, Bingbot, or Yahoo Slurp) not being able to access your website. If you’re relying on SEO (Search Engine Optimization) to improve your site’s visibility, ensuring bot access is critical.
Common causes
Bots are typically blocked for one of two reasons:
- robots.txt is denying access.
- The bot’s IP address is blocked by the server firewall (CSF).
1. Check “robots.txt” permissions
Start by reviewing your robots.txt file. It’s usually located at:
/home/USERNAME/public_html/robots.txt
View file contents
Open the file in a text editor:
vim /home/USERNAME/public_html/robots.txt
Look for entries like this:
User-agent: Googlebot
Disallow: /
If your robots.txt disallows bots you actually want, remove or modify those lines.
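On a long robots.txt it can be easier to scan for blanket rules than to read the file by eye. The following is a minimal sketch, not a full robots.txt parser: the function name find_blanket_disallows is hypothetical, and it only reports user agents whose group contains an exact "Disallow: /" line.

```shell
#!/usr/bin/env bash
# Print each User-agent whose group contains a blanket "Disallow: /".
# find_blanket_disallows is an illustrative helper, not a standard tool.
find_blanket_disallows() {
  local file="$1"
  awk '
    # Strip carriage returns and trailing comments from every line.
    { sub(/\r$/, ""); sub(/[ \t]*#.*$/, "") }

    # Collect consecutive User-agent lines into one group.
    tolower($0) ~ /^[ \t]*user-agent[ \t]*:/ {
      name = $0
      sub(/^[^:]*:[ \t]*/, "", name)
      sub(/[ \t]+$/, "", name)
      if (in_rules) { n = 0; in_rules = 0 }   # a new group starts here
      agents[++n] = name
      next
    }

    # A Disallow whose value is exactly "/" blocks the whole site.
    tolower($0) ~ /^[ \t]*disallow[ \t]*:[ \t]*\/[ \t]*$/ {
      for (i = 1; i <= n; i++) print agents[i]
    }

    # Any other non-blank line means the group has moved on to its rules.
    /[^ \t]/ { in_rules = 1 }
  ' "$file"
}
```

Usage: find_blanket_disallows /home/USERNAME/public_html/robots.txt. An entry of * in the output means every bot is blocked, including the ones you want.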
Check file permissions
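Ensure the file is world-readable, meaning the last permission triplet includes r (mode 644, for example). The ll alias is not defined on every system, so ls -l is the safer command:
ls -l /home/USERNAME/public_html/robots.txt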
Example output:
-rwxrwxr-x 1 user user 23 Aug 21 2012 /home/USERNAME/public_html/robots.txt
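The same check can be scripted rather than read from the listing. This is a sketch that assumes GNU stat (the -c option), as found on most Linux servers; is_world_readable is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Succeed if the "other" permission bits of a file include read.
# Assumes GNU stat; is_world_readable is an illustrative name.
is_world_readable() {
  local mode
  mode="$(stat -c '%a' -- "$1")" || return 2   # e.g. "644" or "775"
  # The last octal digit holds the "other" bits; read is the 4 bit.
  [ $(( ${mode: -1} & 4 )) -ne 0 ]
}
```

Usage: is_world_readable /home/USERNAME/public_html/robots.txt || echo "not world-readable". The web server also needs execute (traverse) permission on each parent directory, which this sketch does not check.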
Test access in browser
Visit:
https://example.com/robots.txt
If the file loads as plain text, bots should be able to read it. You can also ask your customer to test via Google Search Console.
If the file doesn’t load, recheck ownership and permissions.
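A browser test does not show how the server responds to a bot's user agent. As a rough sketch, the request can be repeated from the command line with curl while presenting Googlebot's published user-agent string. The function name fetch_status_as_bot is hypothetical, and because the request comes from your own IP address it only exercises user-agent-based rules, not IP-based blocks.

```shell
#!/usr/bin/env bash
# Print the HTTP status code a URL returns for a given user agent.
# fetch_status_as_bot is an illustrative helper; the default user agent
# is Googlebot's published string.
fetch_status_as_bot() {
  local url="$1"
  local ua="${2:-Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)}"
  # -s: silent, -o /dev/null: discard the body, -w: print only the status code
  curl -s -o /dev/null -w '%{http_code}\n' -A "$ua" "$url"
}
```

Usage: fetch_status_as_bot https://example.com/robots.txt. A 200 suggests the file is served normally to that user agent, while a 403 or 406 is a hint that a server-side rule is rejecting it.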
2. Check for IP blocks (CSF/ModSecurity)
When CSF is installed, its lfd daemon can block a bot’s IP address after repeated mod_security triggers.
Search deny lists
Run the following to check if common bots are blocked:
grep -F .googlebot.com /etc/csf/csf.deny /var/lib/csf/csf.tempban
grep -F .crawl.yahoo.net /etc/csf/csf.deny /var/lib/csf/csf.tempban
grep -F .search.msn.com /etc/csf/csf.deny /var/lib/csf/csf.tempban
Example result:
66.249.73.40 # lfd: (mod_security) mod_security triggered by 66.249.73.40...
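The searches above can be folded into one loop so that new bot domains or extra files are easy to add. This is a sketch only: find_bot_blocks is a hypothetical name, and the domain list is simply the three domains used in this guide.

```shell
#!/usr/bin/env bash
# Search each given file for the bot domains used in this guide.
# find_bot_blocks is an illustrative helper, not part of CSF.
find_bot_blocks() {
  local file domain
  for file in "$@"; do
    [ -r "$file" ] || continue            # skip files that do not exist
    for domain in .googlebot.com .crawl.yahoo.net .search.msn.com; do
      # -F: match the domain literally, -H: prefix matches with the file name
      grep -HF -- "$domain" "$file"
    done
  done
  return 0
}
```

Usage: find_bot_blocks /etc/csf/csf.deny /var/lib/csf/csf.tempban. Each match is printed with the file it came from, which tells you whether the block is permanent or temporary.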
Unblock the IP
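If you find a legitimate bot IP, remove it from the firewall. Use csf -tr for a temporary block (listed in csf.tempban) and csf -dr for a permanent entry in csf.deny:
csf -tr 66.249.73.40
csf -dr 66.249.73.40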
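Before running the unblock command above, it is worth confirming that the IP really belongs to the crawler, since user agents and log comments can be spoofed. A common approach is a reverse DNS lookup followed by a forward lookup that must return the same IP. The sketch below assumes the host utility is installed; both function names are hypothetical, and the accepted domains are just the three used in this guide.

```shell
#!/usr/bin/env bash
# Succeed if a hostname ends in one of the bot domains from this guide.
is_known_bot_host() {
  case "${1%.}" in                        # drop the trailing dot from DNS output
    *.googlebot.com|*.crawl.yahoo.net|*.search.msn.com) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeed if the IP reverse-resolves to a known bot domain and that
# name resolves back to the same IP. Requires the "host" utility.
verify_bot_ip() {
  local ip="$1" name
  # Reverse lookup: take the target from "... domain name pointer NAME."
  name="$(host "$ip" | awk '/domain name pointer/ { print $NF; exit }')"
  [ -n "$name" ] || return 1
  is_known_bot_host "$name" || return 1
  # Forward lookup: one of the returned addresses must equal the IP.
  host "$name" | awk -v ip="$ip" '$NF == ip { ok = 1 } END { exit !ok }'
}
```

Usage: verify_bot_ip 66.249.73.40 && echo "verified". If verification fails, the block may be legitimate and is better left in place.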
Prevent future blocks
Add trusted bot domains to the lfd reverse-DNS ignore list so that IPs resolving to them are not blocked again.
Edit /etc/csf/csf.rignore and add:
.googlebot.com
.crawl.yahoo.net
.search.msn.com
Then restart CSF and lfd so the change takes effect (csf.rignore is read by lfd):
csf -ra
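If this step is repeated across servers, appending entries by hand risks duplicates. A small sketch of an idempotent append follows; add_rignore_entry is a hypothetical helper, and it only adds a line when an identical line is not already present.

```shell
#!/usr/bin/env bash
# Append an entry to a file only if that exact line is not already there.
# add_rignore_entry is an illustrative helper, not part of CSF.
add_rignore_entry() {
  local file="$1" entry="$2"
  # -x: whole-line match, -F: literal string, -q: no output
  grep -qxF -- "$entry" "$file" 2>/dev/null || printf '%s\n' "$entry" >> "$file"
}
```

Usage: run add_rignore_entry /etc/csf/csf.rignore .googlebot.com once per domain, then restart the firewall as described above.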
Summary
This guide helps troubleshoot issues where legitimate bots (like Googlebot) can’t access a website, which can negatively impact SEO. It covers two main causes: blocks in the robots.txt file and IP blocks by CSF due to mod_security triggers. The guide includes steps to review permissions, test bot access, remove firewall blocks, and prevent future issues by whitelisting trusted bots.