Crawler IP Whitelisting in VergeCloud

Overview

Automated bots, often referred to as crawlers or spiders, are programs that systematically browse the web. Search engines, analytics platforms, AI services, and other online tools rely on these bots to index content, collect website performance metrics, and power data-driven services. While most crawlers are legitimate and beneficial, managing them carefully is essential to keep your website secure, maintain strong SEO performance, and avoid disruption from abusive or rogue traffic.

VergeCloud provides robust mechanisms to automatically handle known and trusted crawlers, enabling seamless integration with search engines and other services while giving you fine-grained control over crawler traffic.

How VergeCloud Handles Crawlers

By default, VergeCloud automatically whitelists IP addresses and user-agents of well-known, legitimate crawlers such as:
  1. Googlebot
  2. Bingbot
  3. Applebot
  4. Meta (Facebook) crawler
  5. OpenAI bots
  6. Yandex
This ensures that trusted bots can crawl and index your website without being blocked by security rules, avoiding interruptions to SEO performance and site analytics. VergeCloud verifies these bots using official sources wherever available. For crawlers without official IP lists, VergeCloud references the IP2Location database to maintain a reliable whitelist.
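
Independent of VergeCloud's automatic verification, you can spot-check a visitor that claims to be a known crawler yourself. The usual method, which Google documents for Googlebot, is a reverse DNS lookup on the visiting IP followed by a forward lookup on the returned hostname; the IP and hostname below are illustrative only.

# Reverse lookup: a genuine Googlebot IP resolves to a googlebot.com or google.com hostname
host 66.249.66.1

# Forward lookup: the hostname must resolve back to the same IP to confirm authenticity
host crawl-66-249-66-1.googlebot.com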

Key Advantages of Automatic Whitelisting:

  1. Validating crawler authenticity: Only genuine crawlers gain access, protecting your site from impersonation attacks.
  2. Enhancing security: Blocks suspicious or unknown bots while allowing legitimate traffic.
  3. Improving site visibility: Trusted search engine bots can index your content efficiently.
  4. Staying updated: Automatic updates of known bot IP ranges from official sources.

Why Managing Crawlers Matters

Even though crawlers are beneficial, unchecked or unknown bots can impact your website negatively:
  1. Security threats: Malicious bots may attempt brute force attacks, SQL injections, or data scraping.
  2. SEO risks: Fake or spam bots can distort analytics data and affect search engine rankings.
  3. Performance degradation: High-volume bot traffic can strain server resources, slowing down legitimate user access.
Managing crawler access allows you to balance security, performance, and SEO effectively.

Controlling Crawler Access Manually

For advanced users who want full control, VergeCloud allows the global whitelist to be disabled using the API. This enables administrators to manage all crawler traffic manually via firewall rules.

Warning
Disabling the global whitelist may prevent legitimate search engine bots from crawling your website, which could reduce visibility in search results. If you choose to do this, you must maintain an updated list of allowed crawlers and configure firewall rules accordingly.

Example API Call to Disable Global Whitelist

curl --location --request PATCH 'https://api.vergecloud.com/cdn/4.0/domains/example.com/firewall' \
--header 'Authorization: API_KEY' \
--header 'Content-Type: application/json' \
--data '{"skip_global_firewall": true}'

Creating Custom Rules for Crawlers

After disabling the global whitelist, you become responsible for explicitly allowing or blocking crawler traffic yourself, so that only trusted bots reach your content.
VergeCloud’s firewall lets you define custom rules for crawler management, selectively allowing or blocking crawler IP addresses or user agents based on your website’s needs.

Practical use cases include:
  1. Allowing only Googlebot to access certain pages while blocking other crawlers.
  2. Temporarily blocking all crawlers during maintenance windows.
  3. Restricting specific bots from sensitive endpoints while letting general traffic continue.
Firewall rules for crawlers can be built from IP addresses, ASNs, user agents, or a combination of these attributes. This granular control ensures that legitimate bots are not disrupted while rogue bot traffic is kept out.
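
As a sketch of what such a rule might look like via the API, the request below reuses the /firewall endpoint pattern shown earlier to allow only traffic matching Googlebot. The /rules path segment and the field names (action, match, asn, user_agent, note) are hypothetical placeholders, not taken from VergeCloud's documentation, so consult the firewall API reference for the actual schema; AS15169 is Google's autonomous system number.

# Hypothetical rule payload: allow requests whose ASN and user agent both indicate Googlebot
curl --location --request POST 'https://api.vergecloud.com/cdn/4.0/domains/example.com/firewall/rules' \
--header 'Authorization: API_KEY' \
--header 'Content-Type: application/json' \
--data '{"action": "allow", "match": {"asn": [15169], "user_agent": "Googlebot"}, "note": "Allow verified Googlebot traffic"}'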

Below are the supported crawler bots with their official IP verification sources:

Best Practices for Crawler Management

  1. Enable automatic whitelisting unless you have a specific reason to manage crawlers manually.
  2. Regularly review crawler traffic logs to detect anomalies or suspicious patterns.
  3. Combine IP, ASN, and user-agent checks for stronger validation.
  4. Use temporary blocks instead of permanent rules during maintenance or testing periods.
  5. Update firewall rules whenever official crawler IP ranges change to prevent accidental blocking (see the example after this list for pulling Google's published ranges).
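
For the last point, several crawler operators publish their IP ranges in machine-readable form. As one example, Google publishes Googlebot's ranges as JSON at the URL below; the quick check here simply lists the IPv4 prefixes so you can compare them against your firewall rules.

# Fetch Google's published Googlebot IP ranges and print the IPv4 prefixes
curl -s 'https://developers.google.com/search/apis/ipranges/googlebot.json' \
  | grep -o '"ipv4Prefix": *"[^"]*"'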

