
Industry News

AI Bots Are Now Half Your Website Traffic. Your Hosting Plan Was Not Built for That.

Ian O'Reilly · 8 min read

The most active visitor to your business website last month was not a customer. It was almost certainly a bot, and there is a better than even chance it was an AI bot training a large language model on your pricing page.

This is not a hypothetical. Cloudflare's Radar data for the end of 2025 put bot traffic at somewhere between 51 and 52 percent of the global web, with AI crawlers the fastest-growing slice. Their CEO told TechCrunch in March that bot traffic will overtake human traffic outright by 2027 [1]. Reviewing our own platform dashboards this morning, I can tell you those numbers are not abstract. They are the shape of what we see.

Most owners have no idea. Their hosting provider does not mention it. Their analytics dashboard does not show it cleanly. They keep paying. Their site keeps getting a bit slower. And the traffic "spike" they celebrated last month was not a marketing win. It was OpenAI scraping their product catalogue.

What the numbers actually look like

Cloudflare's year-end data is worth sitting with for a minute. AI bot traffic grew in the region of 187 percent year-on-year across 2025, while human traffic rose roughly 3 percent [1]. That is not an adjustment. That is a different internet.

Googlebot still leads the AI crawler pack at around 38 percent of AI-related bot traffic, with GPTBot at roughly 13 percent, Meta-ExternalAgent at 11 percent, and ClaudeBot at another 11 [2]. Applebot is climbing fast. And that is before you count the user-triggered fetchers (ChatGPT-User, Claude-SearchBot, PerplexityBot) that retrieve your content in real time every time a human asks an AI tool a question about your business.

Two things follow. Your hosting is handling more traffic than you thought. And the traffic you see in Google Analytics is not the traffic your server is actually serving.

The hidden bandwidth bill

Here is the part that gets overlooked. Bandwidth is not free. On cheaper hosting plans it is capped, throttled, or billed by the gigabyte past a soft limit. A crawler does not care about your overage charges.
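
To put a rough scale on it, here is a back-of-envelope calculation. Every figure below is hypothetical, chosen only to show the shape of the problem, not measured from any real site.

    # Hypothetical figures, for scale only.
    pages = 5_000                # product and category pages
    page_weight_mb = 1.5         # HTML plus uncached assets per fetch
    sweeps_per_month = 12        # one aggressive crawler revisiting often

    gb_per_month = pages * page_weight_mb * sweeps_per_month / 1024
    print(f"{gb_per_month:.0f} GB/month")  # roughly 88 GB before a single human visits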

Read the Docs published a particularly useful worked example in 2024. By blocking aggressive AI crawlers at their edge, they cut their daily outbound bandwidth by roughly 75 percent, from about 800 GB a day down to 200 GB, saving around 1,500 dollars a month [3]. A single crawler had downloaded 73 TB of zipped HTML in one month, ten of those terabytes in a single day, costing them more than 5,000 dollars before they noticed. That is a documentation site.
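
For a sense of what that kind of edge blocking looks like in practice, here is a minimal sketch on a plain Nginx setup. This is not Read the Docs' actual configuration; the user-agent tokens are the published ones for the major crawlers, and the map-and-return pattern is just one common approach.

    # Sketch only. Flags known AI training crawlers by user agent,
    # then refuses them before any origin work happens.
    # The map block lives in the http{} context.
    map $http_user_agent $ai_trainer {
        default               0;
        ~*GPTBot              1;
        ~*ClaudeBot           1;
        ~*Meta-ExternalAgent  1;
        ~*Bytespider          1;
    }

    server {
        # ... listen, server_name, root as usual ...
        if ($ai_trainer) {
            return 403;
        }
    }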

The alternative reality for a small business is quieter. You do not get a 5,000 dollar surprise. You get a site that runs a little slower every month, a hosting bill that creeps up at renewal, an overage email you did not expect, and a PageSpeed score that quietly drops because your server is spending cycles feeding ClaudeBot instead of the customer standing in a car park in Galway trying to check your opening hours on their phone. That customer does not send you an angry email. They just close the tab.

Your analytics are lying to you

This is the other half of the problem. Independent Analytics dug into how AI-driven traffic is attributed in Google Analytics 4 and found that over 70 percent of AI-related visits land in the Direct bucket, with the rest scattered across categories that were never designed for this [4].

In plain English, a chunk of what you think is people typing your URL into a browser is actually a bot fetching content for an AI answer engine.

Make marketing decisions off that and you are guessing. Your email campaign looks like it worked. Your social push looks quiet. In reality, GPTBot did a sweep of your site on Tuesday and the Direct line spiked. The numbers you are using to pick winners are corrupted at the source.
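
You cannot make GA4 tell you the truth here, but your access logs already know it. A minimal sketch, assuming the standard combined log format where the user agent is the last quoted field; the log path and the token list are illustrative, not exhaustive.

    import re
    from collections import Counter

    # Published user-agent tokens for the crawlers and fetchers
    # discussed above; extend to taste.
    AI_TOKENS = ("GPTBot", "ChatGPT-User", "ClaudeBot", "Claude-SearchBot",
                 "PerplexityBot", "Meta-ExternalAgent", "Applebot", "Googlebot")

    counts = Counter()
    with open("/var/log/nginx/access.log") as log:   # hypothetical path
        for line in log:
            quoted = re.findall(r'"([^"]*)"', line)
            ua = quoted[-1] if quoted else ""        # user agent is the last quoted field
            label = next((t for t in AI_TOKENS if t in ua), "human-or-other")
            counts[label] += 1

    for label, n in counts.most_common():
        print(f"{label:>18}  {n}")

Run that against a week of logs and compare the totals to your GA4 Direct channel. The gap is the part of your "audience" that was never human.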

Most of what reaches your origin server is no longer a person.

Why cheap hosting is the wrong tool for this era

I run operations. This is what I see every week across the platform.

On cheap shared hosting, every site on the server shares the same CPU, RAM, and disk I/O budget. A single aggressive crawler on your neighbour's site can slow yours down. A single aggressive crawler on your own site can get you throttled by a host that does not want to explain why. Most shared hosts have no crawler detection worth the name. They serve whatever is requested. They hope for the best.

That model was fine when bots were two or three percent of traffic. It is not fine when they are fifty.

Early last year I did not have a good enough read on how aggressive the new training bots had become. A site on our platform saw a tenfold jump in requests over a weekend and I assumed it was a marketing push going viral. It was GPTBot hammering a product catalogue. I changed how I read our dashboards that week.

What a serious setup actually delivers

First, the job described generically. Proper infrastructure for this era of the web does three things. It serves cached content cheaply, so crawler volume does not translate into origin cost. It gives your site real, not shared, resources. And it is transparent enough about what is hitting your origin that you can make a decision.

Web60's Irish-hosted managed WordPress infrastructure meets that brief without asking you to configure a thing. Nginx fronts PHP-FPM. Redis handles object caching. FastCGI page caching serves the same blog post to the thousandth bot request for effectively no origin work. Crawler hits mostly land on cache. Your site stays fast for the actual customer. Bandwidth does not disappear into AI training runs. The platform built to absorb the traffic mix the web now produces sits behind the 60-second setup, not sold as an add-on.
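
For the curious, the heart of that caching layer looks roughly like the sketch below. This is not Web60's production configuration, just a minimal illustration of FastCGI page caching in Nginx, where a repeated request for the same URL, bot or human, is answered from disk without waking PHP-FPM.

    # Minimal FastCGI page-cache sketch, not a production config.
    # The cache zone is declared in the http{} context.
    fastcgi_cache_path /var/cache/nginx levels=1:2
                       keys_zone=pages:50m inactive=60m max_size=1g;

    server {
        location ~ \.php$ {
            fastcgi_pass unix:/run/php/php-fpm.sock;  # socket path varies by distro
            include fastcgi_params;
            fastcgi_cache pages;
            fastcgi_cache_key "$scheme$request_method$host$request_uri";
            fastcgi_cache_valid 200 10m;
            # Surfaces HIT or MISS on every response so cache behaviour is visible.
            add_header X-Cache-Status $upstream_cache_status;
        }
    }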

A concession, genuinely meant. If you are a large publisher doing hundreds of millions of page views a month and your business model is news, enterprise bot management at the edge (Cloudflare Business, Akamai) is the right tier. That is a real product for a real problem. It is not most businesses.

Do you even want the bots there?

Cloudflare made the bigger call in July 2025. New domains on their network now block AI crawlers by default; owners must explicitly opt in to allow them [5]. They paired it with a Pay Per Crawl marketplace where publishers can charge AI companies per request. The direction of travel is clear: the era of AI companies taking content for free because they can technically reach it is closing.

There is an honest limitation here. You cannot block everything and still show up where you want to. Googlebot is how you get into Google search. ChatGPT-User, Claude-SearchBot and PerplexityBot are how your business shows up when younger customers ask an AI tool where to eat, who to hire, where to buy something. Blocking all of them is a bad strategy. Allowing all of them and paying for the privilege in bandwidth, slow pages and skewed analytics is not a better one. The operator decision is what you let in and what you charge, not an all-or-nothing switch.
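
In robots.txt terms, that decision looks something like the sketch below: training crawlers out, search and answer-engine fetchers in. Treat it as illustrative rather than a recommendation for every site, and remember that robots.txt is advisory. Well-behaved crawlers honour it; the ones that do not are exactly why edge blocking exists.

    # Illustrative policy, adjust to your own business.
    # Out: bulk model-training crawlers.
    User-agent: GPTBot
    Disallow: /

    User-agent: ClaudeBot
    Disallow: /

    User-agent: Meta-ExternalAgent
    Disallow: /

    # In: search and user-triggered AI fetchers that send you visibility.
    User-agent: ChatGPT-User
    Allow: /

    User-agent: Claude-SearchBot
    Allow: /

    User-agent: PerplexityBot
    Allow: /

    User-agent: *
    Allow: /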

This sits inside the broader shift AI has driven across WordPress infrastructure in 2026, and the fact that the AI website builder market crossed three billion dollars last year tells you how much training data these companies think they still need. The demand on your origin is not going down.

The upshot

You do not need to solve AI crawlers today. What you need is hosting that absorbs crawler load without you paying for it in surprise overages or slow pages, and an analytics view that separates the humans from the bots so your marketing decisions are based on something real. Once you have that, the rest becomes a strategic question rather than an emergency one.

Keep running the business. Let the platform handle the traffic mix the internet now produces.

Sources

Ian O'Reilly, Operations Director, Web60

Ian oversees Web60's hosting infrastructure and operations. Responsible for the uptime, security, and performance of every site on the platform, he writes about the operational reality of keeping Irish business websites fast, secure, and online around the clock.

