Why My Blog Suddenly Had 70,000 Daily "Visitors"
Cloudflare reported more than 70,000 daily visitors on my small programming blog. The graph was exciting for about five seconds, then suspicious.
Cloudflare analytics recently gave me a tiny emotional roller coaster.
First thought: "Wait, did one of my posts finally escape into the wild?"
Second thought: "Why does this traffic look like it was scheduled?"
My blog usually sits around 7,000 daily unique visitors, and a fair chunk of that is already bot traffic. Then, over two days, Cloudflare started reporting more than 70,000 unique visitors per day. The hourly numbers were strangely flat, around 3,600 per hour.
That shape mattered more than the total. Big numbers are exciting. Flat numbers usually have something to say.
The pattern that gave it away
The suspicious part was not only the volume. It was how neatly the traffic behaved.
- The jump happened almost instantly instead of building gradually.
- The hourly traffic stayed unusually even.
- Requests and bandwidth moved in a very steady, repeatable pattern.
- Threat detections spiked only occasionally, so much of the traffic still looked valid enough to pass normal filtering.
Real human traffic is messy. It follows time zones, work hours, social links, search spikes, and random attention. It does not usually wake up one night and continue in a straight line.
Once I looked at the graph as a shape instead of a number, the likely explanation became much less glamorous: this was almost certainly automated crawling.
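That "shape, not number" check can be made concrete. Here is a minimal sketch of the heuristic I was applying by eye: real audiences swing with time zones, so their hourly counts have a high coefficient of variation, while rate-controlled fetching stays flat. The 0.15 threshold and the sample numbers are my own illustrative choices, not anything Cloudflare exposes.

```python
from statistics import mean, stdev

def looks_rate_controlled(hourly_counts, cv_threshold=0.15):
    """Heuristic: automated fetching tends to stay flat hour over hour,
    so a low coefficient of variation (stddev / mean) is suspicious.
    The threshold is arbitrary; tune it against your own baseline."""
    cv = stdev(hourly_counts) / mean(hourly_counts)
    return cv < cv_threshold

# A flat crawl hovering around 3,600/hour vs. a human-looking
# day-night curve (both made up for illustration).
crawl = [3600, 3550, 3620, 3580, 3610, 3590, 3570, 3630]
humans = [120, 80, 60, 300, 900, 1400, 1100, 400]

print(looks_rate_controlled(crawl))   # True: suspiciously flat
print(looks_rate_controlled(humans))  # False: normal daily swing
```

This is a toy, but it captures why the hourly graph mattered more than the headline total.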
The detail panels made that feeling stronger. They did not show a normal reader profile slowly growing around a few popular articles. They showed a mostly HTML-heavy crawl, a strange client mix, and traffic spread across places that made more sense for infrastructure than for a sudden fan club.
First, the client HTTP version breakdown was very lopsided:
| Version | Requests (24h) | Percentage of Requests |
|---|---|---|
| HTTP/1.1 | 137,088 | 81.87% |
| HTTP/2 | 23,424 | 13.99% |
| HTTP/3 | 6,929 | 4.14% |
| HTTP/1.0 | 9 | 0.005% |
That does not prove anything on its own, but it is another little clue. Modern browsers happily use HTTP/2 or HTTP/3 when they can. A wave dominated by HTTP/1.1 feels more like automated clients, libraries, proxies, or mixed crawler infrastructure.
The content type breakdown was even clearer:
| Type | Requests (24h) | Percentage of Requests |
|---|---|---|
| html | 151,441 | 90.73% |
| empty | 6,977 | 4.18% |
| js | 3,428 | 2.05% |
| txt | 1,621 | 1% |
| css | 1,619 | 1% |
| unknown | 795 | 0.5% |
| json | 286 | 0.2% |
| webp | 267 | 0.2% |
| xml | 179 | 0.1% |
| svg | 167 | 0.1% |
| png | 92 | 0.06% |
| ico | 38 | 0.02% |
More than 90% of requests were HTML. That is not how a normal browser session feels on a web page with CSS, JavaScript, icons, images, and follow-up navigation. It is exactly what I would expect from something that mostly wants page text.
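The same table supports a quick back-of-the-envelope check. A real browser loading one article also pulls its CSS, JavaScript, and images, so asset requests should be a meaningful multiple of HTML requests, not a rounding error. The counts below are read straight off the panel above; the "assets per page" ratio is my own derived metric.

```python
# 24h content-type request counts from the Cloudflare panel above.
by_type = {
    "html": 151_441, "empty": 6_977, "js": 3_428, "txt": 1_621,
    "css": 1_619, "unknown": 795, "json": 286, "webp": 267,
    "xml": 179, "svg": 167, "png": 92, "ico": 38,
}

total = sum(by_type.values())
html_share = by_type["html"] / total

# Browsers fetch page assets alongside each document; crawlers mostly don't.
asset_types = ("js", "css", "webp", "png", "svg", "ico")
assets_per_page = sum(by_type[t] for t in asset_types) / by_type["html"]

print(f"HTML share: {html_share:.1%}")                    # ~90.7%
print(f"Asset requests per HTML page: {assets_per_page:.3f}")
```

Fewer than 0.04 asset requests per HTML fetch is nowhere near what browser sessions produce; it is what text-hungry crawlers produce.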
The top source IPs also showed repeated request volume from individual addresses, but not enough to explain the whole wave by themselves:
| Source IP | Country | Requests (24h) |
|---|---|---|
| 216.73.217.52 | United States | 4.69k |
| 195.178.110.199 | Netherlands | 2.48k |
| 45.148.10.249 | Netherlands | 1.94k |
| 185.177.72.16 | France | 1.32k |
| 149.88.23.79 | Singapore | 1.29k |
That mix points toward distributed activity. Some IPs are clearly busy, but the traffic is not coming from one loud address that can be blamed and blocked with a dramatic flourish.
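"Not enough to explain the whole wave" is easy to quantify. Summing the top five IPs against the roughly 167k total requests (my sum of the content-type table above; the per-IP counts are rounded from the panel):

```python
# The five busiest source IPs from the panel above.
top_ips = {
    "216.73.217.52": 4_690,   # United States
    "195.178.110.199": 2_480, # Netherlands
    "45.148.10.249": 1_940,   # Netherlands
    "185.177.72.16": 1_320,   # France
    "149.88.23.79": 1_290,    # Singapore
}

total_requests = 166_910  # summed from the content-type breakdown
top_share = sum(top_ips.values()) / total_requests
print(f"Top 5 IPs explain {top_share:.1%} of requests")  # ~7.0%
```

When the five loudest addresses cover about 7% of the traffic, the remaining 93% has to be spread thinly across a large pool, which is exactly what distributed crawling looks like.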
The operating system view was also odd:
| OS | Requests (24h) |
|---|---|
| MacOSX | 81.68k |
| Unknown/Others | 46.65k |
| Windows | 35.82k |
| Linux | 1.5k |
| iOS | 622 |
A huge MacOSX share can be real in some audiences, but combined with the other signals it looks more like user-agent shaping than a real room full of Mac users suddenly deciding this blog was the place to be.
The country breakdown had the same feeling:
| Country | Requests (24h) |
|---|---|
| United States | 38.36k |
| Vietnam | 28.08k |
| Singapore | 20.41k |
| France | 7.81k |
| Netherlands | 6.95k |
The spread is not impossible for real traffic, but it fits crawler infrastructure nicely. The United States, Singapore, France, and the Netherlands are all common places to see cloud, proxy, and hosting traffic. Vietnam also makes sense as normal local audience mixed into the noise.
The crawler-specific view had the same energy: AI crawlers were having a very busy day. Meta's external agent alone transferred 135.6 MB across 8.41k allowed requests, with Amazonbot and ClaudeBot also making thousands of allowed requests. That is not enormous internet-wide, but for a small static blog it is a lot of quiet page fetching.
| Crawler | Category | Bytes Transferred | Allowed Requests | Unsuccessful |
|---|---|---|---|---|
| Meta-ExternalAgent | Meta AI Crawler | 135.6 MB | 8.41k | 3 |
| Amazonbot | Amazon AI Crawler | 28.55 MB | 6.91k | 4 |
| ClaudeBot | Anthropic AI Crawler | 23.09 MB | 5.38k | 3 |
| Applebot | Apple AI Search | 4.56 MB | 783 | 6 |
| Bytespider | ByteDance AI Crawler | 2.34 MB | 166 | 7 |
| BingBot | Microsoft Search Crawler | 937.89 kB | 79 | 19 |
| ChatGPT-User | OpenAI AI Assistant | 334.96 kB | 39 | 4 |
| GPTBot | OpenAI AI Crawler | 288.18 kB | 24 | 10 |
| PetalBot | Huawei AI Crawler | 839.66 kB | 19 | 0 |
| TikTok Spider | ByteDance AI Crawler | 98.22 kB | 17 | 0 |
| OAI-SearchBot | OpenAI AI Search | 168.8 kB | 16 | 3 |
| Googlebot | Google Search Crawler | 140.51 kB | 12 | 2 |
| Perplexity-User | Perplexity AI Assistant | 93.87 kB | 8 | 9 |
| PerplexityBot | Perplexity AI Search | 63.98 kB | 8 | 9 |
| CCBot | Common Crawl AI Crawler | 47.8 kB | 8 | 7 |
| Claude-SearchBot | Anthropic AI Crawler | 33.57 kB | 8 | 7 |
| Google-CloudVertexBot | Google AI Crawler | 13.39 kB | 3 | 3 |
Several entries transferred only a few kilobytes or made only a handful of requests, so this was not one single AI bot explaining the whole spike. The important part was the pattern: many automated systems were active on the same day, and the heaviest ones were exactly the kind that want clean HTML text.
Why it did not look like real users
If a real audience surge caused the spike, I would expect at least a few human-looking clues:
- Referral spikes from search, social media, or a forum thread.
- Uneven traffic by hour.
- More variation in browsing depth.
- A few hot pages pulling most of the load.
Instead, the traffic looked rate-controlled. Something appeared to be fetching pages at a fixed pace, with occasional bursts layered on top. That fits crawlers, scrapers, or distributed ingestion jobs much better than people reading blog posts during lunch.
Another important detail: Cloudflare's unique visitor metric does not mean "humans who enjoyed my writing". If bots rotate IPs aggressively, the unique count can grow very quickly while the behavior is still fully automated.
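One more back-of-the-envelope number makes the point. Dividing total requests by reported uniques (assuming both cover the same 24h window) gives the average "session depth" per visitor:

```python
# ~167k requests over 24h (summed from the content-type table)
# against ~70k reported unique visitors in the same window.
requests_24h = 166_910
unique_visitors = 70_000

requests_per_visitor = requests_24h / unique_visitors
print(f"{requests_per_visitor:.1f} requests per unique visitor")  # ~2.4
```

A real reader who opens even one article triggers requests for HTML plus CSS, JavaScript, and icons, so 2.4 requests per "visitor" is implausibly shallow for humans. It fits a pool of rotating IPs each making a couple of bare HTML fetches.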
The most likely explanation
The simplest explanation is still the best one: a crawler, or several crawlers, found the site and started sweeping it.
These were the most plausible possibilities:
- Large-scale bot crawling from search, SEO, or AI-related systems.
- Content scraping from systems that mirror programming content.
- Multiple independent crawlers overlapping and re-fetching the same pages.
- Light probing or scanning mixed into otherwise ordinary scraping traffic.
The site is a good crawler target. It is public, static, easy to parse, and full of structured technical content with headings, code blocks, and predictable URLs. If a system wants clean text without doing much JavaScript work, a site like this is convenient.
Why a 100-page blog can still generate hundreds of thousands of requests
This was the part that made me pause.
My site has roughly 100 pages. A well-behaved crawler should be able to fetch the whole thing in a few hundred requests, maybe a little more if it revisits some pages. Instead, the request count had climbed into the hundreds of thousands.
That only makes sense if the crawlers are not treating each article as one stable URL.
The biggest multiplier is query-string abuse. A page such as /post-1 can quickly turn into a long list of distinct URLs from the crawler's point of view:
/post-1?t=1
/post-1?t=2
/post-1?t=3

If the cache key treats those as separate requests, a small site suddenly becomes a huge URL space. Add multiple crawlers, retries, distributed IPs, and weak deduplication, and the numbers stop looking so mysterious.
The blog was still small. The request space was not.
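The fix for this multiplier is to collapse query-string variants back into one cache key. Cloudflare can do this at the edge (cache rules that ignore the query string), but the idea is simple enough to sketch generically. The `cache_key` function here is my own illustration, not Cloudflare's implementation, and dropping the query is only safe on a site where the query never changes the rendered page:

```python
from urllib.parse import urlsplit, urlunsplit

def cache_key(url: str) -> str:
    """Collapse query-string variants of the same article into one
    cache key by dropping the query entirely. Safe for a static blog
    where ?t=... never changes the page; NOT safe where the query
    actually selects content."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

variants = [f"/post-1?t={n}" for n in range(1, 1001)]
print(len(set(variants)))                     # 1000 distinct URLs...
print(len({cache_key(u) for u in variants}))  # ...one origin fetch
```

One article, a thousand crawler-visible URLs, one cache entry. That is the whole trick.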
What actually mattered
The good news: this did not look like a destructive attack.
The more practical risk was cache bypass. If bots keep requesting HTML pages with different query strings and those variations miss cache, the crawler traffic stops being cheap CDN noise and starts creating unnecessary origin work.
That shifted the question from "Who are these visitors?" to "How do I make this traffic boring?"
For a static blog, boring is the goal.
How I would verify the diagnosis
If you see a similar spike on a static site, I would start with these checks in Cloudflare:
- Bot and human breakdown, if your plan exposes that data.
- Top countries and ASNs, especially data-center-heavy patterns.
- Top user agents, including empty or obviously synthetic ones.
- Whether the same paths keep appearing with many query-string variants.
- Cache hit ratio for HTML traffic.
- Whether one page is genuinely hot or the whole site is being crawled evenly.
The cleanest confirmation would be request logs showing repeated fetches of the same paths with changing query strings. At that point, the mystery is mostly over. You are probably not watching a viral moment. You are watching automated traffic manufacture new URLs faster than the site deserves.
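If you can export request URLs (Cloudflare Logpush, or plain origin access logs), that last check is a few lines of grouping. This sketch assumes you already have the URLs as strings; the threshold of 10 variants is an arbitrary starting point:

```python
from collections import defaultdict
from urllib.parse import urlsplit

def query_variant_report(requested_urls, threshold=10):
    """Group requested URLs by path and count distinct query strings.
    Paths with many variants are the cache-bypass suspects."""
    variants = defaultdict(set)
    for url in requested_urls:
        parts = urlsplit(url)
        variants[parts.path].add(parts.query)
    return {path: len(qs) for path, qs in variants.items()
            if len(qs) >= threshold}

# Toy log: one article hammered with synthetic query strings,
# plus two ordinary requests.
logs = [f"/post-1?t={n}" for n in range(50)] + ["/about", "/post-2?ref=hn"]
print(query_variant_report(logs))  # {'/post-1': 50}
```

A non-empty report on a static blog is about as close to a smoking gun as this kind of investigation gets.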
What I took away from it
The weirdest part of this episode was how flattering the graph looked before it looked suspicious.
A sudden 10x spike is the sort of thing every personal site owner wants to believe for a moment. But the hourly shape told the real story almost immediately. Perfectly even growth is usually not audience growth. It is workload.
That turned out to be a useful reminder:
- Public technical content attracts crawlers because it is valuable and easy to process.
- "Unique visitors" in analytics can be a very optimistic label.
- For static sites, caching strategy matters more than guessing which bot is having a busy week.
If the origin is protected and HTML caching is configured properly, this kind of traffic is mostly an operational annoyance rather than a crisis.
Annoying is manageable. Exciting would have been nicer, of course, but manageable still wins.