Why My Blog Suddenly Had 70,000 Daily "Visitors"
Cloudflare reported more than 70,000 daily visitors on my small programming blog. The graph was exciting for about five seconds, then suspicious.
Cloudflare analytics recently gave me a tiny emotional roller coaster.
First thought: "Wait, did one of my posts finally escape into the wild?"
Second thought: "Why does this traffic look like it was scheduled?"
My blog usually sits around 7,000 daily unique visitors, and a fair chunk of that is already bot traffic. Then, over two days, Cloudflare started reporting more than 70,000 unique visitors per day. The hourly numbers were strangely flat, around 3,600 per hour.
That shape mattered more than the total. Big numbers are exciting. Flat numbers usually have something to say.
The pattern that gave it away
The suspicious part was not only the volume. It was how neatly the traffic behaved.
- The jump happened almost instantly instead of building gradually.
- The hourly traffic stayed unusually even.
- Requests and bandwidth moved in a very steady, repeatable pattern.
- Threat detections spiked only occasionally, so much of the traffic still looked valid enough to pass normal filtering.
Real human traffic is messy. It follows time zones, work hours, social links, search spikes, and random attention. It does not usually wake up one night and continue in a straight line.
Once I looked at the graph as a shape instead of a number, the likely explanation became much less glamorous: this was almost certainly automated crawling.
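That "shape, not number" check can be made concrete. Here is a minimal sketch of the heuristic I was applying by eye: real audiences swing with time zones, so their hourly counts have a high coefficient of variation, while rate-controlled fetching stays flat. The 0.15 threshold and the sample numbers are my own illustrative choices, not anything Cloudflare exposes.

```python
from statistics import mean, stdev

def looks_rate_controlled(hourly_counts, cv_threshold=0.15):
    """Heuristic: automated fetching tends to stay flat hour over hour,
    so a low coefficient of variation (stddev / mean) is suspicious.
    The threshold is arbitrary; tune it against your own baseline."""
    cv = stdev(hourly_counts) / mean(hourly_counts)
    return cv < cv_threshold

# A flat crawl hovering around 3,600/hour vs. a human-looking
# day-night curve (both made up for illustration).
crawl = [3600, 3550, 3620, 3580, 3610, 3590, 3570, 3630]
humans = [120, 80, 60, 300, 900, 1400, 1100, 400]

print(looks_rate_controlled(crawl))   # True: suspiciously flat
print(looks_rate_controlled(humans))  # False: normal daily swing
```

This is a toy, but it captures why the hourly graph mattered more than the headline total.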
The detail panels made that feeling stronger. They did not show a normal reader profile slowly growing around a few popular articles. They showed a mostly HTML-heavy crawl, a strange client mix, and traffic spread across places that made more sense for infrastructure than for a sudden fan club.
First, the client HTTP version breakdown was very lopsided:
| Version | Requests (24h) | Percentage of Requests |
|---|---|---|
| HTTP/1.1 | 137,088 | 81.87% |
| HTTP/2 | 23,424 | 13.99% |
| HTTP/3 | 6,929 | 4.14% |
| HTTP/1.0 | 9 | 0.005% |
That does not prove anything on its own, but it is another little clue. Modern browsers happily use HTTP/2 or HTTP/3 when they can. A wave dominated by HTTP/1.1 feels more like automated clients, libraries, proxies, or mixed crawler infrastructure.
The content type breakdown was even clearer:
| Type | Requests (24h) | Percentage of Requests |
|---|---|---|
| html | 151,441 | 90.73% |
| empty | 6,977 | 4.18% |
| js | 3,428 | 2.05% |
| txt | 1,621 | 1% |
| css | 1,619 | 1% |
| unknown | 795 | 0.5% |
| json | 286 | 0.2% |
| webp | 267 | 0.2% |
| xml | 179 | 0.1% |
| svg | 167 | 0.1% |
| png | 92 | 0.06% |
| ico | 38 | 0.02% |
More than 90% of requests were HTML. That is not how a normal browser session feels on a web page with CSS, JavaScript, icons, images, and follow-up navigation. It is exactly what I would expect from something that mostly wants page text.
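The same table supports a quick back-of-the-envelope check. A real browser loading one article also pulls its CSS, JavaScript, and images, so asset requests should be a meaningful multiple of HTML requests, not a rounding error. The counts below are read straight off the panel above; the "assets per page" ratio is my own derived metric.

```python
# 24h content-type request counts from the Cloudflare panel above.
by_type = {
    "html": 151_441, "empty": 6_977, "js": 3_428, "txt": 1_621,
    "css": 1_619, "unknown": 795, "json": 286, "webp": 267,
    "xml": 179, "svg": 167, "png": 92, "ico": 38,
}

total = sum(by_type.values())
html_share = by_type["html"] / total

# Browsers fetch page assets alongside each document; crawlers mostly don't.
asset_types = ("js", "css", "webp", "png", "svg", "ico")
assets_per_page = sum(by_type[t] for t in asset_types) / by_type["html"]

print(f"HTML share: {html_share:.1%}")                    # ~90.7%
print(f"Asset requests per HTML page: {assets_per_page:.3f}")
```

Fewer than 0.04 asset requests per HTML fetch is nowhere near what browser sessions produce; it is what text-hungry crawlers produce.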
The top source IPs also showed repeated request volume from individual addresses, but not enough to explain the whole wave by themselves:
| Source IP | Country | Requests (24h) |
|---|---|---|
| 216.73.217.52 | United States | 4.69k |
| 195.178.110.199 | Netherlands | 2.48k |
| 45.148.10.249 | Netherlands | 1.94k |
| 185.177.72.16 | France | 1.32k |
| 149.88.23.79 | Singapore | 1.29k |
That mix points toward distributed activity. Some IPs are clearly busy, but the traffic is not coming from one loud address that can be blamed and blocked with a dramatic flourish.
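"Not enough to explain the whole wave" is easy to quantify. Summing the top five IPs against the roughly 167k total requests (my sum of the content-type table above; the per-IP counts are rounded from the panel):

```python
# The five busiest source IPs from the panel above.
top_ips = {
    "216.73.217.52": 4_690,   # United States
    "195.178.110.199": 2_480, # Netherlands
    "45.148.10.249": 1_940,   # Netherlands
    "185.177.72.16": 1_320,   # France
    "149.88.23.79": 1_290,    # Singapore
}

total_requests = 166_910  # summed from the content-type breakdown
top_share = sum(top_ips.values()) / total_requests
print(f"Top 5 IPs explain {top_share:.1%} of requests")  # ~7.0%
```

When the five loudest addresses cover about 7% of the traffic, the remaining 93% has to be spread thinly across a large pool, which is exactly what distributed crawling looks like.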
The operating system view was also odd:
| OS | Requests (24h) |
|---|---|
| MacOSX | 81.68k |
| Unknown/Others | 46.65k |
| Windows | 35.82k |
| Linux | 1.5k |
| iOS | 622 |
A huge MacOSX share can be real in some audiences, but combined with the other signals it looks more like user-agent shaping than a real room full of Mac users suddenly deciding this blog was the place to be.
The country breakdown had the same feeling:
| Country | Requests (24h) |
|---|---|
| United States | 38.36k |
| Vietnam | 28.08k |
| Singapore | 20.41k |
| France | 7.81k |
| Netherlands | 6.95k |
The spread is not impossible for real traffic, but it fits crawler infrastructure nicely. The United States, Singapore, France, and the Netherlands are all common places to see cloud, proxy, and hosting traffic. Vietnam also makes sense as normal local audience mixed into the noise.
The crawler-specific view had the same energy: AI crawlers were having a very busy day. Meta's external agent alone transferred 135.6 MB across 8.41k allowed requests, with Amazonbot and ClaudeBot also making thousands of allowed requests. That is not enormous internet-wide, but for a small static blog it is a lot of quiet page fetching.
| Crawler | Category | Bytes Transferred | Allowed Requests | Unsuccessful |
|---|---|---|---|---|
| Meta-ExternalAgent | Meta AI Crawler | 135.6 MB | 8.41k | 3 |
| Amazonbot | Amazon AI Crawler | 28.55 MB | 6.91k | 4 |
| ClaudeBot | Anthropic AI Crawler | 23.09 MB | 5.38k | 3 |
| Applebot | Apple AI Search | 4.56 MB | 783 | 6 |
| Bytespider | ByteDance AI Crawler | 2.34 MB | 166 | 7 |
| BingBot | Microsoft Search Crawler | 937.89 kB | 79 | 19 |
| ChatGPT-User | OpenAI AI Assistant | 334.96 kB | 39 | 4 |
| GPTBot | OpenAI AI Crawler | 288.18 kB | 24 | 10 |
| PetalBot | Huawei AI Crawler | 839.66 kB | 19 | 0 |
| TikTok Spider | ByteDance AI Crawler | 98.22 kB | 17 | 0 |
| OAI-SearchBot | OpenAI AI Search | 168.8 kB | 16 | 3 |
| Googlebot | Google Search Crawler | 140.51 kB | 12 | 2 |
| Perplexity-User | Perplexity AI Assistant | 93.87 kB | 8 | 9 |
| PerplexityBot | Perplexity AI Search | 63.98 kB | 8 | 9 |
| CCBot | Common Crawl AI Crawler | 47.8 kB | 8 | 7 |
| Claude-SearchBot | Anthropic AI Crawler | 33.57 kB | 8 | 7 |
| Google-CloudVertexBot | Google AI Crawler | 13.39 kB | 3 | 3 |
Several entries transferred only a few kilobytes or made only a handful of requests, so this was not one single AI bot explaining the whole spike. The important part was the pattern: many automated systems were active on the same day, and the heaviest ones were exactly the kind that want clean HTML text.
Why it did not look like real users
If a real audience surge caused the spike, I would expect at least a few human-looking clues:
- Referral spikes from search, social media, or a forum thread.
- Uneven traffic by hour.
- More variation in browsing depth.
- A few hot pages pulling most of the load.
Instead, the traffic looked rate-controlled. Something appeared to be fetching pages at a fixed pace, with occasional bursts layered on top. That fits crawlers, scrapers, or distributed ingestion jobs much better than people reading blog posts during lunch.
Another important detail: Cloudflare's unique visitor metric does not mean "humans who enjoyed my writing". If bots rotate IPs aggressively, the unique count can grow very quickly while the behavior is still fully automated.
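One more back-of-the-envelope number makes the point. Dividing total requests by reported uniques (assuming both cover the same 24h window) gives the average "session depth" per visitor:

```python
# ~167k requests over 24h (summed from the content-type table)
# against ~70k reported unique visitors in the same window.
requests_24h = 166_910
unique_visitors = 70_000

requests_per_visitor = requests_24h / unique_visitors
print(f"{requests_per_visitor:.1f} requests per unique visitor")  # ~2.4
```

A real reader who opens even one article triggers requests for HTML plus CSS, JavaScript, and icons, so 2.4 requests per "visitor" is implausibly shallow for humans. It fits a pool of rotating IPs each making a couple of bare HTML fetches.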
The most likely explanation
The simplest explanation is still the best one: a crawler, or several crawlers, found the site and started sweeping it.
These were the most plausible possibilities:
- Large-scale bot crawling from search, SEO, or AI-related systems.
- Content scraping from systems that mirror programming content.
- Multiple independent crawlers overlapping and re-fetching the same pages.
- Light probing or scanning mixed into otherwise ordinary scraping traffic.
The site is a good crawler target. It is public, static, easy to parse, and full of structured technical content with headings, code blocks, and predictable URLs. If a system wants clean text without doing much JavaScript work, a site like this is convenient.
Why a 100-page blog can still generate hundreds of thousands of requests
This was the part that made me pause.
My site has roughly 100 pages. A well-behaved crawler should be able to fetch the whole thing in a few hundred requests, maybe a little more if it revisits some pages. Instead, the request count had climbed into the hundreds of thousands.
That only makes sense if the crawlers are not treating each article as one stable URL.
The biggest multiplier is query-string abuse. A page such as /post-1 can quickly turn into a long list of distinct URLs from the crawler's point of view:
/post-1?t=1
/post-1?t=2
/post-1?t=3

If the cache key treats those as separate requests, a small site suddenly becomes a huge URL space. Add multiple crawlers, retries, distributed IPs, and weak deduplication, and the numbers stop looking so mysterious.
The blog was still small. The request space was not.
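The fix for this multiplier is to collapse query-string variants back into one cache key. Cloudflare can do this at the edge (cache rules that ignore the query string), but the idea is simple enough to sketch generically. The `cache_key` function here is my own illustration, not Cloudflare's implementation, and dropping the query is only safe on a site where the query never changes the rendered page:

```python
from urllib.parse import urlsplit, urlunsplit

def cache_key(url: str) -> str:
    """Collapse query-string variants of the same article into one
    cache key by dropping the query entirely. Safe for a static blog
    where ?t=... never changes the page; NOT safe where the query
    actually selects content."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

variants = [f"/post-1?t={n}" for n in range(1, 1001)]
print(len(set(variants)))                     # 1000 distinct URLs...
print(len({cache_key(u) for u in variants}))  # ...one origin fetch
```

One article, a thousand crawler-visible URLs, one cache entry. That is the whole trick.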
What actually mattered
The good news: this did not look like a destructive attack.
The more practical risk was cache bypass. If bots keep requesting HTML pages with different query strings and those variations miss cache, the crawler traffic stops being cheap CDN noise and starts creating unnecessary origin work.
That shifted the question from "Who are these visitors?" to "How do I make this traffic boring?"
For a static blog, boring is the goal.
How I would verify the diagnosis
If you see a similar spike on a static site, I would start with these checks in Cloudflare:
- Bot and human breakdown, if your plan exposes that data.
- Top countries and ASNs, especially data-center-heavy patterns.
- Top user agents, including empty or obviously synthetic ones.
- Whether the same paths keep appearing with many query-string variants.
- Cache hit ratio for HTML traffic.
- Whether one page is genuinely hot or the whole site is being crawled evenly.
The cleanest confirmation would be request logs showing repeated fetches of the same paths with changing query strings. At that point, the mystery is mostly over. You are probably not watching a viral moment. You are watching automated traffic manufacture new URLs faster than the site deserves.
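If you can export request URLs (Cloudflare Logpush, or plain origin access logs), that last check is a few lines of grouping. This sketch assumes you already have the URLs as strings; the threshold of 10 variants is an arbitrary starting point:

```python
from collections import defaultdict
from urllib.parse import urlsplit

def query_variant_report(requested_urls, threshold=10):
    """Group requested URLs by path and count distinct query strings.
    Paths with many variants are the cache-bypass suspects."""
    variants = defaultdict(set)
    for url in requested_urls:
        parts = urlsplit(url)
        variants[parts.path].add(parts.query)
    return {path: len(qs) for path, qs in variants.items()
            if len(qs) >= threshold}

# Toy log: one article hammered with synthetic query strings,
# plus two ordinary requests.
logs = [f"/post-1?t={n}" for n in range(50)] + ["/about", "/post-2?ref=hn"]
print(query_variant_report(logs))  # {'/post-1': 50}
```

A non-empty report on a static blog is about as close to a smoking gun as this kind of investigation gets.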
What I took away from it
The weirdest part of this episode was how flattering the graph looked before it looked suspicious.
A sudden 10x spike is the sort of thing every personal site owner wants to believe for a moment. But the hourly shape told the real story almost immediately. Perfectly even growth is usually not audience growth. It is workload.
That turned out to be a useful reminder:
- Public technical content attracts crawlers because it is valuable and easy to process.
- "Unique visitors" in analytics can be a very optimistic label.
- For static sites, caching strategy matters more than guessing which bot is having a busy week.
If the origin is protected and HTML caching is configured properly, this kind of traffic is mostly an operational annoyance rather than a crisis.
Annoying is manageable. Exciting would have been nicer, of course, but manageable still wins.