The Spam Tracker That Became the Internet's Bodyguard: How Two Harvard MBAs Turned a Side Project Into the Infrastructure Layer Protecting Half the Web
🚀 Origin Stories · March 24, 2026 at 8:13 AM · 10 min read

Matthew Prince started tracking email spammers as a side project. A decade on, he and Michelle Zatlyn were blocking the largest DDoS attacks in history, and AWS, Akamai, and Fastly were scrambling to catch up.

Cloudflare · Infrastructure · CDN · DDoS · Edge Computing · System Design · Origin Stories · Matthew Prince

The Dataset Nobody Wanted

It was 2004, and Matthew Prince had a problem that wasn't supposed to be a problem. A lawyer specializing in internet law, he was running a side project called Project Honey Pot: a distributed network that tracked email spammers and comment spam across thousands of websites. The data was extraordinary: millions of IP addresses, behavior patterns, threat signatures from across the internet.

He pitched it to venture capitalists. "You have a spam tracking system," they said, nodding politely. "Who's going to pay for spam data?"

Nobody, it turned out. Project Honey Pot remained a side project, a fascinating dataset with no business model. Prince kept practicing internet and cybersecurity law, later headed to Harvard Business School to study entrepreneurship, and kept the servers running as a hobby.

But he couldn't shake a nagging thought: What if we're looking at this backwards?

Project Honey Pot wasn't just collecting spam data. It was seeing every attack: the DDoS botnets, the SQL injection attempts, the credential stuffing, the emerging threats that traditional security companies wouldn't see for weeks. They weren't sitting on spam data. They were sitting on a real-time threat intelligence network covering a massive chunk of the internet's attack surface.

In 2009, Prince teamed up with Michelle Zatlyn, a Harvard Business School classmate with a product background. Over coffee in Palo Alto, he showed her the data. "Every website owner needs this," he said. "But nobody knows they need it until after they get attacked."

Zatlyn leaned forward. "What if we just gave it to them for free?"

The TechCrunch Disrupt Gambit

September 27, 2010. The conference hall in San Francisco was packed for TechCrunch Disrupt's Startup Battlefield, the competition where young companies got a few minutes on stage to pitch their product to the tech press and investor elite.

Prince walked out in a button-down shirt and jeans. No slides. Just a live demo.

"Show of hands," he said. "Who here runs a website?"

Half the room raised their hands.

"Now keep your hand up if your website has been attacked in the last year. DDoS, comment spam, SQL injection, anything."

The hands stayed up.

"Here's the thing," Prince continued, pulling up a browser. "Enterprise companies pay Akamai $100,000 a year for DDoS protection and CDN services. Small businesses get nothing. We're launching Cloudflare today โ€” and we're giving you the same protection, for free."

He typed a domain name into the Cloudflare signup page. Four clicks later, the site was protected. "That's it," he said. "Change your DNS. Everything routes through us. We filter the attacks, cache your static content at our edge servers, and deliver your site faster than it's ever been."

The room went silent. Then someone in the front row said what everyone was thinking: "How is this free?"

"Freemium," Prince replied. "Free for individuals. Paid plans for businesses that need more โ€” advanced DDoS, dedicated support, higher-tier SSL. But the core product? Always free."

Cloudflare launched onstage as a Battlefield finalist that day. By the end of the week, they had 10,000 signups.

The Anycast Epiphany

The secret to Cloudflare's speed wasn't just caching. It was Anycast routing, a networking technique most CDN companies used sparingly; Cloudflare bet the entire company on it.

Here's how it works: traditional CDNs use Unicast, where each data center has its own unique IP addresses. When you request a webpage, the CDN's DNS directs you to a nearby server based on where your DNS resolver sits. It works, but it's imprecise and slow: DNS answers get cached, your resolver may be far from you, and if the chosen server is overloaded or fails, you're stuck until the DNS record changes.

Anycast flips this. Every Cloudflare data center announces the same IP addresses. When you make a request, the internet's BGP routing automatically delivers it to the topologically closest server, usually the one with the lowest latency. No stale DNS answers, and no waiting on DNS when a node fails: if a data center goes down, routes simply converge on the next closest one.
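The mechanism above can be sketched in a few lines. This is a toy model, not real BGP: the PoP names, path lengths, and the documentation-range IP are invented, and BGP's best-path algorithm is reduced to its shortest-AS-path tiebreaker, which is the property anycast leans on.

```javascript
// Toy model of anycast: every PoP announces the same prefix, and the
// client's router simply keeps the route with the shortest AS path.

const ANYCAST_IP = "198.51.100.1"; // documentation address, stands in for a real anycast IP

// Routes a client's ISP has learned for that one prefix; each was
// announced by a different (hypothetical) Cloudflare-style PoP.
const routes = [
  { pop: "LHR", asPathLength: 4 },
  { pop: "AMS", asPathLength: 2 },
  { pop: "SJC", asPathLength: 5 },
];

// BGP best-path selection, reduced to one rule: prefer the shortest AS path.
function selectRoute(candidates) {
  return candidates.reduce((best, r) =>
    r.asPathLength < best.asPathLength ? r : best
  );
}

const best = selectRoute(routes);
console.log(`${ANYCAST_IP} -> ${best.pop}`); // nearest PoP wins with no DNS trickery
```

If the AMS announcement is withdrawn (data center offline), re-running the selection over the remaining routes picks LHR automatically; that failover is the whole appeal.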

By 2012, Cloudflare had 14 data centers. By 2015, they had 60. By 2023, they had over 300 locations in more than 100 countries, the largest Anycast network on the planet, with a stated goal of putting a server within 50 milliseconds of roughly 95% of the world's internet-connected population.

AWS CloudFront had more total capacity. Akamai had more enterprise contracts. But neither could match Cloudflare's combination of density, speed, and price.

The Attack That Changed Everything

February 2020. A Cloudflare customer, a small business selling artisanal goods online, got hit with a DDoS attack measured in hundreds of gigabits per second: enough traffic to overwhelm many regional ISPs.

The attack lasted 30 seconds. The customer's site never went down.

Cloudflare's automatic mitigation system detected the anomaly in milliseconds, flagged the malicious traffic, and distributed the filtering load across 100+ data centers. The attacker gave up. The customer never even knew it happened โ€” they only found out when Cloudflare sent them a post-incident report.
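The detect-flag-distribute loop described above can be sketched as a rate-based filter. This is a deliberately simplified illustration, not Cloudflare's actual system: the threshold, IPs, and traffic numbers are invented, and real mitigation uses far richer signals than raw request rate.

```javascript
// Sketch of volumetric DDoS mitigation: flag source IPs whose request
// rate exceeds a threshold, then apply the shared blocklist at every
// PoP so the filtering load is spread across the whole network.

const THRESHOLD_RPS = 1000; // hypothetical per-IP request-rate limit

// Step 1: detection. Turn observed per-IP rates into a blocklist.
function buildBlocklist(perIpRps) {
  const blocked = new Set();
  for (const [ip, rps] of Object.entries(perIpRps)) {
    if (rps > THRESHOLD_RPS) blocked.add(ip);
  }
  return blocked;
}

// Step 2: distributed filtering. Every PoP drops blocklisted sources
// locally, so no single site absorbs the flood.
function filterAtPop(requests, blocklist) {
  return requests.filter((req) => !blocklist.has(req.sourceIp));
}

const observed = { "203.0.113.7": 50000, "192.0.2.10": 3 }; // made-up traffic
const blocklist = buildBlocklist(observed);
const survivors = filterAtPop(
  [{ sourceIp: "203.0.113.7" }, { sourceIp: "192.0.2.10" }],
  blocklist
);
console.log(survivors.length); // flood dropped, legitimate request passes
```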

By 2023, Cloudflare was blocking attacks that made 2020 look quaint. In February, it mitigated a record 71 million requests per second. Later that year came HTTP/2 Rapid Reset (CVE-2023-44487), a vulnerability in how servers handle stream cancellations; attacks exploiting it pushed request rates well past that record.

Cloudflare's team deployed a mitigation patch while the attack was happening, pushing updates to 300+ data centers in under 10 minutes. The customer's site stayed online.
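One widely discussed Rapid Reset countermeasure is to budget how many streams a client may open and immediately cancel on a single connection, and tear down connections that blow the budget. The sketch below shows that idea only; the limit is invented, and real servers respond with an HTTP/2 GOAWAY frame rather than a boolean flag.

```javascript
// Sketch of a per-connection Rapid Reset guard: count streams the peer
// cancels immediately after opening, and close the connection once the
// cancel budget is exhausted.

const MAX_RAPID_RESETS = 100; // hypothetical per-connection budget

class ConnectionGuard {
  constructor() {
    this.rapidResets = 0;
    this.closed = false; // "closed" stands in for sending GOAWAY in real HTTP/2
  }

  // Called whenever the peer sends RST_STREAM right after HEADERS,
  // the open-then-cancel pattern the attack abuses.
  onRapidReset() {
    this.rapidResets += 1;
    if (this.rapidResets > MAX_RAPID_RESETS) this.closed = true;
  }
}

const guard = new ConnectionGuard();
for (let i = 0; i < 101; i++) guard.onRapidReset();
console.log(guard.closed); // abusive connection gets torn down
```

The point of the budget: legitimate clients cancel streams occasionally (a user navigating away), so a generous limit never fires for them, while an attacker cycling open/cancel thousands of times per second hits it almost instantly.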

No other CDN could move that fast. The legacy giants (Akamai, Level 3) required manual intervention for attacks this large. AWS CloudFront could handle the traffic, but the customer would've gotten a bill that looked like a phone number.

Cloudflare didn't charge extra. It was covered under the standard plan.

The Edge Revolution: Workers, R2, and the War on AWS

By 2017, Cloudflare had won the CDN game. Millions of websites were routing through their network. But Prince and Zatlyn saw a bigger opportunity: edge computing.

AWS Lambda had proven that serverless functions were the future: no servers to manage, pay-per-execution pricing, near-infinite scale. But Lambda had a painful flaw: cold starts. The first request to a Lambda function could take 500ms or more while AWS spun up a new container. For latency-sensitive applications, that was unacceptable.

Cloudflare Workers launched in 2017 with a radical architectural decision: no containers. Instead of spinning up a container per function (as Lambda did), Workers ran JavaScript inside V8 isolates, lightweight execution contexts sharing a single runtime. Cold start time? Effectively zero. Deploy time? Seconds.

The developer experience was intoxicating. You wrote a function in JavaScript (later: TypeScript, Python, Rust via WASM), ran wrangler deploy, and your code was live on 300+ edge locations globally. No YAML configurations, no IAM role hell, no waiting for CloudFormation stacks.
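That whole developer experience fits in one file. The module below follows the real Workers programming model (a default-exported object with a fetch handler taking a web-standard Request and returning a Response); the routes and messages are made up, and it's written as a plain object here so it runs outside the Workers runtime too.

```javascript
// A complete Worker-style module: one fetch handler, no server process,
// no container. Deployed with `wrangler deploy`, a file like this runs
// in a V8 isolate at every edge location.
const worker = {
  async fetch(request) {
    const url = new URL(request.url);

    // A tiny health-check route.
    if (url.pathname === "/ping") {
      return new Response("pong", { status: 200 });
    }

    // Everything else gets a small HTML page served from the edge.
    return new Response("<h1>Hello from the edge</h1>", {
      status: 200,
      headers: { "content-type": "text/html" },
    });
  },
};

// In an actual Worker source file this object is the default export:
// export default worker;

worker.fetch(new Request("https://example.com/ping"))
  .then((res) => res.text())
  .then((body) => console.log(body)); // pong
```

Compare that to a Lambda deployment of the same endpoint: function packaging, IAM role, API Gateway wiring. The isolate model is what makes "seconds to deploy, zero cold start" plausible.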

Then came R2 in 2022, Cloudflare's answer to AWS S3. The pricing was the real story: zero egress fees. AWS charges you every time data leaves S3, a cost that could run into thousands of dollars a month for high-traffic applications. R2 charged only for storage and per-operation fees. For media-heavy sites and CDNs, this could be an order-of-magnitude cost reduction.
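The egress gap is easy to put numbers on. The rates below are assumptions for illustration (S3 egress is roughly $0.09/GB at lower tiers and varies by region and volume; R2 egress is $0), so treat this as back-of-the-envelope arithmetic, not a pricing quote.

```javascript
// Back-of-the-envelope monthly egress cost comparison.
const S3_EGRESS_PER_GB = 0.09; // assumed blended S3 rate; check current pricing
const R2_EGRESS_PER_GB = 0.0;  // R2 charges nothing for egress

function monthlyEgressCost(gbServed, ratePerGb) {
  return gbServed * ratePerGb;
}

// A media site pushing ~10 TB of downloads per month.
const gbPerMonth = 10000;
console.log(monthlyEgressCost(gbPerMonth, S3_EGRESS_PER_GB)); // 900
console.log(monthlyEgressCost(gbPerMonth, R2_EGRESS_PER_GB)); // 0
```

Storage and operation fees still apply on both sides; the point is that the dominant line item for download-heavy workloads simply disappears.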

The message was clear: Cloudflare wasn't just competing with Akamai anymore. They were coming for AWS's core infrastructure business.

The Nginx Rewrite: Pingora and the Rust Revolution

By 2022, Cloudflare was handling over 46 million HTTP requests per second globally. Their entire reverse proxy layer (the core software that routed, cached, and secured every request) was built on Nginx, the open-source web server that powered a huge share of the internet.

But Nginx had limits. It was written in C, prone to memory safety bugs. Its process-per-worker architecture meant connection pools couldn't be shared across workers, squandering connection reuse on modern multi-core servers. And it wasn't designed for Cloudflare's scale: custom Lua modules, complex routing logic, attack mitigation filters stacked on top of each other.

So Cloudflare did something audacious: they replaced Nginx entirely.

Enter Pingora: a custom reverse proxy written from scratch in Rust. Memory-safe by default. Multi-threaded and asynchronous. Purpose-built for Cloudflare's needs: end-to-end HTTP/2, native integration with their threat intelligence system, sub-millisecond routing decisions.

The migration took years. By 2023, Pingora was serving more than a trillion requests a day across Cloudflare's network. The results, by Cloudflare's account: about 70% less CPU and 67% less memory than the Nginx-based stack under the same load, faster request handling, and no production crashes caused by memory-safety bugs.

No other CDN had pulled off a rewrite this ambitious. Fastly was still on Varnish. Akamai was locked into proprietary C codebases. AWS CloudFront was an impenetrable black box.

Cloudflare had become the only major infrastructure company with a Rust-native proxy at its core.

The Competitors Scrambling to Catch Up

AWS CloudFront has the capacity, backed by Amazon's global infrastructure. But it's complex, expensive, and designed for enterprise customers with dedicated DevOps teams. Cloudflare's freemium model made it accessible to anyone.

Akamai invented the CDN in 1998 and still dominates Fortune 500 contracts. But it's a legacy vendor: slow to innovate, expensive, and losing developer mindshare to Cloudflare's API-first approach.

Fastly is the developer darling, powering Shopify, Stripe, and GitHub. They have Compute@Edge (since renamed Fastly Compute), their answer to Workers, but their network is smaller (roughly 70 cities versus 300+), and their pricing can't compete with Cloudflare's free tier.

Vercel and Netlify overlap on edge functions and static site hosting, but they're focused on frontend developers. Cloudflare is playing a bigger game: full-stack infrastructure, databases (D1), queues, AI inference at the edge.

The Uncomfortable Questions

In August 2019, Cloudflare terminated 8chan, the anonymous imageboard linked to multiple mass shootings. CEO Matthew Prince announced the decision in a blog post titled "Terminating Service for 8Chan," but also expressed discomfort: "I think the power Cloudflare has is a problem. We're a private company making decisions about who gets to speak online."

In 2022, Cloudflare terminated Kiwi Farms, a forum linked to harassment campaigns. Again, Prince agonized publicly about the decision.

Critics asked: Should one company control this much of the internet's infrastructure?

Cloudflare protects roughly 20% of the top 10 million websites. If Cloudflare goes down (as it did in June 2019, July 2020, and June 2022), massive chunks of the internet become unreachable.

Then there's vendor lock-in. Workers' runtime APIs follow web standards (fetch, Service Workers), but the programming model doesn't map cleanly onto AWS Lambda, and platform services like KV, Durable Objects, and D1 are proprietary. Once you build on Workers, migrating off is painful.
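The mismatch is visible in the entry points themselves. Both shapes below are the public programming models (a Workers fetch handler; a Lambda proxy handler behind API Gateway), while the endpoint itself is a made-up "hello" for illustration, written so it runs in plain Node.

```javascript
// The same endpoint in both models: different entry points, different
// request/response types, so a port is a rewrite, not a copy-paste.

// Cloudflare Workers style: web-standard Request in, Response out.
const workerModule = {
  async fetch(request) {
    return new Response("hello");
  },
};

// AWS Lambda style (API Gateway proxy): plain event object in,
// { statusCode, body } object out.
const lambdaHandler = async (event) => {
  return { statusCode: 200, body: "hello" };
};

workerModule.fetch(new Request("https://example.com/"))
  .then((res) => res.text())
  .then((body) => console.log(body)); // hello
lambdaHandler({}).then((r) => console.log(r.body)); // hello
```

Neither shape is wrong; the cost is that every handler, plus everything touching KV or Durable Objects, must be rewritten to move between platforms.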

The Legacy

As of 2024, Cloudflare processes over 57 million HTTP requests per second, blocks 182 billion cyber threats per day, and runs infrastructure in 300+ cities. They've gone from a spam-tracking side project to a $30 billion public company.

Matthew Prince and Michelle Zatlyn didn't invent the CDN. They didn't invent DDoS protection. But they did something harder: they made enterprise-grade infrastructure accessible.

They proved that you could give away the core product for free and still build a multibillion-dollar business. They proved that a freemium infrastructure company could go toe to toe with AWS. They proved that developers would bet their startups on a platform that prioritized speed, simplicity, and price over raw AWS-style power.

And in doing so, they turned a Harvard Business School side project into the bodyguard of half the internet.

Every DDoS attack blocked, every millisecond of latency saved, every startup that scales without worrying about infrastructure costs: it all traces back to a dataset nobody wanted and two entrepreneurs who asked, What if we just gave it away?

โœ๏ธ
Written by Swayam Mohanty
Untold stories behind the tech giants, legendary moments, and the code that changed the world.
