The Database That Deleted Itself: How Instagram Stored 25 Billion Photos on 3 Postgres Shards — Then Had to Rewrite Everything in 10 Months
In 2012, Instagram hit 80 million users and their Postgres database started deleting old data to stay alive. This is the story of the most desperate migration in social media history — and the architecture decisions that almost killed the fastest-growing app on Earth.
It was 2:47 AM on a Tuesday in July 2012. Mike Krieger, Instagram's co-founder and CTO, was staring at his laptop screen in the dark. The graph looked like a hockey stick pointed at hell.
Instagram had crossed 80 million users. They were adding photos at a rate of 5.8 million per day. And their main Postgres database — the single source of truth for every photo, every like, every follower relationship — was running out of ID space.
Not storage space. Not CPU. Not memory.
ID space.
A Postgres SERIAL column is a 32-bit signed integer, which tops out at 2,147,483,647. That gives you about 2.1 billion possible IDs. Instagram had burned through 1.8 billion. At their current growth rate, they'd hit the ceiling in 13 weeks. When that happened, inserts would start failing and the database would stop accepting new photos. The app would freeze. And 80 million people would blame them.
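To make the ceiling concrete, here is a minimal sketch of the arithmetic. The 1.8 billion figure comes from the story above; the daily rate of 3.8 million IDs is an assumption, chosen only because it is roughly what a 13-week runway implies.

```python
INT32_MAX = 2**31 - 1  # largest value a 32-bit signed INTEGER / SERIAL can hold


def ids_remaining(ids_used: int) -> int:
    """IDs left before a 32-bit key overflows."""
    return INT32_MAX - ids_used


def weeks_until_exhaustion(ids_used: int, ids_per_day: float) -> float:
    """Runway in weeks, assuming the daily rate stays constant."""
    return ids_remaining(ids_used) / ids_per_day / 7


if __name__ == "__main__":
    used = 1_800_000_000          # figure quoted in the article
    assumed_rate = 3_800_000      # hypothetical IDs/day, not from the article
    print(ids_remaining(used))
    print(round(weeks_until_exhaustion(used, assumed_rate), 1))
```

A constant rate is the optimistic case. With accelerating growth the real runway is shorter than this estimate.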
But the real nightmare wasn't the limit. It was what Instagram had already done to survive this long.
They'd started deleting old data.
The Architecture That Worked Until It Didn't
When Kevin Systrom and Mike Krieger launched Instagram in October 2010, they made a bet that haunts infrastructure engineers: start simple, scale later.
They ran on AWS. One Postgres database. No caching layer. No fancy distributed systems. Just a Python Django app hitting a single RDS instance with 32 GB of RAM.
And it worked. For exactly 12 hours.
On launch day, they got 25,000 signups. The database handled it. But by day three, they were at 100,000 users. Photos were taking 8 seconds to load. The Django app was timing out. Krieger barely slept for a week.
Their first scaling move was classic: vertical partitioning, sometimes called functional sharding. They split the monolithic database by function: photos in one database, users in another, relationships (followers, likes) in a third. Each shard got its own Postgres instance. Each instance had 68 GB of RAM and 1,000 provisioned IOPS.
Problem solved.
For six months.
By April 2011, they had 5 million users. The photos shard alone was handling 40,000 writes per second at peak. A single Postgres instance, no matter how beefy, couldn't keep up. Writes were queuing. Replication lag was hitting 30 seconds. Users were uploading photos that didn't show up in their feed for half a minute.
So they did what every startup does when vertical scaling fails: horizontal sharding.
They split the photos table across multiple Postgres databases. But here's the problem with horizontal sharding: how do you generate unique IDs across multiple databases?
If you use Postgres' built-in auto-increment (SERIAL), every shard generates IDs independently. Shard 1 creates photo ID 1, 2, 3. Shard 2 creates photo ID 1, 2, 3. Collision. Chaos. Unacceptable.
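The collision is easy to reproduce in miniature. This sketch stands in for two shards with two independent counters; the variable names are illustrative only.

```python
from itertools import count

# Each shard's SERIAL sequence starts at 1 and knows nothing about the other.
shard_1_serial = count(1)
shard_2_serial = count(1)

shard_1_ids = [next(shard_1_serial) for _ in range(3)]
shard_2_ids = [next(shard_2_serial) for _ in range(3)]

# Every ID issued by shard 1 was also issued by shard 2.
collisions = set(shard_1_ids) & set(shard_2_ids)
print(sorted(collisions))
```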
Instagram needed a way to generate globally unique IDs across all shards, at a rate of 5 million per day, with zero collisions and near-zero latency.
The Snowflake Solution (That Almost Wasn't)
Krieger looked at Twitter's Snowflake, a distributed ID generation service that Twitter open-sourced in 2010. Snowflake generates 64-bit IDs that encode a timestamp, a datacenter ID, a machine ID, and a per-machine sequence number. It's fast, distributed, and keeps IDs unique without any coordination between machines.
But Instagram was 12 engineers. They didn't have time to run and maintain a separate service. They needed something inside Postgres.
So they built their own.
Instagram's ID generation scheme was elegant:
- 41 bits: milliseconds since a custom epoch (gives you about 69 years of IDs)
- 13 bits: logical shard ID (supports 8,192 shards)
- 10 bits: auto-incrementing sequence (1,024 IDs per millisecond per shard)
Total: 64 bits. Fits in a Postgres BIGINT.
They implemented it as a Postgres stored procedure. Every time you insert a photo, Postgres calls a PL/pgSQL function that:
- Gets the current timestamp in milliseconds
- Grabs the shard ID from configuration
- Takes the next value from a per-shard sequence and wraps it modulo 1,024
- Bit-shifts and combines them into a single 64-bit integer
- Returns the ID
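The steps above can be sketched in Python. This is an illustration of the 41/13/10 bit layout, not Instagram's actual PL/pgSQL; the epoch value and the function names are assumptions made for the example.

```python
import time

# Hypothetical custom epoch (2011-01-01T00:00:00Z) in milliseconds.
EPOCH_MS = 1_293_840_000_000

TIME_BITS, SHARD_BITS, SEQ_BITS = 41, 13, 10


def make_id(now_ms: int, shard_id: int, seq: int) -> int:
    """Pack timestamp, shard ID and sequence into one integer."""
    elapsed = now_ms - EPOCH_MS
    if not 0 <= elapsed < (1 << TIME_BITS):
        raise ValueError("timestamp outside the 41-bit range")
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard_id outside the 13-bit range")
    seq %= 1 << SEQ_BITS  # wrap the counter into 10 bits
    return (elapsed << (SHARD_BITS + SEQ_BITS)) | (shard_id << SEQ_BITS) | seq


def split_id(photo_id: int) -> tuple[int, int, int]:
    """Recover (timestamp_ms, shard_id, seq) from an ID."""
    seq = photo_id & ((1 << SEQ_BITS) - 1)
    shard_id = (photo_id >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
    elapsed = photo_id >> (SHARD_BITS + SEQ_BITS)
    return elapsed + EPOCH_MS, shard_id, seq


if __name__ == "__main__":
    now_ms = int(time.time() * 1000)
    photo_id = make_id(now_ms, shard_id=5, seq=1)
    print(photo_id, split_id(photo_id))
```

Because the timestamp occupies the high bits, sorting by ID also sorts by creation time, and the shard can be read back out of any ID without a lookup. One caveat: the layout uses all 64 bits, and a signed BIGINT keeps its top bit for the sign, so IDs stay positive only while the elapsed time fits in 40 bits, roughly 35 years from the epoch.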
It was fast — sub-millisecond. It required no external dependencies. And it worked.
By May 2012, Instagram was running on 36 Postgres shards. They routed photo inserts by hashing the user ID modulo 36. Each shard handled roughly 2.2 million photos. Total storage: 80 billion rows across all tables.
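Modulo routing fits in a few lines. The sketch below assumes integer user IDs and a fixed shard count of 36; `shard_for_user` and `fraction_moved` are hypothetical helper names, not anything from Instagram's codebase.

```python
from typing import Iterable

NUM_SHARDS = 36  # shard count quoted in the article


def shard_for_user(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Route a user (and all of their photos) to one shard."""
    return user_id % num_shards


def fraction_moved(old_n: int, new_n: int, user_ids: Iterable[int]) -> float:
    """Share of users whose shard changes if the shard count changes."""
    ids = list(user_ids)
    moved = sum(1 for u in ids if u % old_n != u % new_n)
    return moved / len(ids)


if __name__ == "__main__":
    print(shard_for_user(37))
    print(fraction_moved(36, 37, range(36 * 37)))
```

The second helper shows the main design cost of plain modulo routing: adding even one shard remaps most users. A common way around this is to hash into a large fixed number of logical shards and map those onto physical servers.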
They were generating 300,000 IDs per hour.
And that's when the original mistake came back to haunt them.
The Integer Apocalypse
Remember those first three shards? The ones they created in April 2011, before they implemented 64-bit IDs?
They were still using 32-bit auto-increment integers.
When Instagram built their Snowflake-style ID generation in May 2011, they applied it to new shards. But they didn't migrate the old ones. Migration meant downtime. It meant rewriting billions of rows. It meant risk.
So they let the old shards keep running on 32-bit IDs.
Big mistake.
By July 2012, those three legacy shards had each burned through 1.8 billion IDs. They had 13 weeks left before hitting the 2.1 billion ceiling.
And here's the worst part: you can't just change a primary key column type in Postgres without locking the entire table.
Altering a column from INTEGER to BIGINT requires Postgres to:
- Lock the table (no reads, no writes)
- Rewrite every single row
- Rebuild every index
- Repeat the same type change on every foreign key column that references it, which means rewriting those tables too
For a shard holding roughly 6 billion rows and 20 indexes, that's 72 hours of downtime.
Three days where Instagram couldn't accept new photos. Three days where the app was frozen. Three days where they'd lose users, press coverage, and momentum.
Unacceptable.
So they came up with a plan that was equal parts genius and desperation.
The Migration Nobody Saw
The strategy was called shadow tables.
Here's how it worked:
- Create a new table with 64-bit IDs, identical schema to the old table
- Dual-write: Every new photo insert goes to BOTH the old table (32-bit) and the new table (64-bit)
- Backfill: Copy old data from the old table to the new table in batches, during off-peak hours
- Verify: Run data integrity checks to ensure old and new tables match
- Switch: Point the application at the new table, stop writing to the old one
- Delete: Drop the old table once you're confident
The beauty of this approach: zero downtime. The old table keeps serving reads and writes. The new table builds up in the background. When it's ready, you flip a config flag and move on.
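Here is a minimal sketch of the dual-write step. SQLite stands in for Postgres so the example runs anywhere; the table and column names are assumptions.

```python
import sqlite3


def create_tables(conn: sqlite3.Connection) -> None:
    """Old table and its shadow copy, with identical columns."""
    conn.executescript(
        """
        CREATE TABLE photos_old (id INTEGER PRIMARY KEY, user_id INTEGER, url TEXT);
        CREATE TABLE photos_new (id INTEGER PRIMARY KEY, user_id INTEGER, url TEXT);
        """
    )


def insert_photo(conn: sqlite3.Connection, photo_id: int, user_id: int, url: str) -> None:
    """Dual-write: both inserts commit together or neither does."""
    with conn:  # commits on success, rolls back if either insert raises
        conn.execute(
            "INSERT INTO photos_old (id, user_id, url) VALUES (?, ?, ?)",
            (photo_id, user_id, url),
        )
        conn.execute(
            "INSERT INTO photos_new (id, user_id, url) VALUES (?, ?, ?)",
            (photo_id, user_id, url),
        )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_tables(conn)
    insert_photo(conn, 1, 42, "a.jpg")
    print(conn.execute("SELECT COUNT(*) FROM photos_new").fetchone()[0])
```

Wrapping both inserts in one transaction is the key design choice. If the second write could fail on its own, the two tables would drift apart and the later verification step would never pass.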
But the devil is in the details.
Instagram had to:
- Write a Python script that copied rows in batches of 10,000 (too small = slow, too large = lock contention)
- Rate-limit the backfill so replicas could keep up and replication lag stayed bounded (they throttled to 5,000 rows/second)
- Handle rows that already existed in both tables because a dual-write had landed before the backfill reached them (deduplication)
- Ensure that foreign keys (likes, comments) pointed to the right table
- Test, test, test in staging before touching production
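A backfill along these lines might look like the sketch below. It is an illustration under assumptions, not Instagram's script: SQLite stands in for Postgres, the schema is hypothetical, and `INSERT OR IGNORE` plays the role of the deduplication step by skipping rows that dual-writes already copied.

```python
import sqlite3
import time

SCHEMA = """
CREATE TABLE photos_old (id INTEGER PRIMARY KEY, user_id INTEGER, url TEXT);
CREATE TABLE photos_new (id INTEGER PRIMARY KEY, user_id INTEGER, url TEXT);
"""


def backfill(conn, batch_size=10_000, max_rows_per_sec=5_000, sleep=time.sleep):
    """Copy photos_old into photos_new in ID order; returns rows scanned."""
    last_id, scanned = 0, 0
    while True:
        started = time.monotonic()
        with conn:  # one short transaction per batch keeps locks brief
            rows = conn.execute(
                "SELECT id, user_id, url FROM photos_old "
                "WHERE id > ? ORDER BY id LIMIT ?",
                (last_id, batch_size),
            ).fetchall()
            if not rows:
                break
            # Idempotent copy: rows already present in photos_new are skipped.
            conn.executemany(
                "INSERT OR IGNORE INTO photos_new (id, user_id, url) VALUES (?, ?, ?)",
                rows,
            )
        last_id = rows[-1][0]
        scanned += len(rows)
        # Throttle: a batch of N rows may not finish faster than N / max rate.
        min_duration = len(rows) / max_rows_per_sec
        elapsed = time.monotonic() - started
        if elapsed < min_duration:
            sleep(min_duration - elapsed)
    return scanned


def missing_from_new(conn) -> int:
    """Verification: count old rows that have no counterpart in the new table."""
    return conn.execute(
        "SELECT COUNT(*) FROM photos_old o "
        "LEFT JOIN photos_new n ON n.id = o.id WHERE n.id IS NULL"
    ).fetchone()[0]
```

Paging by `id > last_id` rather than by offset keeps each batch cheap, and it makes the job resumable: after a crash, restarting from the last copied ID (or from zero) produces the same result because the copy is idempotent.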
The backfill alone took 6 weeks per shard. They had three legacy shards. That's 18 shard-weeks of data migration, so all three backfills had to run side by side in the background while the app kept growing.
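A back-of-the-envelope check on backfill duration, using the row count and throttle quoted in this section. The 8-hour off-peak window is an assumption added for illustration; the article does not say how long the nightly window was.

```python
def backfill_days(total_rows: float, rows_per_sec: float, hours_per_day: float = 24.0) -> float:
    """Calendar days needed to copy total_rows at a fixed throttle."""
    seconds_needed = total_rows / rows_per_sec
    return seconds_needed / (hours_per_day * 3600)


if __name__ == "__main__":
    rows = 6_100_000_000   # shard 1 row count from the progress list
    throttle = 5_000       # rows/second throttle from the article
    print(round(backfill_days(rows, throttle), 1))        # running around the clock
    print(round(backfill_days(rows, throttle, 8.0), 1))   # assumed 8-hour off-peak window
```

The throttle is a ceiling, not an average, so real runs would take longer than either figure once retries and pauses for replication lag are counted.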
Krieger barely slept. Every morning, he'd check the backfill progress:
- Shard 1: 4.2B / 6.1B rows (68%)
- Shard 2: 3.8B / 5.9B rows (64%)
- Shard 3: 5.1B / 6.3B rows (81%)
If anything went wrong — a replication lag spike, a disk failure, a config error — they'd have to start over.
And they were racing the clock. Every day, the old shards burned through millions more IDs.
The Switch
On October 18, 2012, at 3 AM Pacific, Instagram flipped the switch.
They changed one small mapping in their Django config:
    PHOTO_SHARD_MAP = {
        'shard_001': 'photos_new',  # was 'photos_old'
        'shard_002': 'photos_new',
        'shard_003': 'photos_new',
    }
They deployed. Watched the logs. Held their breath.
No errors. Latency stayed flat at 12ms P99. Write throughput: 42,000 inserts/second. Replication lag: 0.4 seconds.
It worked.
The old shards, now sitting at 2.04 billion IDs, were retired. The new shards, with 64-bit ID space, could handle 9.2 quintillion photos.
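At Instagram's growth rate (5 million photos/day in 2012), the raw ID space alone would last about 5 billion years. In practice the 41-bit timestamp field is the tighter limit, at roughly 69 years from the scheme's epoch.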
The Lessons That Scaled to a Billion
By 2013, Instagram had 150 million users and 16 billion photos. They'd scaled to 100+ Postgres shards, with plans to move to Cassandra for long-term storage.
But the lessons from the 32-bit crisis shaped everything that came after:
1. Future-proof your data types early. Use 64-bit IDs from day one. Use BIGINT, not INTEGER. The cost of migration later is 1000x higher than the cost of an extra 4 bytes per row.
2. Schema migrations at scale require zero-downtime strategies. Shadow tables, dual writes, and backfills are the standard way to change a production database with billions of rows without taking it offline.
3. Postgres can scale further than you think. Instagram was still running on sharded Postgres in 2014 (4 years in, 300M users). With sharding, read replicas, and connection pooling, a well-tuned Postgres cluster can handle 100K+ writes/second.
4. The fastest code is the code you don't run. Instagram's ID generation was a PL/pgSQL function, not a microservice. No network hop. No serialization. No external dependency. Just a stored procedure.
5. Monitor your future, not just your present. Krieger's 2 AM panic wasn't triggered by high latency or OOM errors. It was triggered by a graph showing ID exhaustion 13 weeks away. The best alerts predict disaster before it happens.
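A predictive alert of this kind can be sketched as a linear extrapolation over periodic samples of the highest ID issued. The sample values and the 180-day threshold below are hypothetical; in Postgres the current value could be read from the sequence's `last_value`.

```python
from datetime import date

INT32_MAX = 2**31 - 1


def days_until_exhaustion(samples: list[tuple[date, int]], ceiling: int = INT32_MAX) -> float:
    """Extrapolate linearly from the first to the last (date, max_id) sample."""
    (first_day, first_id), (last_day, last_id) = samples[0], samples[-1]
    span_days = (last_day - first_day).days
    if span_days <= 0:
        raise ValueError("need samples from at least two different days")
    ids_per_day = (last_id - first_id) / span_days
    if ids_per_day <= 0:
        return float("inf")  # no growth observed, no projected exhaustion
    return (ceiling - last_id) / ids_per_day


def should_alert(samples: list[tuple[date, int]], threshold_days: float = 180) -> bool:
    """Fire while there is still time to migrate, not when inserts start failing."""
    return days_until_exhaustion(samples) < threshold_days


if __name__ == "__main__":
    # Hypothetical readings, one month apart.
    readings = [(date(2012, 6, 1), 1_700_000_000), (date(2012, 7, 1), 1_800_000_000)]
    print(round(days_until_exhaustion(readings)), should_alert(readings))
```

The threshold should be set from how long the fix takes, not from how scary the number looks. If a migration needs 18 weeks, an alert at 13 weeks of runway is already late.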
Today, Instagram serves 2 billion users and hosts over 100 billion photos. They've since moved much of their data to Cassandra and TAO (Facebook's graph store). But the core architecture, Postgres shards with 64-bit Snowflake-style IDs, still powers the system that onboards 500,000 new users per day.
And somewhere in Instagram's codebase, there's probably still a comment that says:
-- Never use INTEGER for IDs. Ever. Seriously. We learned this the hard way.
The Architecture That Almost Killed Instagram
The irony is that Instagram's near-death experience came not from load, not from traffic, not from a DDoS attack — but from a data type.
A single architectural decision made in April 2011, under pressure, with 5 million users breathing down their necks.
They chose INTEGER instead of BIGINT. 32 bits instead of 64. And it almost cost them everything.
In system design, the decisions you make at 5 million users echo at 500 million. The shortcuts you take under pressure become the migrations you dread at scale.
Instagram's ID crisis is a reminder: the most dangerous outages are the ones you schedule 18 months in advance.