🏗️ System Design · May 8, 2026 at 8:29 AM · 9 min read

The Database That Deleted Itself: How Instagram Stored 25 Billion Photos on 3 Postgres Shards — Then Had to Rewrite Everything in 10 Months

In 2012, Instagram hit 80 million users and their Postgres database started deleting old data to stay alive. This is the story of the most desperate migration in social media history — and the architecture decisions that almost killed the fastest-growing app on Earth.

Instagram, System Design, Postgres, Distributed Systems, Database Migration, Sharding, Architecture, Infrastructure

It was 2:47 AM on a Tuesday in July 2012. Mike Krieger, Instagram's co-founder and CTO, was staring at his laptop screen in the dark. The graph looked like a hockey stick pointed at hell.

Instagram had crossed 80 million users. They were adding photos at a rate of 5.8 million per day. And their main Postgres database — the single source of truth for every photo, every like, every follower relationship — was running out of ID space.

Not storage space. Not CPU. Not memory.

ID space.

A Postgres SERIAL column is backed by INTEGER, a signed 32-bit type. That gives you about 2.1 billion possible IDs (2,147,483,647, to be exact). Instagram had burned through 1.8 billion. At their current growth rate, they'd hit the ceiling in 13 weeks. When that happened, the database would stop accepting new photos. The app would freeze. And 80 million people would blame them.
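The arithmetic behind that ceiling fits in a few lines of Python. This is only a sketch: `weeks_until_exhaustion` is a hypothetical helper, and the rate of 3.8 million IDs per day is an assumption back-solved from the 13-week estimate, not a figure reported anywhere in this story.

```python
INT32_MAX = 2 ** 31 - 1  # largest value a signed 32-bit INTEGER can hold


def weeks_until_exhaustion(current_id, ids_per_day, max_id=INT32_MAX):
    """Weeks left before an auto-increment counter reaches max_id."""
    remaining = max_id - current_id
    return remaining / ids_per_day / 7


print(INT32_MAX)                  # 2147483647
print(INT32_MAX - 1_800_000_000)  # 347483647 IDs left
# 3_800_000 IDs/day is an assumed rate, chosen to match the 13-week estimate.
print(round(weeks_until_exhaustion(1_800_000_000, 3_800_000), 1))  # 13.1
```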

But the real nightmare wasn't the limit. It was what Instagram had already done to survive this long.

They'd started deleting old data.

The Architecture That Worked Until It Didn't

When Kevin Systrom and Mike Krieger launched Instagram in October 2010, they made a bet that haunts infrastructure engineers: start simple, scale later.

They ran on AWS. One Postgres database. No caching layer. No fancy distributed systems. Just a Python Django app hitting a single Postgres server with 32 GB of RAM.

And it worked. For exactly 12 hours.

On launch day, they got 25,000 signups. The database handled it. But by day three, they were at 100,000 users. Photos were taking 8 seconds to load. The Django app was timing out. Krieger barely slept for a week.

Their first scaling move was classic: vertical sharding, also known as functional partitioning. They split the monolithic database into logical shards by function: photos in one database, users in another, relationships (followers, likes) in a third. Each shard got its own Postgres instance. Each instance had 68 GB of RAM and 1,000 IOPS provisioned.

Problem solved.

For six months.

By April 2011, they had 5 million users. The photos shard alone was handling 40,000 writes per second at peak. A single Postgres instance, no matter how beefy, couldn't keep up. Writes were queuing. Replication lag was hitting 30 seconds. Users were uploading photos that didn't show up in their feed for half a minute.

So they did what every startup does when splitting by function and buying bigger machines stops working: horizontal sharding.

They split the photos table across multiple Postgres databases. But here's the problem with horizontal sharding: how do you generate unique IDs across multiple databases?

If you use Postgres' built-in auto-increment (SERIAL), every shard generates IDs independently. Shard 1 creates photo ID 1, 2, 3. Shard 2 creates photo ID 1, 2, 3. Collision. Chaos. Unacceptable.
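The collision is easy to reproduce. In this toy sketch, two `itertools.count` counters stand in for the independent SERIAL sequences on two shards; the variable names are made up for illustration.

```python
from itertools import count

# Each shard keeps its own counter, like a per-database SERIAL sequence.
shard_1 = count(start=1)
shard_2 = count(start=1)

ids_from_shard_1 = [next(shard_1) for _ in range(3)]
ids_from_shard_2 = [next(shard_2) for _ in range(3)]

# Any ID issued by both shards is ambiguous once the data is viewed globally.
collisions = set(ids_from_shard_1) & set(ids_from_shard_2)
print(sorted(collisions))  # [1, 2, 3]
```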

Instagram needed a way to generate globally unique IDs across all shards, at a rate of 5 million per day, with zero collisions and near-zero latency.

The Snowflake Solution (That Almost Wasn't)

Krieger looked at Twitter's Snowflake, a distributed ID generation service that Twitter open-sourced in 2010. Snowflake generates 64-bit IDs that encode a timestamp, a datacenter ID, a machine ID, and a per-machine sequence number. It's fast, distributed, and guarantees uniqueness.

But Instagram was 12 engineers. They didn't have time to run and maintain a separate service. They needed something inside Postgres.

So they built their own.

Instagram's ID generation scheme was elegant:

41 bits: milliseconds since epoch (gives you 69 years of IDs)
13 bits: logical shard ID (supports 8,192 shards)
10 bits: auto-incrementing sequence (1,024 IDs per millisecond per shard)

Total: 64 bits. Fits in a Postgres BIGINT.
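Those capacity figures follow directly from the bit widths. A quick check in Python (the constant names are just labels for this sketch):

```python
TIMESTAMP_BITS = 41
SHARD_BITS = 13
SEQUENCE_BITS = 10

total_bits = TIMESTAMP_BITS + SHARD_BITS + SEQUENCE_BITS

MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365.25
years_of_ids = (2 ** TIMESTAMP_BITS) / MS_PER_YEAR  # how long 41 bits of ms last
max_shards = 2 ** SHARD_BITS
ids_per_ms_per_shard = 2 ** SEQUENCE_BITS

print(total_bits)            # 64
print(int(years_of_ids))     # 69
print(max_shards)            # 8192
print(ids_per_ms_per_shard)  # 1024
```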

They implemented it as a Postgres stored procedure. Every time you insert a photo, Postgres calls a PL/pgSQL function that:

Gets the current timestamp in milliseconds
Grabs the shard ID from configuration
Increments a local sequence counter
Bit-shifts and combines them into a single 64-bit integer
Returns the ID

It was fast — sub-millisecond. It required no external dependencies. And it worked.
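To make the bit-shifting concrete, here is a Python sketch of the same five steps. It is not Instagram's PL/pgSQL. The names `make_id` and `split_id` are hypothetical, and so is the custom epoch of January 1, 2011, since the story doesn't say which epoch they counted from.

```python
import time

EPOCH_MS = 1_293_840_000_000  # assumed custom epoch: 2011-01-01 00:00:00 UTC
SHARD_BITS = 13
SEQUENCE_BITS = 10


def make_id(now_ms, shard_id, sequence):
    """Pack (timestamp, shard, sequence) into one integer, 41/13/10 layout."""
    if not 0 <= shard_id < 2 ** SHARD_BITS:
        raise ValueError("shard_id does not fit in 13 bits")
    elapsed_ms = now_ms - EPOCH_MS
    seq = sequence % (2 ** SEQUENCE_BITS)  # wrap the counter into 10 bits
    return (
        (elapsed_ms << (SHARD_BITS + SEQUENCE_BITS))
        | (shard_id << SEQUENCE_BITS)
        | seq
    )


def split_id(photo_id):
    """Recover (timestamp in ms, shard ID, sequence) from a packed ID."""
    seq = photo_id & (2 ** SEQUENCE_BITS - 1)
    shard_id = (photo_id >> SEQUENCE_BITS) & (2 ** SHARD_BITS - 1)
    elapsed_ms = photo_id >> (SHARD_BITS + SEQUENCE_BITS)
    return elapsed_ms + EPOCH_MS, shard_id, seq


now_ms = int(time.time() * 1000)
photo_id = make_id(now_ms, shard_id=5, sequence=1025)
print(split_id(photo_id) == (now_ms, 5, 1))  # True
```

Because the timestamp sits in the highest bits, IDs sort roughly by creation time, and the shard that owns a row can be read straight out of its ID.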

By May 2012, Instagram was running on 36 Postgres shards. They routed photo inserts by hashing the user ID modulo 36. Each shard handled roughly 2.2 million photos. Total storage: 80 billion rows across all tables.
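Routing by user ID modulo 36 is about as simple as sharding gets. A minimal sketch, assuming integer user IDs and using the ID itself as the hash (`shard_for_user` is a made-up name):

```python
from collections import Counter

NUM_SHARDS = 36


def shard_for_user(user_id, num_shards=NUM_SHARDS):
    """Map a user to a shard. The same user always lands on the same shard."""
    return user_id % num_shards


print(shard_for_user(37))  # 1

# Sequential user IDs spread evenly across all 36 shards.
spread = Counter(shard_for_user(uid) for uid in range(36_000))
print(len(spread), set(spread.values()))  # 36 {1000}
```

The trade-off with plain modulo routing is that changing the shard count remaps most users, so a scheme like this works best when the number of logical shards is fixed up front.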

They were generating 300,000 IDs per hour.

And that's when the original mistake came back to haunt them.

The Integer Apocalypse

Remember those first three shards? The ones they created in April 2011, before they implemented 64-bit IDs?

They were still using 32-bit auto-increment integers.

When Instagram built their Snowflake-style ID generation in May 2011, they applied it to new shards. But they didn't migrate the old ones. Migration meant downtime. It meant rewriting billions of rows. It meant risk.

So they let the old shards keep running on 32-bit IDs.

Big mistake.

By July 2012, those three legacy shards had each burned through 1.8 billion IDs. They had 13 weeks left before hitting the 2.1 billion ceiling.

And here's the worst part: you can't just change a primary key column type in Postgres without locking the entire table.

Altering a column from INTEGER to BIGINT requires Postgres to:

Lock the table with an ACCESS EXCLUSIVE lock (no reads, no writes)
Rewrite every single row
Rebuild every index
Update every foreign key reference

For a table with 6 billion rows and 20 indexes, that's 72 hours of downtime.

Three days where Instagram couldn't accept new photos. Three days where the app was frozen. Three days where they'd lose users, press coverage, and momentum.

Unacceptable.

So they came up with a plan that was equal parts genius and desperation.

The Migration Nobody Saw

The strategy was called shadow tables.

Here's how it worked:

Create a new table with 64-bit IDs, identical schema to the old table
Dual-write: Every new photo insert goes to BOTH the old table (32-bit) and the new table (64-bit)
Backfill: Copy old data from the old table to the new table in batches, during off-peak hours
Verify: Run data integrity checks to ensure old and new tables match
Switch: Point the application at the new table, stop writing to the old one
Delete: Drop the old table once you're confident

The beauty of this approach: zero downtime. The old table keeps serving reads and writes. The new table builds up in the background. When it's ready, you flip a config flag and move on.
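The dual-write step is the piece that has to be airtight. Below is a minimal sketch using an in-memory SQLite database as a stand-in for Postgres. The table names come from the config shown later in this story, while the columns and the `insert_photo` helper are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
schema = "(id INTEGER PRIMARY KEY, user_id INTEGER, url TEXT)"
conn.execute(f"CREATE TABLE photos_old {schema}")
conn.execute(f"CREATE TABLE photos_new {schema}")


def insert_photo(conn, photo_id, user_id, url):
    """Dual-write: the row lands in both tables or in neither."""
    row = (photo_id, user_id, url)
    with conn:  # commits on success, rolls back if either insert raises
        conn.execute("INSERT INTO photos_old VALUES (?, ?, ?)", row)
        conn.execute("INSERT INTO photos_new VALUES (?, ?, ?)", row)


insert_photo(conn, 1, 42, "a.jpg")
print(conn.execute("SELECT COUNT(*) FROM photos_new").fetchone()[0])  # 1
```

Wrapping both inserts in one transaction is a design choice of this sketch: it keeps the two tables from drifting apart if the second write fails.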

But the devil is in the details.

Instagram had to:

Write a Python script that copied rows in batches of 10,000 (too small = slow, too large = lock contention)
Rate-limit the backfill to avoid overwhelming replication lag (they throttled to 5,000 rows/second)
Handle edge cases where the same photo existed in both tables (deduplication)
Ensure that foreign keys (likes, comments) pointed to the right table
Test, test, test in staging before touching production
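A backfill loop along those lines might look like the sketch below. It is not the actual Instagram script: it runs against in-memory SQLite rather than Postgres, the column names are invented, and SQLite's `INSERT OR IGNORE` stands in for whatever deduplication the real script used. The defaults mirror the numbers above (batches of 10,000, throttled to 5,000 rows/second).

```python
import sqlite3
import time


def backfill(conn, batch_size=10_000, max_rows_per_second=5_000, sleep=time.sleep):
    """Copy photos_old into photos_new in ID order, one throttled batch at a time."""
    last_id = 0
    scanned = 0
    while True:
        batch = conn.execute(
            "SELECT id, user_id, url FROM photos_old WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not batch:
            return scanned
        with conn:  # one short transaction per batch
            # Rows that dual-writes already placed in photos_new are skipped.
            conn.executemany("INSERT OR IGNORE INTO photos_new VALUES (?, ?, ?)", batch)
        last_id = batch[-1][0]
        scanned += len(batch)
        sleep(len(batch) / max_rows_per_second)  # throttle to the target rate


conn = sqlite3.connect(":memory:")
schema = "(id INTEGER PRIMARY KEY, user_id INTEGER, url TEXT)"
conn.execute(f"CREATE TABLE photos_old {schema}")
conn.execute(f"CREATE TABLE photos_new {schema}")
rows = [(i, i % 7, f"{i}.jpg") for i in range(1, 26)]
conn.executemany("INSERT INTO photos_old VALUES (?, ?, ?)", rows)
conn.execute("INSERT INTO photos_new VALUES (25, 4, '25.jpg')")  # already dual-written
conn.commit()

backfill(conn, batch_size=10, sleep=lambda seconds: None)  # skip real sleeps in the demo
old_rows = conn.execute("SELECT * FROM photos_old ORDER BY id").fetchall()
new_rows = conn.execute("SELECT * FROM photos_new ORDER BY id").fetchall()
print(old_rows == new_rows)  # True
```

Paging by `id > last_id` instead of OFFSET keeps each batch query cheap no matter how far the backfill has progressed, and the final comparison is a tiny version of the verify step.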

The backfill alone took 6 weeks per shard. They had three legacy shards. That's 18 shard-weeks of continuous data migration, running 24/7 in the background, while the app kept growing.

Krieger barely slept. Every morning, he'd check the backfill progress:

Shard 1: 4.2B / 6.1B rows (68%)
Shard 2: 3.8B / 5.9B rows (64%)
Shard 3: 5.1B / 6.3B rows (81%)

If anything went wrong — a replication lag spike, a disk failure, a config error — they'd have to start over.

And they were racing the clock. Every day, the old shards burned through millions more IDs.

The Switch

On October 18, 2012, at 3 AM Pacific, Instagram flipped the switch.

They changed a single mapping in their Django config:

PHOTO_SHARD_MAP = {
    'shard_001': 'photos_new',  # was 'photos_old'
    'shard_002': 'photos_new',
    'shard_003': 'photos_new',
}

They deployed. Watched the logs. Held their breath.

No errors. Latency stayed flat at 12ms P99. Write throughput: 42,000 inserts/second. Replication lag: 0.4 seconds.

It worked.

The old shards, now sitting at 2.04 billion IDs, were retired. The new shards, with 64-bit ID space, could handle 9.2 quintillion photos.

At Instagram's growth rate (5 million photos/day in 2012), that's about 1.8 trillion days of headroom: enough for roughly 5 billion years.

The Lessons That Scaled to a Billion

By 2013, Instagram had 150 million users and 16 billion photos. They'd scaled to 100+ Postgres shards, with plans to move to Cassandra for long-term storage.

But the lessons from the 32-bit crisis shaped everything that came after:

1. Future-proof your data types early. Use 64-bit IDs from day one. Use BIGINT, not INTEGER. The cost of migration later is 1000x higher than the cost of an extra 4 bytes per row.

2. Schema migrations at scale require zero-downtime strategies. Shadow tables, dual writes, and backfills are the standard way to change a production database with billions of rows without taking it offline.

3. Postgres can scale further than you think. Instagram ran primarily on Postgres through 2014 (4 years, 300M users). With sharding, read replicas, and connection pooling, a well-tuned Postgres cluster can handle 100K+ writes/second.

4. The fastest code is the code you don't run. Instagram's ID generation was a PL/pgSQL function, not a microservice. No network hop. No serialization. No external dependency. Just a stored procedure.

5. Monitor your future, not just your present. Krieger's 2 AM panic wasn't triggered by high latency or OOM errors. It was triggered by a graph showing ID exhaustion 13 weeks away. The best alerts predict disaster before it happens.

Today, Instagram serves 2 billion users and hosts over 100 billion photos. They've since moved much of their data to Cassandra and TAO (Facebook's graph store). But the core architecture — Postgres shards with 64-bit Snowflake IDs — still powers the system that onboards 500,000 new users per day.

And somewhere in Instagram's codebase, there's probably still a comment that says:

-- Never use INTEGER for IDs. Ever. Seriously. We learned this the hard way.

The Architecture That Almost Killed Instagram

The irony is that Instagram's near-death experience came not from load, not from traffic, not from a DDoS attack — but from a data type.

A single architectural decision made in April 2011, under pressure, with 5 million users breathing down their necks.

They chose INTEGER instead of BIGINT. 32 bits instead of 64. And it almost cost them everything.

In system design, the decisions you make at 5 million users echo at 500 million. The shortcuts you take under pressure become the migrations you dread at scale.

Instagram's ID crisis is a reminder: the most dangerous outages are the ones you schedule 18 months in advance.

Written by Swayam Mohanty