The Algorithm That Lets Two People Type in the Same Cell — And Why Google's 200ms Magic Nearly Broke Physics
You're typing in cell B4. So is your coworker. Neither of you crashes, overwrites, or loses data. That shouldn't be possible — but it is, thanks to a mathematical breakthrough from Xerox PARC and a war between two competing algorithms that power every collaborative doc on the internet.
It's 2:47 PM on a Tuesday. You're in a Google Sheet, typing a formula into cell B4. Halfway through, you notice the cursor. Another cursor. Your coworker Sarah, 3,000 miles away in San Francisco, is typing in the same exact cell.
You both hit Enter.
Neither of you crashes. Neither overwrites the other. The formula merges perfectly — her function name, your arguments, everything intact. You blink. Sarah sends a Slack: "Did you just see that?"
You did. And you have no idea how it worked.
Welcome to the most elegant, brain-bending problem in distributed systems: real-time collaborative editing. The magic that makes Google Docs, Sheets, Figma, and Notion feel like sorcery is powered by algorithms most engineers have never heard of — algorithms born in a Xerox PARC lab in 1995, refined through a decade-long war between two competing mathematical approaches, and implemented in production systems handling billions of keystrokes per day.
This is the story of Operational Transformation (OT) and Conflict-free Replicated Data Types (CRDTs) — two radically different solutions to the same impossible problem. One requires a central brain to coordinate every edit. The other promises a peer-to-peer utopia where servers are optional. Both have been deployed at Google, Figma, and inside the editors you use every day.
And both, in their own way, are miracles.
The Jupiter Problem: When Time Breaks Down
The year is 1995. The web is barely crawling. At Xerox PARC, the same lab that invented the GUI, Ethernet, and the laser printer, a team of researchers including David Nichols and Pavel Curtis is staring at a whiteboard covered in arrows.
They're building Project Jupiter, a collaboration system whose shared state has to stay in sync across slow, high-latency links. The problem seems simple: let two people edit the same document at the same time without destroying each other's work.
But distributed systems don't do "simple."
Imagine two users, Alice and Bob, editing the word "cat":
- Alice is at position 0, about to insert "s" to make "scat"
- Bob is at position 3, about to insert "s" to make "cats"
- Network latency is 200ms
Alice types "s". Her editor shows "scat". She sends the operation to the server: insert(0, 's').
Bob types "s". His editor shows "cats". He sends: insert(3, 's').
The server receives both operations. It applies them in order.
First, Alice's: "cat" → "scat" (insert at 0)
Then, Bob's: "scat" → "scast" (insert at position 3)
But wait. Bob's operation said position 3, which in his world meant "after cat". In the server's world, after Alice's operation, position 3 falls between the "a" and the "t" of "scat".
The result? "scast", which is neither what Alice saw ("scat") nor what Bob saw ("cats"), and not the "scats" that would honor both edits.
Positions don't work. Timestamps don't work. Last-write-wins destroys data.
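The failure is easy to reproduce. The snippet below is a minimal sketch that replays both inserts verbatim, the way a naive server would; `apply_insert` is a helper invented for this illustration, not part of any real editor.

```python
def apply_insert(text: str, pos: int, ch: str) -> str:
    """Return text with ch inserted so that it ends up at index pos."""
    return text[:pos] + ch + text[pos:]

doc = "cat"
alice_view = apply_insert(doc, 0, "s")  # what Alice sees locally
bob_view = apply_insert(doc, 3, "s")    # what Bob sees locally

# A naive server replays both operations unchanged, Alice's first.
server = apply_insert(doc, 0, "s")
server = apply_insert(server, 3, "s")   # position 3 now means something else

print(alice_view, bob_view, server)
```

Bob's index was computed against "cat", so replaying it against "scat" lands his character one slot too early.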
This is the OT puzzle: how do you transform an operation that was created in one document state so it makes sense in a different document state — after concurrent edits you didn't know about?
The Jupiter team's answer built on an idea first published by Ellis and Gibbs in 1989: Operational Transformation. Jupiter's client-server version of it would go on to power Google Wave, Google Docs, and eventually Google Sheets.
The Transform Function: Teaching Operations to Adapt
Operational Transformation works on a deceptively simple idea: don't replay operations as-is. Transform them first.
Every edit is an operation — insert, delete, retain. When two operations happen concurrently (meaning neither user knew about the other), you run them through a transform function that adjusts their positions and offsets so they compose correctly.
Here's the magic:
transform(insert(0, 's'), insert(3, 's')) =
(insert(0, 's'), insert(4, 's'))
The transform function sees Alice's insert at position 0 and realizes Bob's insert at position 3 needs to shift right by 1 to account for the new character. It returns two new operations:
- Alice's operation stays the same: insert(0, 's')
- Bob's operation shifts: insert(4, 's')
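For single-character inserts, the transform described above fits in a few lines. This is a sketch under assumptions: the `Insert` type and the tie-breaking rule (when both positions are equal, the first argument keeps its place) are choices made for this illustration, not something the article or any particular product specifies.

```python
from typing import NamedTuple, Tuple

class Insert(NamedTuple):
    pos: int
    ch: str

def transform(a: Insert, b: Insert) -> Tuple[Insert, Insert]:
    """Return (a', b').

    a' is `a` rewritten so it can be applied after `b`;
    b' is `b` rewritten so it can be applied after `a`.
    Assumed tie-break: on equal positions, `a` keeps its place and `b` shifts.
    """
    a_prime = Insert(a.pos + 1, a.ch) if b.pos < a.pos else a
    b_prime = Insert(b.pos + 1, b.ch) if a.pos <= b.pos else b
    return a_prime, b_prime

alice, bob = Insert(0, "s"), Insert(3, "s")
print(transform(alice, bob))
```

Real OT systems also need rules for delete-against-insert, delete-against-delete, and multi-character ranges, which is where most of the complexity comes from.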
Apply them both, and you get "scats": Alice's "s" at the front, Bob's "s" at the end, exactly what each of them intended.
But here's the trick: the server isn't the only one transforming. Alice's editor receives Bob's operation, transforms it against her own pending operation, and applies the transformed version. Bob does the same.
The result:
- Alice sees: "cat" → "scat" → "scats" (Bob's insert transformed to position 4)
- Bob sees: "cat" → "cats" → "scats" (Alice's insert transformed to position 0)
- Server sees: "cat" → "scat" → "scats" (applies both after transformation)
Eventual consistency achieved. Three independent machines, three different timelines, one final state.
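The convergence claim can be checked mechanically. The sketch below brute-forces every pair of insert positions in "cat" and confirms that Alice's timeline (her edit, then Bob's transformed) and Bob's timeline (his edit, then Alice's transformed) end in the same string. The `Insert` type, the helper names, and the tie-break rule are assumptions made for this illustration.

```python
from itertools import product
from typing import NamedTuple, Tuple

class Insert(NamedTuple):
    pos: int
    ch: str

def apply(text: str, op: Insert) -> str:
    return text[:op.pos] + op.ch + text[op.pos:]

def transform(a: Insert, b: Insert) -> Tuple[Insert, Insert]:
    # Assumed tie-break: on equal positions, a keeps its place and b shifts.
    a_prime = Insert(a.pos + 1, a.ch) if b.pos < a.pos else a
    b_prime = Insert(b.pos + 1, b.ch) if a.pos <= b.pos else b
    return a_prime, b_prime

def converges(doc: str, a: Insert, b: Insert) -> bool:
    a_prime, b_prime = transform(a, b)
    alice_timeline = apply(apply(doc, a), b_prime)  # own edit first, then Bob's
    bob_timeline = apply(apply(doc, b), a_prime)    # own edit first, then Alice's
    return alice_timeline == bob_timeline

doc = "cat"
all_ok = all(
    converges(doc, Insert(i, "A"), Insert(j, "B"))
    for i, j in product(range(len(doc) + 1), repeat=2)
)
print(all_ok)
```

This only exercises two concurrent single-character inserts. It says nothing about deletes or about three or more concurrent editors, which is where OT proofs get hard.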
This is OT. And it's wildly complex.
The Server That Holds the Universe Together
Here's the catch: OT only works if there's a central server that decides the canonical order of operations.
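Why? Because transformations depend on order. With a server choosing one order, the transform functions only have to guarantee that two concurrent operations converge whichever one is applied first (the property known as TP1). Without one, every peer may apply operations in a different order, and the transform functions must also satisfy a much stricter property (TP2). Several published peer-to-peer OT algorithms were later shown to violate it, and when that happens the document diverges.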
So Google Docs does this:
- Every client sends operations to a central server
- The server sequences them in a total order (in practice, the order in which they arrive, recorded as a revision number)
- The server broadcasts the ordered stream to all clients
- Each client transforms incoming operations against its own pending (not-yet-acknowledged) ops
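The four steps above can be sketched as a toy server and client. Everything here is hypothetical: the `Server` class, the `base_revision` parameter, and the two rebase helpers are invented for illustration and handle single-character inserts only. It is not Google's protocol.

```python
from typing import List, NamedTuple, Tuple

class Insert(NamedTuple):
    pos: int
    ch: str

def apply(text: str, op: Insert) -> str:
    return text[:op.pos] + op.ch + text[op.pos:]

def rebase(op: Insert, earlier: Insert) -> Insert:
    """Server side: rewrite op to run after `earlier`, which was ordered first.
    On a position tie the earlier op keeps its spot."""
    return Insert(op.pos + 1, op.ch) if earlier.pos <= op.pos else op

def rebase_incoming(incoming: Insert, pending: Insert) -> Insert:
    """Client side: rewrite a server op so it applies on top of a local edit
    the server has not acknowledged yet (the server op wins ties)."""
    return Insert(incoming.pos + 1, incoming.ch) if pending.pos < incoming.pos else incoming

class Server:
    def __init__(self, text: str) -> None:
        self.text = text
        self.log: List[Insert] = []  # log[n] took the document from revision n to n + 1

    def receive(self, op: Insert, base_revision: int) -> Tuple[Insert, int]:
        # Transform against everything sequenced since the client's revision.
        for earlier in self.log[base_revision:]:
            op = rebase(op, earlier)
        self.text = apply(self.text, op)
        self.log.append(op)
        return op, len(self.log)  # broadcast the op with its new revision

server = Server("cat")
alice_op, _ = server.receive(Insert(0, "s"), base_revision=0)
bob_op, revision = server.receive(Insert(3, "s"), base_revision=0)

# Bob already shows "cats"; Alice's op arrives while his own is still pending.
bob_text = apply("cats", rebase_incoming(alice_op, pending=Insert(3, "s")))
print(server.text, bob_op, revision, bob_text)
```

The revision counter is the only clock the system needs: whichever operation the server appends first is, by definition, the one that happened first.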
The server is the source of truth. It decides what happened first. Without it, OT collapses.
This is why Google Docs requires a server. You can't edit offline and merge later (well, you can, but it's a nightmare). You can't go peer-to-peer. The server is the god that prevents the universe from forking.
In 2009, Google announced Google Wave, an ambitious real-time collaboration platform powered by OT. The protocol spec was 87 pages long. The reference implementation had bugs that took years to find. One engineer described it as "the most correct code I've ever written that also broke constantly."
Google halted Wave's development in 2010 and shut the service down in 2012.
But OT survived — refined, debugged, and deployed in Google Docs and Sheets, where it handles 1.5 billion operations per day.
The Rebel Math: What If You Didn't Need a Server?
While Google was wrestling with OT, a different group of researchers was asking a heretical question:
What if you could guarantee consistency without a central server at all?
Enter CRDTs — Conflict-free Replicated Data Types.
The idea is deceptively elegant: instead of transforming operations, design your data structure so that all operations commute. Make it mathematically impossible to have a conflict.
Example: a Last-Write-Wins Register (LWW-Register).
- Every write is tagged with a timestamp (or a unique ID)
- When two writes conflict, the one with the higher timestamp wins
- No transformation needed. No central server. Just merge, and you're guaranteed to converge.
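A Last-Write-Wins register is small enough to write out in full. In this sketch the `LWWRegister` class and the `(timestamp, replica_id)` stamp layout are assumptions for illustration; the replica ID exists only to break ties between equal timestamps.

```python
from dataclasses import dataclass
from typing import Any, Tuple

@dataclass(frozen=True)
class LWWRegister:
    value: Any
    stamp: Tuple[int, str]  # (timestamp, replica_id); tuples compare left to right

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # Keep whichever write carries the larger stamp.
        return self if self.stamp >= other.stamp else other

alice = LWWRegister("Q3 Revenue", (10, "alice"))
sarah = LWWRegister("Sales Data", (12, "sarah"))

# Merge order does not matter, and merging twice changes nothing.
print(alice.merge(sarah) == sarah.merge(alice))
print(alice.merge(sarah).value)
```

Both replicas converge, but only because one write is thrown away. That is acceptable for a single value and unacceptable for a paragraph of text.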
But LWW-Registers are too simple. They only store one value. What about text? Text is a sequence — you need to track insertions, deletions, concurrent edits at different positions.
This is where sequence CRDTs come in.
The YATA Algorithm: A Text Editor Without a Server
In 2016, a developer named Kevin Jahns built Yjs, a CRDT library for real-time collaboration. At its core is an algorithm called YATA (Yet Another Transformation Approach — a cheeky nod to OT).
Here's how it works:
Every character in the document gets a unique identifier — a combination of:
- Client ID (who inserted it)
- Logical clock (when it was inserted, in that client's timeline)
- Position metadata (what comes before/after)
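When Alice inserts "s" at position 0 in "cat", she creates an item like (Alice, 1, originLeft: null), because nothing sits to its left.
When Bob inserts "s" at position 3 in "cat", he creates (Bob, 1, originLeft: ID of 't'). The reference is to the unique ID of the "t", not to the number 3, so it stays valid no matter what gets inserted elsewhere.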
When they sync, each client merges the two lists of characters. The CRDT uses the IDs to figure out the intended position — even though the operations happened concurrently.
The result: "scats" — same as OT.
But here's the magic: no server needed. Alice and Bob can sync peer-to-peer. They can edit offline, reconnect days later, and merge. The CRDT guarantees they'll converge to the same state.
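To make the merge concrete, here is a deliberately simplified sequence CRDT. It is not Yjs's actual YATA integration, which also tracks a right origin and has more involved conflict rules. This sketch uses the simpler RGA-style rule: insert after the left origin, then skip past any neighbors whose ID is larger. The `Item` and `SequenceCRDT` names are invented, and IDs are assumed to be `(lamport_clock, client_id)` pairs delivered in causal order.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Id = Tuple[int, str]  # (lamport_clock, client_id), compared left to right

@dataclass(frozen=True)
class Item:
    id: Id
    char: str
    origin_left: Optional[Id]  # ID of the item this was typed after; None = start

class SequenceCRDT:
    def __init__(self) -> None:
        self.items: List[Item] = []

    def _index_of(self, item_id: Id) -> int:
        for i, existing in enumerate(self.items):
            if existing.id == item_id:
                return i
        raise KeyError(f"origin {item_id} not delivered yet")

    def integrate(self, item: Item) -> None:
        if any(existing.id == item.id for existing in self.items):
            return  # duplicate delivery is harmless
        index = 0 if item.origin_left is None else self._index_of(item.origin_left) + 1
        # Concurrent items with a larger ID stay closer to the shared origin.
        while index < len(self.items) and self.items[index].id > item.id:
            index += 1
        self.items.insert(index, item)

    def text(self) -> str:
        return "".join(item.char for item in self.items)

def replica(ops: List[Item]) -> SequenceCRDT:
    doc = SequenceCRDT()
    for op in ops:
        doc.integrate(op)
    return doc

base = [
    Item((1, "init"), "c", None),
    Item((2, "init"), "a", (1, "init")),
    Item((3, "init"), "t", (2, "init")),
]
alice_s = Item((4, "alice"), "s", None)       # typed at the very start
bob_s = Item((4, "bob"), "s", (3, "init"))    # typed right after the "t"

alice_replica = replica(base + [alice_s, bob_s])
bob_replica = replica(base + [bob_s, alice_s])
print(alice_replica.text(), bob_replica.text())
```

Neither replica consulted a server, and neither operation was rewritten: the ordering falls out of the IDs alone.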
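Yjs uses this. Automerge, another popular CRDT library, uses a different sequence algorithm built on the same idea of uniquely identified characters. Figma's multiplayer engine borrows from CRDTs too, although Figma's engineers describe it as CRDT-inspired and keep a central server in the loop. Notion's new multiplayer backend uses a variant of this.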
The Trade-Off Nobody Wants to Make
| Feature | OT | CRDT |
|---|---|---|
| Requires server | Yes | No |
| Offline editing | Hard | Easy |
| Undo/redo | Easy | Hard |
| Intent preservation | High | Medium |
| Remote-edit latency (illustrative) | ~200ms (round trip through the server) | ~50ms (direct peer-to-peer link) |
| Correctness guarantees | Complex (TP1, TP2 properties) | Strong (mathematical proof) |
| Memory overhead | Low | High (every character has metadata) |
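Google picked OT because intent preservation mattered more than offline support. When you delete a word, you want exactly that word gone, even if concurrent edits have shifted it to a different position.
Figma took CRDT ideas but kept the server. Its documents are trees of objects with properties, and when two people change the same property at once, the last write to reach the server wins. In a design tool that is usually the right answer, and it is far simpler than full OT.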
Notion picked a hybrid: CRDTs for blocks, OT for text within blocks.
The Cursor Problem: Showing 47 People in One Document
Here's a problem neither OT nor CRDTs solve: presence.
You're in a Google Sheet with 47 collaborators. You see 47 cursors, each with a name and color. They move in real time. They're accurate to the character.
How?
Google Sheets sends cursor positions as absolute indices ("cursor at position 423") over a separate WebSocket channel. But wait — positions change when people type! If Alice inserts a character, every cursor after her needs to shift right.
So Google does this:
- The server tracks cursor positions relative to the current document state
- When an operation arrives, the server transforms all cursor positions using the same OT transform functions
- Broadcasts the updated positions
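Shifting a cursor is the same arithmetic as shifting an operation. The functions below are a sketch, not Google's implementation; their names are invented here, and the rule that an insert exactly at the cursor pushes it right is a policy choice that real editors make differently.

```python
def cursor_after_insert(cursor: int, insert_pos: int, length: int = 1) -> int:
    """New index of a remote cursor after someone inserts `length` characters."""
    # Assumed policy: an insert at or before the cursor pushes it right.
    return cursor + length if insert_pos <= cursor else cursor

def cursor_after_delete(cursor: int, delete_pos: int, length: int) -> int:
    """New index of a remote cursor after `length` characters are deleted."""
    if cursor <= delete_pos:
        return cursor
    # A cursor inside the deleted range collapses to the start of the range.
    return max(delete_pos, cursor - length)

cursors = {"sarah": 423, "bob": 10}
moved = {name: cursor_after_insert(pos, insert_pos=100) for name, pos in cursors.items()}
print(moved)
```

Every cursor positioned after the edit has to be recomputed, which is why the cost grows with the number of collaborators.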
In a doc with 1,000 users, every keystroke triggers 1,000 cursor transformations.
CRDT-based editors can do it differently: anchor each cursor to the unique ID of a neighboring character rather than to a numeric index (Yjs calls these relative positions). When someone types, other cursors don't need to be recomputed, because the ID they point at hasn't changed.
The trade-off is that each client has to resolve that ID back to an index before it can draw the cursor.
The Undo Paradox: Erasing What Someone Else Wrote
You type "Hello". Bob types "World" in the same cell. You press Undo.
What should happen?
- Should it delete your "Hello"? (But that leaves "World" — which wasn't what you saw)
- Should it delete everything? (But Bob didn't undo — why punish him?)
- Should it delete "Hello" and shift "World" left? (That's what OT does)
Google Sheets uses per-user undo stacks. Your Undo only affects your operations. It doesn't rewind the whole document — it removes your contributions and re-transforms everything else.
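A per-user undo can be sketched as "invert my operation, then transform the inverse past everything that happened since". The code below assumes simple `Insert` and `Delete` range operations and a `rebase_delete` helper, all invented for this illustration. It deliberately leaves out the case where someone typed inside the text being undone.

```python
from typing import NamedTuple, Union

class Insert(NamedTuple):
    pos: int
    text: str

class Delete(NamedTuple):
    pos: int
    length: int

def apply(doc: str, op: Union[Insert, Delete]) -> str:
    if isinstance(op, Insert):
        return doc[:op.pos] + op.text + doc[op.pos:]
    return doc[:op.pos] + doc[op.pos + op.length:]

def invert(op: Insert) -> Delete:
    """The undo of an insert is a delete of the same range."""
    return Delete(op.pos, len(op.text))

def rebase_delete(undo: Delete, later: Insert) -> Delete:
    """Shift the undo past an insert that was applied after the original edit."""
    if later.pos <= undo.pos:
        return Delete(undo.pos + len(later.text), undo.length)
    if later.pos >= undo.pos + undo.length:
        return undo
    # A full OT implementation would split the delete around the new text.
    raise NotImplementedError("insert landed inside the text being undone")

doc = ""
mine = Insert(0, "Hello")
doc = apply(doc, mine)           # "Hello"
bobs = Insert(0, "World ")
doc = apply(doc, bobs)           # "World Hello"

undo = rebase_delete(invert(mine), bobs)
doc = apply(doc, undo)
print(undo, repr(doc))
```

Only the undoing user's characters disappear. Bob's text survives, shifted into whatever position the remaining document gives it.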
CRDTs struggle with this. Deleting a character, whether through Undo or the Backspace key, usually means tombstoning it: marking it deleted but keeping it in the data structure so that concurrent edits which reference it still have an anchor. Over time, your document accumulates invisible tombstones, potentially millions of them.
Yjs solves this with garbage collection — periodically compacting the CRDT and pruning tombstones. But it's tricky. Do it wrong, and you violate the convergence guarantee.
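The tombstone bookkeeping is simple to picture in code. This is a toy model with invented names (`TombstoneText`, `Item`), not Yjs's internal representation, and it leaves garbage collection out because doing that safely depends on knowing what every replica has already seen.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Item:
    id: Tuple[str, int]   # (client_id, clock)
    char: str
    deleted: bool = False

class TombstoneText:
    def __init__(self, text: str, client: str = "init") -> None:
        self.items: List[Item] = [Item((client, i), ch) for i, ch in enumerate(text)]

    def delete(self, item_id: Tuple[str, int]) -> None:
        # Mark, never remove: later edits may still name this item as their origin.
        for item in self.items:
            if item.id == item_id:
                item.deleted = True

    def text(self) -> str:
        return "".join(item.char for item in self.items if not item.deleted)

    def tombstones(self) -> int:
        return sum(item.deleted for item in self.items)

doc = TombstoneText("Hello")
for clock in range(5):
    doc.delete(("init", clock))
print(repr(doc.text()), len(doc.items), doc.tombstones())
```

The visible document is empty, yet all five items are still stored, which is the memory overhead the comparison table refers to.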
The 200ms That Changed Everything
In 2006, Google released Writely — a startup acquisition that became Google Docs. The first version didn't have real-time collaboration. You had to click "Save" and refresh.
In 2010, they added OT. The experience was surreal. You'd type a letter, and 200ms later, someone else's letter would appear in the same sentence. It felt like magic.
But it also felt slow. That 200ms round-trip to the server — the time for your operation to go up, get sequenced, and come back down — was noticeable.
Today, local-first tools built on CRDT libraries such as Yjs and Automerge apply every edit on your own machine first and sync in the background, peer-to-peer if they like. Offline editing comes almost for free.
Google Sheets still uses OT. Because 200ms of latency is acceptable when the alternative is losing your data.
The Legacy: Every Keystroke Is a Miracle
Today, real-time collaboration is everywhere. Google Docs. Notion. Figma. Miro. Coda. Superhuman. Linear.
Every single one of these systems is powered by either OT or CRDTs — or a hybrid of both.
Google's Realtime API (the public OT library) was deprecated in 2017. ShareDB (an open-source OT library) is maintained by one developer. Yjs (the leading CRDT library) is maintained by one developer.
Billions of people type in collaborative docs every day. Most have no idea they're interacting with 30-year-old research from Xerox PARC, refined through a brutal engineering war between two competing mathematical philosophies.
You're typing in cell B4. So is Sarah.
Neither of you will crash.
And somewhere, in a Google data center in Iowa, a server is transforming your operations at 1.5 billion per day — making sure that when you both hit Enter, the formula merges perfectly.
It's not magic.
It's math.
And it's beautiful.