The Cursor That Shouldn't Work: How Google Sheets Lets Two People Type in the Same Cell Without Losing a Single Keystroke
๐Ÿ—๏ธSystem DesignApril 20, 2026 at 8:29 AMยท11 min read

You're editing cell B4. Your colleague is editing cell B4. You both hit 'enter' at the exact same millisecond. Neither of you loses a character. How is that even possible?

The Impossible Moment

It's 2:47 PM on a Thursday. You're in a Google Sheet with your team, finalizing Q4 projections. You click into cell B4 and start typing "Revenue: $1.2M". At that exact moment (literally the same millisecond), your colleague in London clicks into the same cell and types "Revenue: $1.5M".

You both hit enter.

No error message. No "conflict detected" dialog. No lost work. The cell now reads "Revenue: $1.5M", your colleague's estimate, which the server happened to order last. Your cursor moves down to B5. Their cursor moves to C4. Both of you keep working.

You just witnessed something that should be impossible.

Two people edited the same piece of data at the same time, halfway around the world from each other, and the system knew exactly what to do. No locks. No overwrites. No data loss. Just... magic.

Except it's not magic. It's one of the most elegant algorithmic achievements in distributed systems: more than three decades of work, stretching from early groupware research at labs like Xerox PARC to the collaborative infrastructure powering billions of keystrokes across Google Docs, Sheets, Figma, Notion, and nearly every multiplayer editor you've ever used.

This is the story of how we solved real-time collaboration. And it starts with a problem that stumped computer scientists for decades.

The Transform Problem

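Rewind to 1989. Clarence Ellis and Simon Gibbs publish the first operational transformation algorithm for groupware. A few years later, Xerox PARC (the lab that pioneered the modern GUI and invented Ethernet) builds on the idea in something called the Jupiter collaboration system, described in a 1995 paper. The goal: let two people edit the same document simultaneously over a network.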

The problem seems simple at first. When User A types a character, send it to User B. When User B types a character, send it to User A. Done, right?

Wrong.

Imagine a document that says "cat". User A (position 0) types "s" to make "scat". User B (position 3) types "s" to make "cats". Both operations happen at the same time, before either user has seen the other's change.

User A's operation: Insert('s', position=0)

User B's operation: Insert('s', position=3)

Now both operations arrive at the server. Suppose it just applies them in arrival order:

  1. Apply A's op: "cat" → "scat"
  2. Apply B's op: "scat" → "scast" (position 3 now falls between 'a' and 't')

Already wrong: B meant to put the "s" at the end of the word, not in the middle of it.

And here's the nightmare: the clients don't even agree with each other. User A already applied their own operation locally (for instant feedback). Now they receive B's operation: Insert('s', position=3). But their document already says "scat". Position 3 in "scat" is between 'a' and 't'. Apply it there and you get "scast".

Meanwhile, User B applied their operation first: "cat" → "cats". Now they receive A's operation: Insert('s', position=0). Apply it to "cats" and you get "scats".

One user sees "scast", the other sees "scats". The document is now forked. The system is broken. Add delete operations to the mix and the ways to diverge multiply.

This is the operational transformation problem. And it nearly killed the dream of real-time collaboration.
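The "cat" example is easy to replay in code. The sketch below is only an illustration: `Insert` and `apply` are hypothetical helper names, and each replica applies its own operation first and then the other user's operation exactly as it was sent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    char: str
    pos: int


def apply(doc: str, op: Insert) -> str:
    """Insert op.char so that it ends up at index op.pos."""
    return doc[:op.pos] + op.char + doc[op.pos:]


op_a = Insert("s", 0)  # User A: "cat" -> "scat"
op_b = Insert("s", 3)  # User B: "cat" -> "cats"

# Each replica applies its own op first, then the remote op unchanged.
replica_a = apply(apply("cat", op_a), op_b)
replica_b = apply(apply("cat", op_b), op_a)

print(replica_a)  # scast
print(replica_b)  # scats
```

The two replicas end up with different strings, because B's position 3 was computed against a document that User A no longer has.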

The Jupiter Solution

The PARC researchers (David Nichols, Pavel Curtis, Michael Dixon, and John Lamping) had an insight that changed everything. The problem wasn't the operations themselves. It was that operations were written for one document state but applied to another.

Their solution: transform operations based on concurrent changes.

When User B's Insert('s', position=3) arrives at User A (who already inserted at position 0), don't just blindly apply it. Transform it first. The original operation was at position 3 in the old document. But User A inserted a character before position 3. So transform the position: 3 + 1 = 4.

Transformed operation: Insert('s', position=4)

Apply to "scat": "scats"

User B does the opposite transform. A's Insert('s', position=0) doesn't need adjustment because it's before B's edit.

Both users converge on "scats".
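Here is the same exchange with a transform step added, as a minimal sketch rather than Jupiter's actual code. The `site` field and the tie-break rule are assumptions introduced for illustration; only single-character inserts are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    char: str
    pos: int
    site: str  # hypothetical replica name, used only to break ties


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.char + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` has been applied."""
    lands_first = (against.pos, against.site) < (op.pos, op.site)
    return Insert(op.char, op.pos + 1, op.site) if lands_first else op


op_a = Insert("s", 0, "A")
op_b = Insert("s", 3, "B")

# Each replica applies its own op, then the remote op after transforming it.
replica_a = apply(apply("cat", op_a), transform(op_b, op_a))
replica_b = apply(apply("cat", op_b), transform(op_a, op_b))

print(transform(op_b, op_a).pos)  # 4
print(replica_a, replica_b)       # scats scats
```

B's insert is shifted from 3 to 4 on A's replica, A's insert is left alone on B's replica, and both documents read "scats".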

This is Operational Transformation (OT). And it's the algorithm that powers Google Docs and Sheets to this day.

The Transform Function

The heart of OT is the transform function. It takes two concurrent operations and produces two new operations that achieve the same intended effect, even when applied in different orders.

transform(op1, op2) → (op1', op2')

For insert operations:

  • If op1 inserts before op2's position, increase op2's position by 1
  • If op2 inserts before op1's position, increase op1's position by 1
  • If both insert at the same position, use a tiebreaker (e.g., user ID)

For delete vs. delete:

  • If op1 deletes before op2's position, shift op2 left by 1
  • If op1 deletes after op2's position, leave op2 unchanged
  • If both delete the same character, each becomes a no-op when transformed against the other

For insert vs. delete:

  • If the delete happens before the insert position, shift the insert left by 1
  • If the insert happens at or before the delete position, shift the delete right by 1
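The rules above fit in one small function for single-character operations. This is a sketch under simplifying assumptions, not any production editor's transform: `Insert`, `Delete`, `apply`, and the `site` tie-breaker are illustrative names, and `None` stands for a no-op.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    char: str
    pos: int
    site: str  # hypothetical replica name, used only to break ties


@dataclass(frozen=True)
class Delete:
    pos: int


def apply(doc, op):
    if op is None:  # no-op
        return doc
    if isinstance(op, Insert):
        return doc[:op.pos] + op.char + doc[op.pos:]
    return doc[:op.pos] + doc[op.pos + 1:]


def transform(op, against):
    """Rewrite `op` so it applies after `against`; None means no-op."""
    if isinstance(op, Insert) and isinstance(against, Insert):
        first = (against.pos, against.site) < (op.pos, op.site)
        return Insert(op.char, op.pos + 1, op.site) if first else op
    if isinstance(op, Insert):       # insert vs. an earlier delete
        return Insert(op.char, op.pos - 1, op.site) if against.pos < op.pos else op
    if isinstance(against, Insert):  # delete vs. an earlier insert
        return Delete(op.pos + 1) if against.pos <= op.pos else op
    if against.pos == op.pos:        # both deleted the same character
        return None
    return Delete(op.pos - 1) if against.pos < op.pos else op


# User A deletes the "c" while User B appends an "s".
op_a, op_b = Delete(0), Insert("s", 3, "B")
replica_a = apply(apply("cat", op_a), transform(op_b, op_a))
replica_b = apply(apply("cat", op_b), transform(op_a, op_b))
print(replica_a, replica_b)  # ats ats
```

Even at this size there are four cases to get right. Ranges, formatting, and moves multiply that quickly.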

Seems straightforward? Here's the catch: you need transform functions for every possible operation type (insert, delete, format, move, resize) crossed with every other type. For a rich editor like Google Docs, that's hundreds of transform functions.

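And every single one must satisfy strict mathematical properties. Here transform(a, b) means just "a, rewritten so it can be applied after b":

TP1 (Convergence): op1 · transform(op2, op1) = op2 · transform(op1, op2)

TP2 (Order independence): transform(transform(op1, op2), transform(op3, op2)) = transform(transform(op1, op3), transform(op2, op3))

TP1 says two concurrent operations produce the same document in either order. TP2 says that transforming an operation past two other concurrent operations gives the same result whichever of them goes first. TP2 only matters when no central server fixes a single order, and it is notoriously hard to satisfy.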

Get one transform function wrong, and two users editing a document can diverge. Permanently. No error message. Just silent data corruption.
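One way to catch a bad transform early is to brute-force TP1 over every pair of operations on a small document. The sketch below does that for a hypothetical single-character transform (all names are illustrative). It checks TP1 only, and only for this toy operation set.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Insert:
    char: str
    pos: int
    site: str  # hypothetical replica name, used only to break ties


@dataclass(frozen=True)
class Delete:
    pos: int


def apply(doc, op):
    if op is None:  # no-op
        return doc
    if isinstance(op, Insert):
        return doc[:op.pos] + op.char + doc[op.pos:]
    return doc[:op.pos] + doc[op.pos + 1:]


def transform(op, against):
    """Rewrite `op` so it applies after `against`; None means no-op."""
    if isinstance(op, Insert) and isinstance(against, Insert):
        first = (against.pos, against.site) < (op.pos, op.site)
        return Insert(op.char, op.pos + 1, op.site) if first else op
    if isinstance(op, Insert):
        return Insert(op.char, op.pos - 1, op.site) if against.pos < op.pos else op
    if isinstance(against, Insert):
        return Delete(op.pos + 1) if against.pos <= op.pos else op
    if against.pos == op.pos:
        return None
    return Delete(op.pos - 1) if against.pos < op.pos else op


def all_ops(doc, site):
    """Every single-character insert and delete that is valid on `doc`."""
    inserts = [Insert(site.lower(), p, site) for p in range(len(doc) + 1)]
    deletes = [Delete(p) for p in range(len(doc))]
    return inserts + deletes


def tp1_holds(doc):
    """Check op_a . T(op_b, op_a) == op_b . T(op_a, op_b) for all pairs."""
    for a, b in product(all_ops(doc, "A"), all_ops(doc, "B")):
        via_a = apply(apply(doc, a), transform(b, a))
        via_b = apply(apply(doc, b), transform(a, b))
        if via_a != via_b:
            return False
    return True


print(tp1_holds("cat"))  # True
```

Flip one comparison in `transform` (for example `<=` to `<` in the delete-after-insert case) and the check fails, which is the kind of silent divergence described above.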

Google's infrastructure team reportedly spent years debugging edge cases in their OT implementation. Undo/redo in a collaborative context? Every undo operation must be transformed against all intervening operations from other users. Intent preservation? If you delete "the cat" and someone else concurrently inserts "big " before "cat", should your delete remove "the big cat" (position-based) or just "the cat" (intent-based)?

OT is powerful. But it's also a minefield.

The Central Server Requirement

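Here's OT's dirty secret: as deployed in practice, it leans on a central server to establish operation order. (Fully decentralized OT exists on paper, but it depends on TP2, and that is exactly where many published algorithms have turned out to be broken.)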

In the Jupiter system, all operations go through a central server. The server maintains a global operation order. When concurrent ops arrive, the server transforms them into that order and broadcasts the transformed ops to all clients.

Clients maintain their own transform queues, but the server is the source of truth.
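A server of this kind can be sketched in a few lines. This is a simplified illustration, not the Jupiter protocol itself. It assumes insert-only operations, that each client has at most one unacknowledged operation in flight, and that the client reports the revision its operation was based on. `Server`, `receive`, and `base_rev` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    char: str
    pos: int
    site: str  # hypothetical replica name, used only to break ties


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.char + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against`."""
    first = (against.pos, against.site) < (op.pos, op.site)
    return Insert(op.char, op.pos + 1, op.site) if first else op


class Server:
    def __init__(self, doc: str = ""):
        self.doc = doc
        self.log = []  # ops in canonical order; revision = len(log)

    def receive(self, op: Insert, base_rev: int):
        """Transform `op` past everything the sender had not yet seen."""
        for applied in self.log[base_rev:]:
            op = transform(op, applied)
        self.doc = apply(self.doc, op)
        self.log.append(op)
        return op, len(self.log)  # broadcast the transformed op


server = Server("cat")
# Both clients edited revision 0 of the document.
server.receive(Insert("s", 0, "A"), base_rev=0)
broadcast, rev = server.receive(Insert("s", 3, "B"), base_rev=0)

print(broadcast.pos, rev)  # 4 2
print(server.doc)          # scats
```

The log is the canonical order: whichever operation the server receives first wins the earlier slot, and every later operation is rewritten to fit after it.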

This works brilliantly for Google Sheets. Low latency to Google's data centers. Reliable websocket connections. Central authority to resolve conflicts.

But what if you want offline editing? What if you want peer-to-peer collaboration without a server? What if you want local-first software that works when the internet dies?

Server-based OT struggles with all of that, because every operation has to pass through the server to get its place in the canonical order.

Enter CRDTs.

The CRDT Revolution

In 2011, Marc Shapiro and his team at INRIA published a paper on Conflict-free Replicated Data Types. The idea: design data structures that guarantee convergence without any central coordination.

No transform functions. No server. No operation ordering. Just pure math.

A CRDT is a data structure where:

  1. Every replica can be updated independently
  2. Replicas can be merged without conflicts
  3. All replicas eventually converge to the same state

The simplest example: a G-Counter (grow-only counter). Each user has their own counter. To increment, you increment your own counter. To get the total, sum all counters. No matter what order updates arrive, everyone converges on the same total.
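A G-Counter is small enough to write out in full. The sketch below uses hypothetical names (`GCounter`, `replica_id`). The one detail the prose skips is the merge rule: take the per-replica maximum, so receiving the same update twice changes nothing.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments made by that replica

    def increment(self, n: int = 1) -> None:
        # A replica only ever touches its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Max is commutative, associative, and idempotent, so merge order
        # and duplicate deliveries do not matter.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)


a, b = GCounter("A"), GCounter("B")
a.increment(2)
b.increment(3)

a.merge(b)
b.merge(a)
a.merge(b)  # a repeated merge is harmless

print(a.value(), b.value())  # 5 5
```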

For text editing, CRDTs use a more sophisticated structure: sequence CRDTs.

The Figma Breakthrough

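The best-known CRDT-inspired system is probably Figma's multiplayer engine, which launched in 2016 and was later described in detail by co-founder Evan Wallace.

Figma faced a problem: designers collaborating on the same canvas, with complex nested layers, transforms, and styling. OT's transform functions would be a nightmare for this data model.

Figma's own write-up is careful to say the result is not a pure CRDT, because a central server still has the final word on ordering. The design is usually explained in terms of familiar CRDT building blocks: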

  • LWW-Register (Last-Write-Wins) for properties like color, opacity, font size
  • 2P-Set (two-phase set) for layer lists
  • Sequence CRDT for text within text layers
  • Version vectors for causality tracking

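The magic: a textbook LWW-Register uses a timestamp plus a replica ID as a tiebreaker. If two users set a rectangle's color with the same timestamp, the one with the higher replica ID wins. Arbitrary, but deterministic. Everyone converges. (With a central server in the loop, as in Figma's case, "last" can simply mean whichever write the server received last.)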
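As a minimal sketch of the textbook version (not Figma's code), a last-write-wins register only needs to compare `(timestamp, replica_id)` pairs. The class and field names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LWWRegister:
    value: object = None
    stamp: tuple = (0, "")  # (timestamp, replica_id); replica_id breaks ties

    def set(self, value, timestamp: int, replica_id: str) -> None:
        self.merge(LWWRegister(value, (timestamp, replica_id)))

    def merge(self, other: "LWWRegister") -> None:
        # Tuples compare element by element: timestamp first, then replica ID.
        if other.stamp > self.stamp:
            self.value, self.stamp = other.value, other.stamp


a, b = LWWRegister(), LWWRegister()
a.set("red", timestamp=10, replica_id="A")
b.set("blue", timestamp=10, replica_id="B")  # same timestamp, concurrent

a.merge(b)
b.merge(a)

print(a.value, b.value)  # blue blue
```

Replica "B" wins only because "B" sorts after "A". The loser's write is discarded, which is acceptable for a color but not for a paragraph of text.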

For layer ordering, Figma uses fractional indexing. Instead of positions 0, 1, 2, layers have positions like 0.5, 1.0, 1.5. Insert between two layers? Generate a fraction between their positions (e.g., 0.75). Run out of precision? Rebalance the tree. This enables conflict-free insertion without coordination.
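The midpoint idea can be sketched with exact rationals. This is an illustration only: `key_between` and the layer names are made up, and Python's `Fraction` is used here to sidestep precision limits, which is not necessarily how any real product stores its keys.

```python
from fractions import Fraction


def key_between(lo: Fraction, hi: Fraction) -> Fraction:
    """Return a sort key strictly between two existing keys."""
    return (lo + hi) / 2


layers = {"background": Fraction(1), "logo": Fraction(2)}

# Insert between existing layers without renumbering anything.
layers["shadow"] = key_between(layers["background"], layers["logo"])
layers["glow"] = key_between(layers["background"], layers["shadow"])

order = sorted(layers, key=layers.get)
print(layers["shadow"], layers["glow"])  # 3/2 5/4
print(order)  # ['background', 'glow', 'shadow', 'logo']
```

Two users who insert into the same gap at the same time will compute the same key, so a real system still needs a deterministic tiebreaker (for example a replica ID) on top of this.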

The result: Figma's multiplayer feels instant because it is. No server round-trip for transforms. Just local updates, CRDT merges, and eventual consistency.

The Sequence CRDT Problem

Text is harder. You can't just use LWW-Registers for each character because order matters. Delete character 5, and what is "character 5" shifts.

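Sequence CRDTs assign each character a unique, immutable ID and define order in terms of those IDs rather than indices. Some designs (Logoot, LSEQ) generate a position ID that sorts between the IDs of the two neighbors, much like fractional indexing. Others (RGA, YATA) record which existing character the new one was inserted after.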

The most popular implementations:

RGA (Replicated Growable Array): Each character has a unique timestamp + replica ID. Deletions are tombstones. To insert after character X, create a new character with a timestamp greater than X. Concurrent insertions are ordered by timestamp + replica ID.

YATA (Yet Another Transformation Approach): The algorithm behind Yjs, the most popular CRDT library for JavaScript. Each character has an ID based on (client ID, clock). Insertions include both a left origin (the character you're inserting after) and a right origin (the character you're inserting before, if any). This enables better intent preservation than RGA.

Automerge: Represents text as a tree of operations. Each insertion creates a new node. Concurrent insertions are ordered using a deterministic merge algorithm.

All guarantee eventual consistency. All are more complex than OT. All have a hidden cost: metadata.

Every character in a CRDT-based editor carries an ID (typically 10-20 bytes). For a 100KB text document, you might have 1MB of metadata. Figma deals with this by garbage-collecting metadata in large documents. Yjs uses aggressive compression. Automerge has explored columnar storage.
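To make the RGA description concrete, here is a deliberately small sketch. It is not the code of Yjs, Automerge, or any other library. It assumes operations arrive in causal order (a character's parent is always delivered first), and every name in it is hypothetical.

```python
from dataclasses import dataclass

ROOT = (0, "")  # sentinel ID for "start of document"


@dataclass
class Node:
    id: tuple  # (lamport clock, replica id): unique and never reused
    char: str
    deleted: bool = False  # tombstone flag; deleted nodes stay in the list


class RGA:
    def __init__(self, replica: str):
        self.replica = replica
        self.clock = 0
        self.nodes = [Node(ROOT, "")]

    def _index(self, node_id):
        return next(i for i, n in enumerate(self.nodes) if n.id == node_id)

    def integrate(self, parent_id, node_id, char):
        """Apply a local or remote insert of `char` after `parent_id`."""
        self.clock = max(self.clock, node_id[0])
        i = self._index(parent_id) + 1
        # Skip concurrent inserts with larger IDs so every replica
        # places siblings in the same order.
        while i < len(self.nodes) and self.nodes[i].id > node_id:
            i += 1
        self.nodes.insert(i, Node(node_id, char))

    def insert_after(self, parent_id, char):
        node_id = (self.clock + 1, self.replica)
        self.integrate(parent_id, node_id, char)
        return parent_id, node_id, char  # the op to broadcast

    def delete(self, node_id):
        self.nodes[self._index(node_id)].deleted = True

    def text(self):
        return "".join(n.char for n in self.nodes if not n.deleted)


a, b = RGA("A"), RGA("B")
prev = ROOT
for ch in "cat":  # A types "cat"; B receives each op
    op = a.insert_after(prev, ch)
    b.integrate(*op)
    prev = op[1]

op_a = a.insert_after(prev, "!")  # concurrent inserts after the "t"
op_b = b.insert_after(prev, "?")
a.integrate(*op_b)
b.integrate(*op_a)
print(a.text(), b.text())  # cat?! cat?!

a.delete((1, "A"))  # delete the "c" on both replicas
b.delete((1, "A"))
print(a.text(), len(a.nodes))  # at?! 6
```

The last line shows the metadata cost in miniature: four visible characters, six stored nodes, and every node carries an ID.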

OT vs. CRDT: The Trade-offs

Complexity:

  • OT: Complex transform functions, but compact data representation
  • CRDT: Simple merge logic, but heavy metadata overhead

Latency:

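  • OT: Local edits apply instantly, but acknowledgments and remote changes wait on a server round-trip for the authoritative order (50-200ms)
  • CRDT: Instant local updates, and merging needs no server, though changes still have to cross the network to reach other replicas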

Offline Support:

  • OT: Poor. Clients must buffer operations until reconnected, then replay through server
  • CRDT: Excellent. Work offline for days, merge when back online

Correctness Guarantees:

  • OT: Convergence depends on correct transform functions (hard to prove)
  • CRDT: Convergence guaranteed by math (provably correct)

Undo/Redo:

  • OT: Hard. Undo must transform against concurrent ops from other users
  • CRDT: Even harder. Undoing an operation in a CRDT typically requires undoing the effect, not the operation itself

Real-World Use:

  • OT: Google Docs, Google Sheets, Office 365, Dropbox Paper
  • CRDT: Figma, Linear, Notion (hybrid), Tldraw, Jupyter notebooks (via Yjs)

The Cursor Presence Problem

Here's a detail both OT and CRDT systems must solve: showing everyone's cursor position in real-time.

Sounds simple. Broadcast cursor position on every keystroke. But cursor positions are expressed as character indices. When someone else inserts a character before your cursor, your cursor position shifts.

OT systems transform cursor positions the same way they transform operations. Your cursor at position 10 gets transformed to position 11 when someone inserts at position 5.

CRDT systems anchor cursors to character IDs instead of positions. Your cursor is "after character with ID abc123", not "at position 10". When characters are inserted or deleted, the cursor stays anchored to the same character.
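Both approaches are short when written out. In this sketch, `shift_cursor` and `resolve_anchor` are hypothetical helpers for single-character edits. Whether a remote insert exactly at the cursor should push it right is a policy choice, and here it does not.

```python
def shift_cursor(cursor: int, kind: str, pos: int) -> int:
    """OT style: adjust a cursor index after a remote single-character edit."""
    if kind == "insert" and pos < cursor:
        return cursor + 1
    if kind == "delete" and pos < cursor:
        return cursor - 1
    return cursor


def resolve_anchor(ids: list, anchor_id: str) -> int:
    """CRDT style: the cursor sits just after the character `anchor_id`."""
    return ids.index(anchor_id) + 1


# OT: someone inserts at position 5 while your cursor is at 10.
print(shift_cursor(10, "insert", 5))   # 11
print(shift_cursor(10, "delete", 2))   # 9
print(shift_cursor(10, "insert", 12))  # 10

# CRDT: the anchor never changes; only the index it resolves to does.
ids = ["a1", "a2", "a3"]
print(resolve_anchor(ids, "a2"))  # 2
ids.insert(0, "b1")               # remote insert at the front
print(resolve_anchor(ids, "a2"))  # 3
```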

Either way, you need to update cursor positions on every remote edit. At 60 FPS, across 50 simultaneous editors, that's 3,000 position updates per second.

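Google Sheets optimizes by batching cursor updates. Presence data also tolerates loss well, because a dropped cursor update is simply superseded by the next one, so it can travel on a separate, lower-priority channel from document edits. (Browser clients cannot open raw UDP sockets, so in practice that means WebSockets or a WebRTC data channel.)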

The Legacy

Today, when you open a Google Sheet and see someone else's cursor moving in real-time, you're watching the culmination of more than three decades of distributed systems research.

The path runs from the first OT algorithm in 1989 and Jupiter's server-based refinement in 1995, to Google's acquisition of AppJet (EtherPad) in 2009, to Figma's multiplayer launch in 2016, to Yjs becoming a de facto standard for CRDT-based collaboration around 2020.

Every keystroke you make in a collaborative document is transformed, merged, and synchronized through algorithms that guarantee something impossible: that two people, editing the same data at the same time, will always converge on the same result.

No locks. No overwrites. No lost work.

Just math. Beautiful, elegant, brain-melting math.

The next time you type in a shared Google Sheet while your colleague types at the same moment, remember: you're not just editing a spreadsheet. You're executing a distributed convergence algorithm that would make a computer science PhD cry.

And it works so well, you don't even notice.

โœ๏ธ
Written by Swayam Mohanty
Untold stories behind the tech giants, legendary moments, and the code that changed the world.
