Skip to main content

[Google Docs] Architecture - Conflict Resolution

Why Conflict Resolution matters in collaborative editing?

Imagine Alice and Bob are editing the same document at the same time.

Original document: ABCDE

  • Alice deletes D
  • Bob deletes B

As our expected, the client and server should handle the conflict correctly to ACE.

But network latency is unpredictable, there might be 2 scenarios:

  1. Alice's operation is applied first
  2. Bob's operation is applied first

Editing Conflict

Scenario 1: Alice changes applied first

  1. Server receives Alice's operation: DEL 03, and server's state ABCDE -> ABCE, and server passed the DEL 03 to both Alice and Bob.
    1. Alice receive own operation: no change
    2. Bob receive Alice's operation: DEL 03, and Bob's state ACDE -> ACD ⚠️
  2. Server receives Bob's operation: DEL 01, and server's state ABCE -> ACE
    1. Alice receive operation: DEL 01, and Alice's state ABCE -> ACE
    2. Bob receive own operation: no change

Scenario 2: Bob changes applied first

  1. Server receives Bob's operation: DEL 01, and server's state ABCDE -> ACDE, and server passed the DEL 01 to both Alice and Bob.
    1. Alice receive Bob's operation: DEL 01, and Alice's state ABCE -> ACE
    2. Bob receive own operation: no change
  2. Server receives Alice's operation: DEL 03, and server's state ACDE -> ACD ⚠️
    1. Alice receive own operation: no change
    2. Bob receive Alice's operation: DEL 03, and Bob's state ACDE -> ACD ⚠️


The result due to Alice or Bob changes applied first is different, that's not what we want.

Therefore, we need Conflict Resolution mechanism to handle the conflict correctly.

And there are 2 main approaches to handle the conflict:

  1. Operational Transformation (OT)
  2. Conflict Resolution Data Structure (CRDT)

Who is using OT & CRDT?

TechnologyProduct / Company
OT
  • Google Docs
  • Etherpad
  • Apache Wave
  • CRDT
  • Figma
  • Yjs
  • Evernote
  • Zed


  • Operational Transformation (OT)

    Operational Transformation (OT) is a conflict resolution technique that transforms user's operations to maintain consistency while preserving user intentions.


    How it works?

    The Server utilizes OT and according to user's current state and intention, transforms operations to make all users state consistent and correct.


    Follow by the Alice's and Bob's example:

    Editing Conflict Resolution OT

    Scenario 1: Alice changes applied first

    1. Server receives Alice's operation: DEL 03, and server's state ABCDE -> ABCE, and server passed the DEL 03 to both Alice and Bob.
      1. Alice receive own operation: no change
      2. Before sending to Bob, server transforms Alice's operation to DEL 03 -> DEL 02
        Bob receive Alice's operation: DEL 02, and Bob's state ACDE -> ACE
    2. Server receives Bob's operation: DEL 01, and server's state ABCE -> ACE
      1. Alice receive operation: DEL 01, and Alice's state ABCE -> ACE
      2. Bob receive own operation: no change

    Scenario 2: Bob changes applied first

    1. Server receives Bob's operation: DEL 01, and server's state ABCDE -> ACDE, and server passed the DEL 01 to both Alice and Bob.
      1. Alice receive Bob's operation: DEL 01, and Alice's state ABCE -> ACE
      2. Bob receive own operation: no change
    2. Server receives Alice's operation: DEL 03, before updating Server's state, server transforms Alice's operation to DEL 03 -> DEL 02, and update Server's state to ACE
      1. Alice receive own operation: no change
      2. Bob receive Alice's operation: DEL 02, and Bob's state ACDE -> ACE

    Pros and Cons

    FeaturesOTCRDT
    Consistency
    Memory Usage✅ (Low)❌ (High)
    User Intent Preservation
    Implementation Effort❌ (Complex)✅ (Simple)
    Peer-to-Peer Support


    Conflict Resolution Data Structure (CRDT)

    What is CRDT?

    Conflict Resolution Data Structure (CRDT) is a data structure that handle concurrent operations automatically, and all users' states converge to the same final state.


    We'll use Yjs as our CRDT example.


    How it works?

    For Yjs, it will create an CRDT object for each operation, which includes

    Which is used to decide which operation will show first.


    To decide which operation will show first, Yjs will use the following rules:

    • If the user id is the same, the operation with the smaller timestamp will be shown first.
    • If the user id is different, the operation with the smaller user id will be shown first.


    Case 1 : 2 users with different User ID

    How Yjs works
    • Alice insert A at 0 position
    • Bob insert B at 0 position

    Result: AB

    Since Alice's user id is 0, which is smaller than Bob's user id 1, A will be shown first.



    Case 2 : 1st user with 2 characters, and 2nd user insert with 1 character at the same position

    Yjs Case 2
    • Alice (user ID : 0) insert C between A and B
    • Alice (user ID : 0) insert D between A and B
    • Bob (user ID : 1) insert E between A and B

    Result: ACDEB

    Process :

    1. Alice's user id is 0, insert C operation will be shown first,
    2. Alice's insert D operation will connect to C operation as Linked List, so it will be shown right after C
    3. Bob's insert E operation will be shown last, because it's the last operation.



    Case 3 : 3 users with different User ID and different timestamp

    Yjs Complex Scenario
    • Alice insert A at 0 position at 0ms
    • Bob insert B at 0 position at 10ms
    • Claire insert C at 0 position at 20ms

    Result: ABC

    Process :

    1. Since Alice's user id is 0, which is smaller than Bob's user id 1, A will be shown first.
    2. Bob's insert B at 0 position at 10ms, which is greater than Alice's insert A at 0 position at 0ms, B will be shown second.
    3. Finally, even Claire's user id is -1, which is smaller than Alice's & Bob's user id, but the timestamp is 20ms, which makes C's Lamport timestamp is greater than A & B's timestamp, C will be shown last.


    Scalability & Performance

    Due to solving the conflict, Yjs will create lots of CRDT objects, which will consume lots of memory.

    Therefore, Yjs have some strategies to save memory:

    1. Group characters to save Memory
    2. Delete children object when parent is marked as deleted

    1. Group characters to save Memory

    Yjs Save Memory by Collection

    If we directly copy paste the entire sentence (like ABC), it will combine together as one CRDT object.

    For example, Yjs will create ABC object with following properties:

    • userID : 0
    • timestamp : 0
    • length: 3

    which lead to ABC (0,0,3)


    But if we insert D between B and C, the ABC object will be split into 2 objects: AB and C.

    For example, Yjs will create AB object with following properties:

    • userID : 0
    • timestamp : 0
    • length: 2

    And C object with following properties:

    • userID : 0
    • timestamp : 0
    • length: 1

    AB (0,0,2) C (0,0,1)


    2. Delete children object when parent is marked as deleted

    Yjs Save Memory by Remove Children

    There is another way to save memory, which is delete children object when parent is marked as deleted.

    In Yjs, CRDT object cannot be directly deleted, because we need to keep the position to make sure other operations execute on that CRDT object can be handled correctly, but we can mark it as deleted.

    For example, if there is an paragraph, which contains ABC, XY, when we mark paragraph as deleted, the ABC and XY will be deleted.



    Previous Alice & Bob's example

    Yjs Alice & Bob Example

    Since Alice's and Bob's insert operation are at different position, for CRDT, each operation can find the correct position to insert the character.

    Which will lead to ACE



    Pros and Cons

    FeaturesRating
    Consistency
    Memory Usage❌ (High)
    User Intent Preservation
    Implementation Effort✅ (Simple)
    Peer-to-Peer Support


    Comparison

    Technical Comparison

    Operational Transformation (OT)

    FeaturesOTCRDT
    Consistency
    Memory Usage✅ (Low)❌ (High)
    User Intent Preservation
    Implementation Effort❌ (Complex)✅ (Simple)
    Peer-to-Peer Support

    Conclusion : Which one to use?

    RequirementsOTCRDT
    Quickly Iterate (like Figma)
    Scalability (low memory usage)
    Keep User Intent
    Peer-to-Peer Support


    References