[Google Docs] Data Model
Why we need editor state model? Not just directly use DOM?
There are some reasons why we need a data model:
- ❌ Not able to support undo/redo
- ❌ Hard to handle custom elements
- ❌ DOM is too large
- ❌ DOM is often messed by browser extensions
Apparently, we need a data model to store the document state efficiently.
Data Model
To store data model, there are 2 ways to store the document:
For example, we have a document like this:
<root>
<h1>Benson's doc</h1>
<p>Hello <strong>World</strong></p>
</root>
Simple version: Tree structure
The most simple way to store document is to imitate the DOM tree structure, but in JSON format.

And here is the representation in code:
{
type: "root",
children: [
{
type: "h1",
children: [
{ type: "text", text: "Benson's doc" },
]
},
{
type: "paragraph",
children: [
{ type: "text", text: "Hello " },
{ type: "text", text: "World", format: 'bold' }
]
}
]
}
Better version: Map with pointer children
The most performant way to store document is to use a map with pointer children because :
- we can get and set the children in O(1) time.
- we still can make children point to other nodes, to keep the tree relationship.

And here is the representation in code:
const nodeMap = {
Root: {
type: "root",
children: ["1", "3"]
},
"1": {
type: "h1",
children: ["2"]
},
"2": {
type: "text",
text: "Benson's doc"
},
"3": {
type: "paragraph",
children: ["4", "5"]
},
"4": {
type: "text",
text: "Hello "
},
"5": {
type: "text",
text: "World",
format: "bold"
}
}
Advantages:
- ✅ Single layer map, easy to clone, same for shallow and deep clone.
- ✅ O(1) access if know
NodeKey
Disadvantages:
- ❌ O(n) if children array is inserted new node or removed node.
Performant version: Map with Linked List children
Because inserted new node or removing node is still not ideal, we can use a linked list to store the children.

And here is the representation in code:
interface EditorNode {
key: NodeKey | null,
parent: NodeKey | null,
prev: NodeKey | null,
next: NodeKey | null,
}
interface ElementNode extends EditorNode {
firstChild: NodeKey | null,
lastChild: NodeKey | null,
}
interface TextNode extends EditorNode {
text: string,
}
With the linked list node, we have the node map like this:

And the tree will be like this:

Advantages:
- ✅ Single layer map, easy to clone, same for shallow and deep clone.
- ✅ O(1) access if know
NodeKey
- ✅ O(1) if children array is inserted new node or removed node.
Disadvantages:
- ❌ Data structure is more complex, harder to understand.
Comparison
Performance | Tree structure | Map with pointer children | Map with linked list children |
---|---|---|---|
Clone shallow | ❌ | ✅ | ✅ |
Access | O(n) | O(m) | O(1) |
Insert/remove | O(n) | O(1) | O(1) |
Complexity | 🟢 | 🟡 | 🔴 |
- n : all nodes length
- m : children array length
For performance reason, we should use the Map with linked list children.