[Google Docs] Architecture - Rendering

Why does rendering matter in collaborative editing?

There are four common approaches to rendering an editable document:

Form elements like input, textarea, etc.
DOM with fake cursors
contenteditable
Canvas rendering

1. Form elements like input, textarea, etc.

Not an option, because form elements hold plain text only and cannot format any part of it.

2. DOM with fake cursors

We can use HTML tags such as <b> or <i>, plus CSS styles, to format the text.

But it's not a good solution, because we have to calculate the cursor's position ourselves.

For example, when we

Type Hello, the cursor should move right with each character.
Hit Enter, the cursor should move to the start of the next line.

We have to compute all of this cursor movement ourselves, which is hard to implement.
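To make the bookkeeping concrete, here is a minimal sketch of the two cases above. The `EditorState` shape and the function names are hypothetical, and the pixel conversion assumes a monospace font, which a real editor cannot rely on.

```typescript
// Hypothetical types for illustration; not taken from any real editor.
interface Cursor {
  line: number;   // zero-based line index
  column: number; // zero-based character offset within the line
}

interface EditorState {
  lines: string[];
  cursor: Cursor;
}

// Insert text at the cursor; the cursor moves right by the text length.
function typeText(state: EditorState, text: string): EditorState {
  const { line, column } = state.cursor;
  const current = state.lines[line] ?? "";
  const lines = [...state.lines];
  lines[line] = current.slice(0, column) + text + current.slice(column);
  return { lines, cursor: { line, column: column + text.length } };
}

// Split the current line at the cursor; the cursor moves to the start of the new line.
function pressEnter(state: EditorState): EditorState {
  const { line, column } = state.cursor;
  const current = state.lines[line] ?? "";
  const lines = [...state.lines];
  lines.splice(line, 1, current.slice(0, column), current.slice(column));
  return { lines, cursor: { line: line + 1, column: 0 } };
}

// Convert the logical position to pixel coordinates for the fake cursor element.
// Assumption: every character is `charWidth` pixels wide (monospace only).
function cursorToPixels(
  cursor: Cursor,
  charWidth: number,
  lineHeight: number
): { x: number; y: number } {
  return { x: cursor.column * charWidth, y: cursor.line * lineHeight };
}

const emptyDoc: EditorState = { lines: [""], cursor: { line: 0, column: 0 } };
const afterHello = typeText(emptyDoc, "Hello");
const afterEnter = pressEnter(afterHello);
```

Even this toy version ignores proportional fonts, line wrapping, selections, and right-to-left text, which is where most of the real difficulty lies.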

3. `contenteditable`

contenteditable is an HTML attribute that makes any element editable, so it behaves like a text editor. It supports:

Basic formatting like bold, italic, underline, etc.
Cursor movement
Selection
Copy and paste
Undo and redo
etc.

But there are still some problems:

Limited formatting: only what is listed above is supported out of the box.
Different browsers have different implementations, which makes the behavior inconsistent.

Issue Type	Browser Differences
Enter key behavior	Some generate `<div>`, others `<p>` (Chrome vs Firefox)
Backspace deleting empty elements	Some retain empty nodes halfway, others clear them directly (Safari is especially odd)
Paste behavior	Some paste as plain text, some include formatting, some even include `<style>` tags (Edge vs Chrome differences)

Exposure to XSS, because the content is HTML.
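One common way to reduce both the paste inconsistency and the XSS risk is to treat pasted content as plain text and escape it before it becomes markup. The sketch below is only an illustration: `escapeHtml` and `plainTextToParagraphs` are hypothetical helper names, and in a browser the input would typically come from `event.clipboardData.getData("text/plain")` in a `paste` handler.

```typescript
// Characters that must be escaped before text is placed inside HTML.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replace every special character with its HTML entity.
function escapeHtml(text: string): string {
  return text.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

// Hypothetical helper: turn pasted plain text into one <p> per line,
// so every browser ends up with the same paragraph structure.
function plainTextToParagraphs(text: string): string {
  return text
    .split(/\r?\n/)
    .map((line) => `<p>${escapeHtml(line)}</p>`)
    .join("");
}
```

This discards all pasted formatting. An editor that wants to keep some of it needs an allowlist-based sanitizer instead, which is a larger piece of work than this sketch.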

4. Canvas rendering

The most advanced solution is to render the document on a canvas. It has the best performance, but the implementation is even harder than the DOM with fake cursors approach.

Pros	Cons
✅ Best performance: no DOM operations, so no browser layout or reflow; the VS Code terminal reported rendering roughly 5 to 45 times faster after switching to canvas	❌ Need to calculate text, cursor, and other positions from scratch
✅ No misalignment issues	❌ Need to implement all accessibility from scratch
✅ No cross-browser issues	
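As a rough illustration of what calculating positions from scratch involves, the sketch below wraps a paragraph into lines and derives a cursor x-coordinate. All names are hypothetical. The `measure` function is injected so the logic runs without a browser; on a real canvas it would typically be `(t) => ctx.measureText(t).width`, and each line would then be drawn with `ctx.fillText(line.text, line.x, line.y)`.

```typescript
// Returns the rendered width of a string in pixels.
type MeasureText = (text: string) => number;

interface PositionedLine {
  text: string;
  x: number; // left edge of the line
  y: number; // top of the line
}

// Greedy word wrap: keep adding words to a line until it would exceed maxWidth.
// Simplification: splits on single spaces and never breaks inside a word.
function layoutParagraph(
  text: string,
  maxWidth: number,
  lineHeight: number,
  measure: MeasureText
): PositionedLine[] {
  const lines: PositionedLine[] = [];
  let current = "";
  for (const word of text.split(" ")) {
    const candidate = current === "" ? word : `${current} ${word}`;
    if (current !== "" && measure(candidate) > maxWidth) {
      lines.push({ text: current, x: 0, y: lines.length * lineHeight });
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current !== "") {
    lines.push({ text: current, x: 0, y: lines.length * lineHeight });
  }
  return lines;
}

// x-coordinate of a cursor placed before character `column` of a line.
function cursorX(line: PositionedLine, column: number, measure: MeasureText): number {
  return line.x + measure(line.text.slice(0, column));
}

// Stand-in measurer for the example: every character is 10 pixels wide.
const fixedWidth: MeasureText = (t) => t.length * 10;
const wrapped = layoutParagraph("the quick brown fox", 100, 20, fixedWidth);
```

A production layout engine also has to handle mixed fonts and sizes, kerning, bidirectional text, and hit testing from mouse coordinates back to characters, none of which is covered here.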

Comparison

Solution	Form elements like input, textarea, etc.	DOM with fake cursors	🏆 `contenteditable`	Canvas rendering
Performance	🟡	🟡	🟡	✅
Text Alignment	🟡	🟡	🟡	✅
Cross-browser	❌	❌	❌	✅
Accessibility	✅	✅	✅	❌
Implementation effort	✅	❌	✅	❌
Used by	-	-	Most modern rich text editors, such as Lexical, ProseMirror, Quill, Slate, and Tiptap	Google Docs, VS Code terminal

Conclusion: Which one to use?

If you are building a rich text editor from scratch, 🏆 contenteditable is probably the best choice in terms of development and maintenance effort.

References

GreatFrontend - Rich Text Editor
Integrated Terminal Performance Improvements
Google Workspace Updates