How does Google Docs handle collision editing?

Question

I've been writing my own JavaScript editor, with functionality similar to Google Docs (allowing several people to work on the same document at the same time). There is one thing I don't understand:

Let's say you have User A and User B connected directly to each other with a network delay of 10 ms. I'm assuming the editor uses a diff system (as I understand Docs does), where an edit is represented as "insert 'text' at index 3", and that these diffs are timestamped and applied in chronological order by all clients.

Let's start with a document containing the text: "xyz123"

User A types "abc" at the beginning of the document at timestamp 001ms, while User B types "hello" between "xyz" and "123" at timestamp 005ms.

Both users would expect the result to be "abcxyzhello123"; however, taking network latency into account:

User B will receive User A's edit "insert 'abc' at index 0" at time 011ms. To maintain chronological order, User B will undo their own insertion at index 3, insert User A's "abc" at index 0, then reapply their own insertion at index 3, which is now between "abc" and "xyz", thus giving "abchelloxyz123".
User A will receive User B's edit "insert 'hello' at index 3" at time 015ms. It will recognize that User B's insertion was made after User A's and simply insert "hello" at index 3 (now between "abc" and "xyz"), giving "abchelloxyz123".

Of course, "abchelloxyz123" is not the same as "abcxyzhello123".
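
The scenario can be reproduced in a few lines. This is only a sketch of the naive scheme described above; the helper names (insertAt, replayNaively) and the shape of the edit objects are made up for illustration and are not how Google Docs represents edits.

```javascript
// Hypothetical helper: apply "insert text at index" to a string.
function insertAt(doc, index, text) {
  return doc.slice(0, index) + text + doc.slice(index);
}

const start = "xyz123";
const editA = { index: 0, text: "abc" };   // User A, timestamp 001ms
const editB = { index: 3, text: "hello" }; // User B, timestamp 005ms

// Naive scheme: replay all edits in timestamp order with their original indexes.
function replayNaively(doc, edits) {
  return edits.reduce((d, e) => insertAt(d, e.index, e.text), doc);
}

// What both clients compute under the naive scheme.
const naive = replayNaively(start, [editA, editB]);

// What both users expect: B's edit lands where B typed it, then A's "abc" goes in front.
const intended = insertAt(insertAt(start, editB.index, editB.text), editA.index, editA.text);

console.log(naive);    // "abchelloxyz123"
console.log(intended); // "abcxyzhello123"
```

The index 3 in User B's edit only means "between xyz and 123" relative to the document User B was looking at, which is why replaying it unchanged after User A's edit puts "hello" in the wrong place.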

Aside from literally assigning each character its own unique ID, I can't imagine how Google could solve this problem effectively.

Some possibilities I've thought about:

Tracking insertion points instead of sending indexes with the diffs would work, except that you would have the same problem if User B moved their insertion point 1ms before editing.
You could have User B send some context along with their diff, for example "insert after 'xyz'", so that User A could intelligently work out the position, but what if User A had inserted the text "xyz"?
User B could recognize that this has happened (when they receive User A's diff and see that it conflicts), then send out a fix that retracts User B's original diff, plus a new diff that inserts User B's "hello" further to the right by the length of "abc". The problems with this are that (1) User A will see a "jump" in the text and (2) if User A keeps editing, User B will have to keep fixing their diffs; even the "fixer" diffs will be off and need fixing, exponentially increasing complexity.
User B could send along with their diff a property stating that the last timestamped diff they received was at -005ms or so; A could then recognize that B didn't know about its changes (since A's diff was at 001ms) and resolve the conflict accordingly. The problems are that (1) all users' timestamps will be slightly off, given that most computer clocks aren't accurate to the millisecond, and (2) if there is a third User C with a delay of 25 ms to User A but a delay of 2 ms to User B, and User C adds some text between "x" and "y" at -003ms, then User B will use User C's edit as the reference point, but User A won't know about User C's edit (and therefore User B's reference point) until 22 ms. I believe this could be solved by using a common server to timestamp all the edits, but that seems pretty involved.
You could give each character a unique ID and then work off those IDs instead of indexes, but this seems like overkill...

I've read http://www.waveprotocol.org/whitepapers/operational-transform, but would love to hear any and all approaches to fixing this problem.

google-docs editing operational-transform

Matthewsot Jun 27 '15 at 19:25

1 answer

Marcel klehr · Accepted Answer · 2016-04-01T21:33:05+0000

There are various ways to handle concurrent changes to replicas, depending on the topology of your scenario and with different trade-offs.

Using a central server

The most common scenario is a central server with which all clients must communicate.

The server can keep track of what each participant's document looks like. Both A and B then send a diff with their changes to the server. The server applies the changes to the respective tracking documents. It then performs a three-way merge and applies the changes to the master document. Finally, it sends the diff between the master document and the tracking documents to the respective clients. This is called differential synchronization.

Another approach is called operational transformation (OT), which is similar to rebasing in traditional version control systems. It doesn't require a central server, but having one makes things much easier if you have more than two participants (see the OT FAQ). The gist is that you transform the changes in one edit so that the edit assumes the changes of another edit have already happened. E.g. A would transform B's edit insert(3, hello) against its own edit insert(0, abc), resulting in insert(6, hello).
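
The insert(3, hello) example can be written out as a small transform function. This is a minimal sketch that handles only inserts; the edit shape { index, text, site } and the tie-break on site id are assumptions made for illustration and are not taken from any particular OT library.

```javascript
// Apply an insert edit to a plain string document.
function applyInsert(doc, edit) {
  return doc.slice(0, edit.index) + edit.text + doc.slice(edit.index);
}

// Transform `edit` so that it can be applied after `against` has already been applied.
// If `against` inserted text before our position, shift our index to the right.
// Ties at the same index are broken by site id (an assumed rule).
function transformInsert(edit, against) {
  const landsBefore =
    against.index < edit.index ||
    (against.index === edit.index && against.site < edit.site);
  return landsBefore
    ? { ...edit, index: edit.index + against.text.length }
    : edit;
}

const editA = { index: 0, text: "abc", site: "A" };
const editB = { index: 3, text: "hello", site: "B" };

// A has already applied its own edit, so B's edit is transformed against it.
const editBAtA = transformInsert(editB, editA);
const docAtA = applyInsert(applyInsert("xyz123", editA), editBAtA);

console.log(editBAtA.index); // 6
console.log(docAtA);         // "abcxyzhello123"
```

A real editor would also need transforms for deletes, and for every pairing of edit types, which is where most of the implementation effort in OT goes.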

The difference between rebasing and OT is that rebasing doesn't guarantee consistency if you apply edits in different orders (e.g. if B were to rebase A's edit against theirs the other way around, this might lead to diverging document states). OT's promise, on the other hand, is to allow any order as long as you apply the right transformations.
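
To illustrate that promise for two sites, the sketch below applies a pair of concurrent inserts in both orders and compares the results. It assumes insert-only edits of the hypothetical shape { index, text, site } and an agreed tie-break on site id; without such a tie-break, two inserts at the same index would end up in a different order on each side.

```javascript
const applyInsert = (doc, edit) =>
  doc.slice(0, edit.index) + edit.text + doc.slice(edit.index);

// Shift `edit` right if `against` landed before it. Ties at the same index are
// broken by site id (an assumed rule; any rule both sides agree on would do).
function transformInsert(edit, against) {
  const landsBefore =
    against.index < edit.index ||
    (against.index === edit.index && against.site < edit.site);
  return landsBefore ? { ...edit, index: edit.index + against.text.length } : edit;
}

// Site A applies its own edit first, site B applies its own edit first.
// Each side transforms the remote edit against the local one before applying it.
function bothOrders(doc, editA, editB) {
  const atA = applyInsert(applyInsert(doc, editA), transformInsert(editB, editA));
  const atB = applyInsert(applyInsert(doc, editB), transformInsert(editA, editB));
  return [atA, atB];
}

// The example from the question: different positions.
console.log(bothOrders(
  "xyz123",
  { index: 0, text: "abc", site: "A" },
  { index: 3, text: "hello", site: "B" }
)); // [ "abcxyzhello123", "abcxyzhello123" ]

// Both users insert at the very same index: the tie-break decides the order.
console.log(bothOrders(
  "xyz123",
  { index: 0, text: "abc", site: "A" },
  { index: 0, text: "hello", site: "B" }
)); // [ "abchelloxyz123", "abchelloxyz123" ]
```

This only demonstrates convergence for two concurrent edits; with more than two sites the requirements on the transform functions become stricter.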

No central server

There are OT algorithms that can deal with peer-to-peer scenarios (with the trade-off of increased implementation complexity at the control layer and increased memory usage). Instead of a simple timestamp, a version vector can be used to keep track of the state an edit is based on. Then (depending on the capabilities of your OT algorithm, in particular transformation property 2), incoming edits can be transformed in the order in which they are received, or the version vector can be used to impose a partial order on the edits. In the latter case the history has to be "rewritten" by undoing and transforming edits so that they adhere to the order imposed by the version vectors.
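
As a rough illustration of how a version vector replaces a wall-clock timestamp, the sketch below compares two vectors and reports whether one edit is based on the other or whether they are concurrent. The representation (a plain object mapping a site id to the number of edits seen from that site) and the function name are assumptions chosen for this example.

```javascript
// Compare two version vectors.
// Returns "before", "after", "equal" or "concurrent" (a relative to b).
function compareVersions(a, b) {
  let aAhead = false;
  let bAhead = false;
  const sites = new Set([...Object.keys(a), ...Object.keys(b)]);
  for (const site of sites) {
    const x = a[site] ?? 0; // a missing entry means "no edits seen from that site"
    const y = b[site] ?? 0;
    if (x > y) aAhead = true;
    if (x < y) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent";
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

// A and B each made one edit without having seen the other's edit.
console.log(compareVersions({ A: 1 }, { B: 1 }));             // "concurrent"
// C made an edit after having seen both A's and B's edits.
console.log(compareVersions({ A: 1, B: 1, C: 1 }, { A: 1 })); // "after"
```

Only the "concurrent" case needs a transformation; edits that are ordered by their vectors can be applied in that order. Unlike the millisecond timestamps in the question, this does not depend on the clients' clocks agreeing.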

Finally, there is a group of algorithms called CRDTs, including WOOT, Treedoc, and Logoot, which try to solve the problem with specially designed data types that allow operations to commute, so that the order in which they are applied doesn't matter (this is similar to your idea of an ID for each character). The trade-offs here are memory consumption and overhead when constructing operations.
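
To give a feel for the ID-per-character idea, here is a heavily simplified sketch. It is not WOOT, Treedoc, or Logoot; it only shows the general shape: every character gets a unique id, an insert says "put this after the character with id X", and concurrent inserts at the same spot are ordered by comparing ids. All names are hypothetical, deletes are left out, and the sketch assumes that an insert is only delivered after the character it refers to, and that counters grow like a logical clock.

```javascript
// An id is { counter, site }. Greater counter wins; site id breaks ties.
const sameId = (a, b) => a.counter === b.counter && a.site === b.site;
const isGreater = (a, b) =>
  a.counter > b.counter || (a.counter === b.counter && a.site > b.site);

// Integrate one insert { id, char, after } into a replica (array of { id, char }).
// `after` is the id of the character to insert behind, or null for the document start.
function integrate(replica, op) {
  let pos = op.after === null
    ? 0
    : replica.findIndex(e => sameId(e.id, op.after)) + 1;
  // Skip characters with a greater id so every replica picks the same spot.
  while (pos < replica.length && isGreater(replica[pos].id, op.id)) pos++;
  replica.splice(pos, 0, { id: op.id, char: op.char });
}

// Build the inserts for typing `text` right after the character with id `after`.
function typeText(text, site, startCounter, after) {
  const ops = [];
  for (const [i, char] of [...text].entries()) {
    const id = { counter: startCounter + i, site };
    ops.push({ id, char, after });
    after = id; // the next character goes behind the one just typed
  }
  return ops;
}

// Shared starting document "xyz123"; its characters get ids 1..6 from site "0".
const initialReplica = () =>
  [..."xyz123"].map((char, i) => ({ id: { counter: i + 1, site: "0" }, char }));
const render = replica => replica.map(e => e.char).join("");

const idOfZ = { counter: 3, site: "0" };
const opsA = typeText("abc", "A", 7, null);    // User A types at the start
const opsB = typeText("hello", "B", 7, idOfZ); // User B types after "z"

// Replica A sees its own inserts first, replica B sees its own first.
const replicaA = initialReplica();
[...opsA, ...opsB].forEach(op => integrate(replicaA, op));
const replicaB = initialReplica();
[...opsB, ...opsA].forEach(op => integrate(replicaB, op));

console.log(render(replicaA)); // "abcxyzhello123"
console.log(render(replicaB)); // "abcxyzhello123"
```

Because User B's insert refers to the id of "z" rather than to index 3, it ends up in the same place no matter what User A has inserted in front of it. The memory cost mentioned above shows up here as one id object per character.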
