What is CRDT in distributed systems?

I am new to distributed systems and am trying to get an idea of ​​the concept of CRDT. I understand that it has three entries:

Conflict-free Replicated Data Type Convergent Replicated Data Type Commutative Replicated Data Type 

Can someone give an example of using CRDT in distributed systems? Thanks a lot in advance.

+8
distributed-computing distributed-system crdt
source share
3 answers

CRDT is inspired by the work of Mark Shapiro. In distributed computing, a type of conflict-free replicated type (abbreviated CRDT) is a type of specially designed data structure used to achieve strong finite consistency (SEC) and monotony (no rollbacks). There are two alternative ways of securing the SEC: work-based CRDTs and government CRDTs.

CRDTs on different replicas can diverge from each other, but in the end they can be safely combined, providing ultimately a consistent value. In other words, CRDTs have a merge method that is idempotent, commutative, and associative.

Two alternatives are equivalent, since you can emulate the other, but based on operations CRDT, additional warranties are required from the middleware communications. CRDTs are used to replicate data on multiple computers on a network that perform updates without the need for remote synchronization. This will lead to a merger of conflicts in systems using traditional consistent consistency technology, but CRDTs are designed in such a way that conflicts are mathematically impossible. Under the limitations of the CAP theorem, they provide the strongest guarantees of consistency of available / shared volume (AP) settings.

Some examples in which they are used

Riak is CRDT's most popular open source library and is used by Bet365 and League of Legends. Below are some useful links that support Riak.

1- Bet365 (uses Erlang and Riak) http://www.erlang-factory.com/static/upload/media/1434558446558020erlanguserconference2015bet365michaelowen.pdf

2. Legends League uses the CRADT Riak implementation for its game chat system (which processes 7.5 million concurrent users and 11,000 messages per second)

3- Roshi, implemented by SoundCloud, which supports LWW with a timestamp Set: -Blog post: https://developers.soundcloud.com/blog/roshi-a-crdt-system-for-timestamped-events

+9
source share

Those three acronym extensions mean basically the same thing.

CRDT converges if the same operations applied in a different sequence produce (converge) to the same result. That is, operations can be switched - this is a commutative RDT. The reason that operations can be applied in a different sequence and still get the same result is because the operations do not contain conflicts.

Thus, CRDT means the same thing, regardless of which of the three extensions you use, although I personally prefer convergent.

+1
source share

CRDTs use Math to ensure consistency in a distributed cluster, without worrying about consensus and the associated latency / inaccessibility.

The set of values ​​that CRDT can take at any time belongs to the category of half-lattices (in particular, connection half-lattices), which is POSET (partially ordered set) with the smallest upper limit function (LUB).

Simply put, POSET is a collection of elements in which not all are comparable. For example. in the array of pairs: {(2,4), (4, 5), (2, 1), (6, 3)} , (2,4) is equal to < (4,5) , but cannot be compared with (6,3) (since one element is larger and the other is smaller). Now the semilattice is a POSET in which 2 pairs are given, even if you cannot compare them, you can find an element that is larger than both.

Another condition is that updates of this data type must increase, CRDTs have a monotonously increasing state when clients never see a rollback of the state.

This great article uses an array, which I used above as an example. For a CRDT that supports these values, if 2 replicas try to reach a consensus between (4,5) and (6,3) , they can choose LUB = (6,5) as consensus and assign it as replicas. As the values ​​increase, this is a good value to calculate.

There are two ways to synchronize CRDT between replicas, they can transmit state through periodically (converged type of replicated data) or can transmit updates (delta) as they are received (switched type of replicated data), the first one uses a lot of bandwidth.

SoundCloud Roshi is a good example (although apparently it no longer works), they store data associated with a timestamp, where the timestamp is obviously increasing. Any updates that are in the timestamp less than or equal to the one stored are discarded, which guarantees idempotency (repeated entries in order) and commutativity (due to the lack of entries in order. Commutativity - a=b means b=a , which in in this case means update1 followed by update2 matches update2 and then update1)

The scripts are sent to all clusters, and if certain nodes cannot respond due to a problem, such as slowness or partition, they are expected to be reached later through read-repair , which guarantees convergence of values. Convergence can be achieved using two protocols, as I mentioned above, state propagation or updates to other replicas. I believe that Roshi is doing the first. As part of read-repair , the state of exchange of replicas, and because the data corresponds to the semi-lattice property, they converge.

PS. Systems using CRDTs are ultimately consistent, meaning they accept APs (highly accessible and partition resistant) in the CAP theorem .

Another great review on the topic.

+1
source share

All Articles