I am curious what contraindications and trade-offs generate unique serial numbers in a distributed and parallel environment.
Imagine: I have a system where all she does is return a unique serial number every time you ask him. Here is the ideal specification for such a system (restrictions):
- Watch out for high loads.
- Allow as many concurrent connections as possible.
- Distributed: distributed loading across multiple machines.
- Productivity: Work as fast as possible with maximum throughput.
- Correctness: the generated numbers must:
- not repeated.
- be unique for each request (it should be able to break ties if any two requests occur at the same time).
- in (ascending) sequential order.
- have no spaces between requests: 1,2,3,4 ... (effectively counter for all # requests)
- Fault tolerance: if one or several or all of the machines are down, it can return to the state before the failure.
Obviously, this is an idealized specification, and not all restrictions can be fully implemented. See CAP Theorem . However, I would like to hear your analysis on the various relaxation of restrictions. What problems we will face and what algorithms we will use to solve the remaining problems. For example, if we get rid of the counter limit, the problem becomes much simpler: since valid spaces are resolved, we can simply split the number ranges and match them on different machines.
Any links (documents, books, code) are welcome. I would also like to keep a list of existing software (open source or not).
Software
- Snowflake : a network service for generating unique high-level identification numbers with some simple guarantees.
- keyspace : a publicly available unique 128-bit identifier generator whose identifiers can be used for any purpose
- RFC-4122 implementations exist in many languages . The RFC specification is probably a really good base, as it prevents the need for coordination between systems, 128-bit UUIDs, and when using software identifiers that implement certain versions of the specification, they include part of the time code that makes sorting possible, etc.
guid counter sequence distributed-computing concurrent-programming
newtonapple
source share