Distributed Systems: Maintaining Time Series Consistency Between Different Nodes

Context

We have a distributed system. One system emits events, and another system reads those events to generate reports.

The logical order is guaranteed even though the emitter system has N nodes: an underlying finite state machine makes it impossible to emit events for the same unit simultaneously. The events are timestamped, but the N nodes may not always be time-synchronized.

We care a great deal about the timestamp. The downstream system that generates reports always needs one, because the people who read the reports rely on this data to verify that everything is going as expected.

Problem

Because the clocks of two nodes may differ slightly, the timestamps can disagree with the logical order. Consider the following example.

The logical order of events is as follows:

Event 1 => Event 2 => Event 3

But in the database, we could have this situation:

-------------------------------------------
|  Name   |  TimeStamp  |  Logical Order  |
-------------------------------------------
| Event 1 |      2      |        1        |
| Event 2 |      1      |        2        |
| Event 3 |      3      |        3        |
-------------------------------------------

As you can see, Event 2 logically happened after Event 1, but its timestamp is earlier because the clocks of the emitting nodes are not synchronized.

This does not happen often, but it can happen because the timestamps come from different nodes. From a reporting point of view, it is an anomaly.
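To make the anomaly concrete, here is a minimal sketch that reproduces the table above and flags the places where the timestamp order contradicts the logical order. The `Event` class and the `find_inversions` helper are hypothetical names introduced only for this illustration. The sketch also assumes that each event carries an explicit logical order value, as in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    timestamp: int      # wall-clock value written by the emitting node
    logical_order: int  # position guaranteed by the state machine (assumed to be stored)


def find_inversions(events):
    """Return consecutive pairs (in logical order) whose timestamps go backwards."""
    ordered = sorted(events, key=lambda e: e.logical_order)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b.timestamp < a.timestamp]


# Same rows as the table in the question.
events = [
    Event("Event 1", timestamp=2, logical_order=1),
    Event("Event 2", timestamp=1, logical_order=2),
    Event("Event 3", timestamp=3, logical_order=3),
]

# What a report sees when it sorts by timestamp only.
by_timestamp = [e.name for e in sorted(events, key=lambda e: e.timestamp)]
# What actually happened.
by_logical = [e.name for e in sorted(events, key=lambda e: e.logical_order)]

inversions = find_inversions(events)

print("sorted by timestamp:    ", by_timestamp)
print("sorted by logical order:", by_logical)
for earlier, later in inversions:
    print(f"anomaly: {later.name} follows {earlier.name} logically "
          f"but has an earlier timestamp ({later.timestamp} < {earlier.timestamp})")
```

A report that sorts only by timestamp would show Event 2 before Event 1. A check like this only detects the inversion. It does not decide which of the two timestamps is wrong.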

Possible solutions

  • . (NTP ), , , , , " ".
  • , , , . , .

?

+4
2

, " -" . , -.

, , , .

, , .

+2

, , , . , , logical order. .

, snowflake, . , . , , : , .

TL;DR

, , logical order, , , .

+1
