Implementing master-slave

Running on Ubuntu. The program is written in C++. I have 2 processes running on different hosts, where one is the master and the other is the slave (neither has priority; only one of them processes requests at a time). Only the master may receive and process requests. Both processes are always up, and if one fails, a watchdog restarts it.

Hosts are connected by network cable.

My plan is to exchange keep-alive messages between the two, and if the slave stops receiving keep-alives from the master, it should change its state to master. When the master starts again, it first waits for a keep-alive; if none arrives, it takes the master role, and if one does arrive, it takes the slave role.

I would be glad to receive your opinion on:

How do I prevent both processes from being master at the same time? This is my main problem. At startup, and when connectivity fails, how do you prevent having 2 masters at once?

Do you think it is better to request keep-alives or to push them? (In my opinion, it is better to request a keep-alive than to push one.)

Any other helpful tips and pitfalls would be more than welcome.

2 answers

The way I did this is for each process to spawn a heartbeat thread that sends a UDP packet once per second and listens for incoming UDP packets from the other process. If the heartbeat thread does not receive any UDP packets from the other process for a certain period of time (for example, 5 seconds), it assumes that the other process is down and notifies the parent thread that it should now be the master.

The reason for sending and listening for heartbeats in a dedicated thread is that, if the main thread is busy with a long computation, the heartbeat packets keep flowing. Thus, the code in the main thread does not have to be real-time to avoid false failure detections.
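As a rough sketch of the timeout rule only (no sockets or threads), the decision the heartbeat thread makes on each tick could look like this. The names `Role`, `kPeerTimeout`, and `next_role` are made up for illustration, and the 5-second value is just the example figure mentioned above.

```cpp
#include <chrono>

using std::chrono::milliseconds;

// Assumed value, taken from the "5 seconds" example; tune it for your network.
// Heartbeats themselves are assumed to be sent about once per second.
constexpr milliseconds kPeerTimeout{5000};

enum class Role { Master, Slave };

// Hypothetical helper, called periodically by the heartbeat thread.
// `silence` is the time elapsed since the last UDP packet from the peer.
// A slave that has heard nothing for kPeerTimeout promotes itself;
// in every other case the current role is kept.
constexpr Role next_role(Role current, milliseconds silence) {
    return (current == Role::Slave && silence >= kPeerTimeout) ? Role::Master
                                                               : current;
}
```

The heartbeat thread would measure `silence` with a monotonic clock such as `std::chrono::steady_clock`, so that wall-clock adjustments cannot trigger a false failover.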

There is another problem here: what happens if a network fault temporarily breaks the connection between the two hosts? (For example, some joker or QA tester unplugs the Ethernet cable for 1 minute and then reconnects it.) In that case, both processes stop receiving UDP packets from each other, so each assumes the other has gone away and both become master. When the cable is reconnected, you then have two master processes running at the same time, which you do not want. So you need some way for the two masters to decide which of them should demote itself to slave, in order to satisfy the Highlander principle ("there can be only one!"). It can be as simple as "the host with the lowest IP address stays master", or you can have every heartbeat packet carry the sender's uptime, and the host with the longest uptime stays master, etc.
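A minimal sketch of the "lowest IP address stays master" rule, assuming each heartbeat packet carries the sender's IPv4 address and its current role. The `Heartbeat` layout and the `should_demote` name are invented for this example and are not part of any real protocol.

```cpp
#include <cstdint>

// Hypothetical heartbeat payload (assumption: both fields are sent in
// every packet).
struct Heartbeat {
    std::uint32_t sender_ip;   // IPv4 address as an integer, host byte order
    bool sender_is_master;     // role the sender currently believes it has
};

// Returns true when the local process must step down to slave.
// That only happens when both sides claim to be master and the local
// address is the higher one, so the host with the lowest IP keeps the role.
constexpr bool should_demote(std::uint32_t local_ip, bool local_is_master,
                             const Heartbeat& peer) {
    return local_is_master && peer.sender_is_master &&
           local_ip > peer.sender_ip;
}
```

Because both hosts apply the same comparison to the same pair of addresses, exactly one of them demotes itself once packets flow again. An uptime-based rule would work the same way, but the two hosts would need to compare values that do not drift between sending and receiving (a fixed start timestamp rather than a live counter, for instance).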


A typical way to solve this problem is to hold an election. Every participant in the system shares the data it will use as input to the algorithm, so that everyone arrives at the same conclusion.

For example: the peers (both of them) send each other a unique identifier (a MAC address, a pid, or a high-precision process start time, for example). Each peer then applies the same comparison to determine the winner (highest value, for example). They then communicate the result to each other.
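One possible shape for such a comparison is sketched below. It assumes each peer advertises its process start time plus its MAC address as a tie-breaker; the `Candidate` struct and the `wins_election` function are hypothetical names, and "highest value wins" is only the example rule mentioned above.

```cpp
#include <cstdint>

// Hypothetical election input exchanged by the peers.
struct Candidate {
    std::uint64_t start_time_ns;  // high-precision process start time
    std::uint64_t mac;            // 48-bit MAC address, used to break ties
};

// Returns true if `local` wins against `peer`. The highest start time wins;
// if the start times are equal, the highest MAC address wins. Both peers run
// the same function on the same two candidates, so they agree on the winner.
constexpr bool wins_election(const Candidate& local, const Candidate& peer) {
    return local.start_time_ns != peer.start_time_ns
               ? local.start_time_ns > peer.start_time_ns
               : local.mac > peer.mac;
}
```

Since the inputs are fixed for the lifetime of each process, re-running the election after a reconnect gives the same answer on both hosts.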

For dealing with transient failures, see the Byzantine Generals problem.


Source: https://habr.com/ru/post/926001/

