Marathon application migration to gracefully disable mesos-slave

Question

I have a small Mesos cluster, and I use Marathon to manage a set of long-running services with a variable number of instances each.

I would like to be able to launch new nodes or terminate some of them according to business needs. However, when terminating a node I noticed a potential problem: when I shut down a Mesos slave, the number of instances of some services can temporarily fall below their minimumHealthCapacity. This can cause downtime if, for example, the stopped machine was running a service that has only one instance.

Consider the following simplified scenario: node 1 runs service A, node 2 runs service B, and node 3 runs service C. minimumHealthCapacity is 1 for all services. I want to shut down node 1 and keep only nodes 2 and 3, without any downtime for service A. An example of the expected behavior would be to scale service A to 2 instances and then safely terminate node 1.

What can I do to make sure that no service falls below its minimumHealthCapacity?

Ideally, I would have a rolling-update-like process for this: replacements are started on other machines, and then the service is stopped on the machine that is to be shut down. I would like at least an automated process, so that scaling down is a matter of running a simple script. I have no requirements on how long this takes, i.e. I can shut down the Mesos slave only after I am sure that the Marathon migration has completed successfully.

mesos marathon

Rui Gonçalves Jul 31 '15 at 11:13

1 answer

Adam · Accepted Answer · 2015-08-04T03:34:11+0000

The Mesos dev team is currently working on “Maintenance Primitives,” so that an operator can indicate that a particular machine is scheduled to go down at a specific time (or ASAP), triggering messages to each framework that notify it of the upcoming unavailability window. A framework like Marathon can then choose to migrate its tasks off that node so that it can be shut down safely without any downtime.
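To make the idea of an unavailability window concrete, here is a minimal sketch of how such a schedule might be built. It assumes the feature is exposed as a JSON document listing windows, each with `machine_ids` and an `unavailability` expressed in nanoseconds, posted to the master (for example at `/maintenance/schedule`). Treat those field names and the endpoint as assumptions to check against the Mesos version in use; the function name, hostname, IP address, and times are placeholders.

```python
import json
from datetime import datetime, timedelta, timezone

NANOS_PER_SECOND = 1_000_000_000


def maintenance_schedule(hostname, ip, start, duration):
    """Build a one-window maintenance schedule for a single machine.

    The payload layout is an assumption about the maintenance API,
    not something confirmed by the answer above.
    """
    start_ns = int(start.timestamp()) * NANOS_PER_SECOND
    duration_ns = int(duration.total_seconds()) * NANOS_PER_SECOND
    return {
        "windows": [
            {
                "machine_ids": [{"hostname": hostname, "ip": ip}],
                "unavailability": {
                    "start": {"nanoseconds": start_ns},
                    "duration": {"nanoseconds": duration_ns},
                },
            }
        ]
    }


# Placeholder machine and times: node 1 goes down for one hour.
schedule = maintenance_schedule(
    "node1",
    "10.0.0.1",
    datetime(2015, 8, 10, 12, 0, tzinfo=timezone.utc),
    timedelta(hours=1),
)
print(json.dumps(schedule, indent=2))
```

The sketch only builds the payload; submitting it and reacting to the resulting notifications is left to the operator tooling and to the framework.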

See https://issues.apache.org/jira/browse/MESOS-1474 for more details.
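Until such primitives are available, the behavior described in the question (scale up first, then stop the tasks on the node being removed) could be approximated by a script. The following is only a sketch of the planning step, not a Marathon feature: `plan_drain` is a hypothetical helper, the task list is assumed to carry `id` and `host` fields as in Marathon's app listing, and minimumHealthCapacity is treated as a fraction of the current instance count. It does not ensure that new instances land on other nodes; that would need a placement constraint or similar.

```python
import math


def plan_drain(tasks, host, min_health_capacity):
    """Plan how to drain `host` for one app (hypothetical helper).

    Returns (scale_up_to, task_ids_to_kill): the instance count to reach
    before killing anything, and the tasks running on `host`.
    """
    on_host = [t["id"] for t in tasks if t["host"] == host]
    remaining = len(tasks) - len(on_host)
    # Instances that must stay healthy while the host's tasks are stopped.
    required = math.ceil(min_health_capacity * len(tasks))
    # Extra instances needed elsewhere before the host's tasks can go.
    shortfall = max(0, required - remaining)
    return len(tasks) + shortfall, on_host


# Scenario from the question: service A has a single task on node 1.
tasks = [{"id": "service-a.task-1", "host": "node1"}]
target, to_kill = plan_drain(tasks, "node1", 1.0)
print(target, to_kill)  # 2 ['service-a.task-1']
```

A script built on this would set the app's `instances` to the returned target through Marathon's REST API, wait until the new tasks are healthy, then kill the listed tasks with the scale option so that Marathon lowers the instance count instead of restarting them, and only then stop the Mesos slave.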
