A reliable way to deploy new code to a production celery cluster without interrupting service

I have several celery nodes running in production with rabbitmq, and I want to be able to deploy new code without interrupting service. Right now I have to take the whole site down in order to push new code out to celery. I have max tasks per child set to 1, so in theory changes to an existing task should take effect the next time that task runs, but what about registering new tasks? I know that restarting the daemon will not kill the workers, but instead lets them finish and die on their own, yet it still feels risky. Is there an elegant solution to this problem?
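A minimal sketch of the setup being described, assuming an old-style (pre-4.0) Celery configuration; the broker URL is a placeholder, not the actual one:

    # celeryconfig.py -- sketch of the setup described in the question
    BROKER_URL = "amqp://guest:guest@localhost:5672//"  # RabbitMQ broker (placeholder)

    # Recycle each pool process after a single task, so edited task code is
    # picked up by the fresh child the next time that task runs.
    CELERYD_MAX_TASKS_PER_CHILD = 1

    # Restarting the daemon (TERM to the worker) triggers a warm shutdown:
    # the children finish their current tasks before exiting.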

+7
2 answers

It seems like the tricky part is working out which celery workers are running old code and which are running new code. I would suggest creating another vhost in rabbitmq and following these steps:

  • Update the django web servers with the new code and reconfigure them to point to the new vhost.
  • While new tasks queue up in the new vhost, wait for the celery workers to finish the tasks still sitting in the old vhost.
  • Once those workers have drained the old vhost, update their code and point their configuration at the new vhost as well.

I have not actually tried this, but I don't see why it wouldn't work. One annoying aspect is having to alternate between vhosts on every deployment.
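A rough sketch of how the broker switch could be wired up, assuming the vhost name lives in a Django setting; the vhost, user, and host names here are made up, and the vhosts themselves would be created with rabbitmqctl beforehand:

    # settings.py (sketch) -- the vhost is the only thing that changes per deploy
    RELEASE_VHOST = "release_b"   # the previous deploy used "release_a"
    BROKER_URL = "amqp://myuser:mypass@rabbit-host:5672/%s" % RELEASE_VHOST

    # Before deploying, create the new vhost and grant permissions on the
    # RabbitMQ host, e.g.:
    #   rabbitmqctl add_vhost release_b
    #   rabbitmqctl set_permissions -p release_b myuser ".*" ".*" ".*"

Once the workers on the old vhost drain, they can be restarted against the new setting, and the old vhost deleted or reused for the next swap.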

+1

Setting the CELERYD_MAX_TASKS_PER_CHILD option may do what you want. It sets the number of tasks a worker pool process executes before it is killed and replaced, and when the replacement pool process starts it loads the new code. On my system I usually restart celery and leave the old workers running in the background until they finish; usually everything goes fine, but occasionally one of those worker processes never dies, and you can still kill it with a script.
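For the occasional leftover worker process that never exits, that script can be very small; here is one possible sketch using psutil, where the release-path marker used to spot old workers is an assumption about how the processes are started:

    # kill_stale_celery.py -- sketch; requires psutil, and assumes the old
    # release's path shows up in the worker's command line.
    import psutil

    OLD_RELEASE_MARKER = "/srv/app/releases/previous"  # made-up path

    for proc in psutil.process_iter(["cmdline"]):
        try:
            cmdline = " ".join(proc.info["cmdline"] or [])
            if "celery" in cmdline and OLD_RELEASE_MARKER in cmdline:
                proc.terminate()  # escalate to proc.kill() if TERM is ignored
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue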

0
