Celery performance implications of many queues

Question

Are there significant performance implications that I should keep in mind when Celery workers pull from several (or perhaps many) queues? For example, would there be a significant performance penalty if my system were designed so that workers pulled from 10 to 15 queues rather than just 1 or 2? As a follow-up, what if some of those queues are sometimes empty?

+5

python celery

Fmc May 10, '16 at 0:46

1 answer

Mauro rocco · Accepted Answer · 2016-05-10T07:36:52+0000

A short answer to your question about queue limits:

Don't worry: having several queues will not make things worse or better; brokers are designed to handle a huge number of them. Of course, most use cases do not need that many, apart from really advanced ones. Empty queues do not cause any problems; they just take a tiny amount of memory on the broker.

Keep in mind that there are other pieces too, such as exchanges and bindings. There are no real limits there either, but it is better to understand the implications of each before using them (a topic exchange will use more CPU than, for example, a direct one).
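As an illustration of the queues, exchanges and bindings mentioned above, here is a minimal configuration sketch. The app name `proj`, the broker URL, the queue names and the task names are all hypothetical; only the `task_queues` / `task_routes` settings and the kombu `Exchange` and `Queue` classes come from Celery itself.

```python
from celery import Celery
from kombu import Exchange, Queue

# Hypothetical app name and broker URL, used only for illustration.
app = Celery("proj", broker="amqp://guest:guest@localhost:5672//")

# Direct exchange: a message reaches a queue only when its routing key
# equals the binding key exactly, so routing is cheap for the broker.
direct_ex = Exchange("tasks", type="direct")

# Topic exchange: binding keys may contain wildcards ("*" and "#"),
# which costs the broker more work per routed message.
topic_ex = Exchange("events", type="topic")

app.conf.task_queues = (
    Queue("email", direct_ex, routing_key="email"),
    Queue("reports", direct_ex, routing_key="reports"),
    Queue("audit", topic_ex, routing_key="events.#"),
)

# Hypothetical task names mapped to the queues declared above.
app.conf.task_routes = {
    "proj.tasks.send_email": {"queue": "email"},
    "proj.tasks.build_report": {"queue": "reports"},
}

# A single worker can then consume from several queues at once, e.g.:
#   celery -A proj worker -Q email,reports,audit --concurrency=4
```

Queues that stay empty most of the time can be left in the `-Q` list; as noted above, they only cost the broker a little memory.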

To give you a more complete answer, consider the topic of performance from a more general point of view.

When looking at a distributed, message-based system such as Celery, there are two main topics to analyze in terms of performance:

The number of workers and the concurrency factor.
As you probably already know, every Celery worker has a concurrency parameter that sets how many tasks can run at the same time. It should be set according to the server capacity (CPU, RAM, I/O) and to the type of tasks that the particular consumer will execute (which depends on the queues it consumes from).
Of course, depending on the total number of tasks that must be completed in a given time window, you will also need to decide how many workers/servers to keep running.
The broker, the single point of failure in this style of architecture.
The broker, especially RabbitMQ, is designed to handle millions of messages without any problem. However, the more messages it has to store, the more memory it uses, and the more messages it has to route, the more CPU it uses.
This machine should also be well tuned and, if possible, set up for high availability.
Of course, the main thing to avoid is messages being consumed at a lower rate than they are produced; otherwise your queue will keep growing and your RabbitMQ will explode. Here you can find some tips.
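The two points above can be turned into a rough back-of-the-envelope estimate. The sketch below is not part of Celery; the function names, the 70% utilization target and the example numbers are assumptions chosen for illustration. It uses Little's law (average tasks in execution = arrival rate × average task duration) to size the workers, and a simple rate difference to show how a backlog grows when consumers are slower than producers.

```python
import math


def workers_needed(tasks_per_second, avg_task_seconds, concurrency,
                   target_utilization=0.7):
    """Estimate how many worker instances are needed for a steady load.

    concurrency is the number of tasks one worker runs at the same time.
    target_utilization leaves headroom so workers are not 100% busy.
    """
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Little's law: average number of tasks being executed at any moment.
    busy_slots = tasks_per_second * avg_task_seconds
    slots_required = busy_slots / target_utilization
    return max(1, math.ceil(slots_required / concurrency))


def backlog_after(seconds, produced_per_second, consumed_per_second,
                  initial_backlog=0):
    """Queue length after `seconds` at constant produce/consume rates."""
    growth = produced_per_second - consumed_per_second
    return max(0, initial_backlog + growth * seconds)


if __name__ == "__main__":
    # Hypothetical load: 50 tasks/s, 0.4 s per task, 8 slots per worker.
    print(workers_needed(50, 0.4, 8))
    # Producing 120 msg/s while consuming 100 msg/s for one hour.
    print(backlog_after(3600, 120, 100))
```

With these example numbers, a gap of only 20 messages per second leaves 72,000 messages sitting in the broker after one hour, which is the kind of growth the answer warns about.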

There are cases where you also need to increase the number of tasks executed in a given time frame, but only in response to peaks of requests. The nice thing about this architecture is that you can monitor the size of the queues and, when you see one growing too fast, create new machines on the fly with a Celery worker already configured, then turn them off when they are no longer needed. This is a fairly cost-effective and efficient approach.
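A scaling decision of this kind could look roughly like the sketch below. It is only an assumption about how one might do it: `desired_workers`, its parameters and the limits are hypothetical, and the queue depth is expected to come from whatever monitoring you already have (for RabbitMQ, for example, the output of `rabbitmqctl list_queues name messages`).

```python
import math


def desired_workers(queue_depth, tasks_per_worker_per_second, drain_seconds,
                    min_workers=1, max_workers=20):
    """Workers needed to drain the current backlog within drain_seconds.

    The result is clamped to [min_workers, max_workers] so that a sudden
    peak cannot start an unbounded number of machines.
    """
    if tasks_per_worker_per_second <= 0 or drain_seconds <= 0:
        raise ValueError("rates and drain time must be positive")
    capacity_per_worker = tasks_per_worker_per_second * drain_seconds
    needed = math.ceil(queue_depth / capacity_per_worker)
    return max(min_workers, min(max_workers, needed))


if __name__ == "__main__":
    # Hypothetical numbers: each worker handles 5 tasks/s and the backlog
    # should be gone within 60 seconds.
    for depth in (0, 1500, 6000, 100000):
        print(depth, desired_workers(depth, 5, 60))
```

Comparing the returned value with the number of workers currently running tells you whether to start new machines or shut idle ones down.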

One hint: remember not to store Celery task results in RabbitMQ.
