Thread pool versus many individual threads

I am stuck on a problem where I cannot decide which solution to take.

The problem is a bit unusual. Here is the setup: I receive data from the network continuously (2 to 4 times per second). Each piece of data belongs to a different group; let me call these groups group1, group2, and so on.

Each group has a dedicated job queue: data from the network is filtered and added to the queue of the corresponding group for processing.

At first, I created a dedicated thread for each group that takes data from the job queue, processes it, and then goes back into the blocked state (using the associated blocking queue).
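The dedicated-thread setup described above could be sketched roughly as follows. This is only an illustration, assuming Java and string jobs; the class `GroupWorker`, its methods, and the `STOP` marker are hypothetical names, not taken from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: one dedicated worker thread per group, draining its own queue.
class GroupWorker {
    private static final String STOP = "__STOP__"; // hypothetical shutdown marker

    private final BlockingQueue<String> jobs = new LinkedBlockingQueue<>();
    private final List<String> processed = new ArrayList<>(); // written only by the worker thread
    private final Thread thread;

    GroupWorker(String groupName) {
        thread = new Thread(this::runLoop, "worker-" + groupName);
        thread.start();
    }

    // Called by the network/filter side to hand a job to this group.
    void submit(String job) {
        jobs.add(job);
    }

    private void runLoop() {
        try {
            while (true) {
                String job = jobs.take(); // blocks while the queue is empty
                if (job.equals(STOP)) {
                    return;
                }
                processed.add(job); // stand-in for the real processing
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Lets the worker finish the queued jobs, waits for it, and returns them in processing order.
    List<String> stopAndGetProcessed() {
        jobs.add(STOP);
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed;
    }

    public static void main(String[] args) {
        GroupWorker group1 = new GroupWorker("group1");
        group1.submit("job1");
        group1.submit("job2");
        System.out.println(group1.stopAndGetProcessed()); // prints [job1, job2]
    }
}
```

Because a single thread drains a FIFO queue, jobs of one group are processed in arrival order, and the thread only blocks in `take()` when its queue is empty.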

But my senior suggested that I should use a thread pool instead, because that way threads will not sit blocked and can be used by other groups for processing.

But the fact is that the data arrives fast enough, and the processing takes long enough, that the thread may never actually go into the blocked state. A dedicated thread also guarantees that the data is processed sequentially (job 1 finishes before job 2), which with a pool, however small the chance, may not happen.

My senior is also convinced that pooling will save us a lot of memory, because the threads are POOLED (I think he just fell for the word ;)). I do not agree with this, because I personally think that, pooled or not, every thread gets its own stack memory. Unless there is something about thread pools that I don't know.

Lastly, I always thought that pooling helps when jobs arrive in large numbers over a short time. That makes sense, because spawning a thread per job would kill performance: creating the thread can take much longer than doing the job itself. So pooling helps a lot there.

But in my case, group1, group2, ..., groupN always stay alive. Whether there is data or not, they will still be there. So thread spawning is not the issue here.

My senior is still not convinced and wants me to go with the pool solution, because of its supposedly better memory footprint.

So, which way should I go?

Thanks.

+7
2 answers
Good question. As you said, pooling does save initialization time. But it has another aspect: resource management. So let me ask you this: how many groups (read: dedicated threads) do you have? Does their number grow dynamically at run time?

For example, consider a scenario where the answer to this question is yes: new group types are added dynamically. In that case you may not want to dedicate a thread to each of them. Since there is no limit on the number of groups created, you would create lots of threads, and the system would spend its time context switching instead of doing real work. Thread pool to the rescue: a thread pool lets you specify a limit on the maximum number of threads that can be created, regardless of load. The application may then deny service to certain requests, but those that get through are handled properly, without critically depleting system resources.
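A bounded pool of the kind described above might look like this in Java. It is a rough sketch using the standard `ThreadPoolExecutor`; the class `BoundedPoolDemo`, the method `countRejected`, and the chosen sizes are illustrative assumptions.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a pool with a hard cap on threads and on waiting jobs.
class BoundedPoolDemo {
    // Submits `tasks` jobs that all block until released, and returns how many were rejected.
    static int countRejected(int maxThreads, int queueCapacity, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads,                    // never more than maxThreads threads
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),   // bounded backlog of waiting jobs
                new ThreadPoolExecutor.AbortPolicy());     // refuse work beyond the limits
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < tasks; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await(); // simulate a long-running job
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++; // the "denied service" case
            }
        }
        release.countDown(); // let the accepted jobs finish
        pool.shutdown();
        return rejected;
    }

    public static void main(String[] args) {
        // 2 threads + 3 queue slots accept 5 of the 10 jobs.
        System.out.println(countRejected(2, 3, 10)); // prints 5
    }
}
```

With `AbortPolicy`, excess submissions fail fast with `RejectedExecutionException`; other policies, such as `CallerRunsPolicy`, trade rejection for back-pressure on the submitting thread.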

Given the above, it is quite likely that in your case it is perfectly fine to have a thread for each group!

The same goes for your senior's belief that pooling will save memory. Indeed, a thread takes up memory (mostly for its stack), but is it really that much if the number of threads is a predetermined amount, say 5? Even 10 is probably OK. In any case, you should not use pooling unless you are a priori and absolutely sure that you actually have a problem!

Pooling is a design decision, not an architectural one. You can leave pooling out at the beginning and proceed with that optimization later, if you find pooling beneficial after you have actually encountered a performance problem.

As for the serialization of requests (in-order execution), it does not matter whether you use a thread pool or a dedicated thread. Sequential execution is a property of a queue in combination with a single handler thread.
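To illustrate the point about ordering: in Java, an executor backed by exactly one thread runs jobs in submission order, so the same guarantee is available through the pool API. A minimal sketch, where `InOrderDemo` and `runInOrder` are hypothetical names:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a queue plus a single handler thread gives in-order execution.
class InOrderDemo {
    // Submits jobs 0..jobs-1 and returns the order in which they were executed.
    static List<Integer> runInOrder(int jobs) {
        ExecutorService single = Executors.newSingleThreadExecutor();
        List<Integer> done = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < jobs; i++) {
            final int id = i;
            single.execute(() -> done.add(id)); // stand-in for real processing
        }
        single.shutdown(); // already queued jobs still run
        try {
            if (!single.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("jobs did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(done);
    }

    public static void main(String[] args) {
        System.out.println(runInOrder(5)); // prints [0, 1, 2, 3, 4]
    }
}
```

With more than one worker thread sharing the same queue, jobs are still dequeued in order, but they may complete out of order.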

+2

Creating a thread consumes resources, including the default stack per thread (IIRC 512 KB, but configurable). So the advantage of pooling is that the resource cost you incur is bounded. Of course, you need to size your pool according to the work you have to do.
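As a rough illustration of the "configurable" part, assuming Java: the `Thread` constructor accepts a requested stack size, and the JVM-wide default can be changed with the `-Xss` option. The actual default depends on the platform and JVM, and the requested size is only a hint that the JVM may round or ignore. `StackSizeDemo` and the 256 KB figure are hypothetical choices.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: requesting a smaller stack for one worker thread.
class StackSizeDemo {
    // Starts a thread with the requested stack size and reports whether its job ran.
    static boolean runWithStackSize(long stackSizeBytes) {
        AtomicBoolean ran = new AtomicBoolean(false);
        // Thread(ThreadGroup, Runnable, String, long): the last argument is the requested stack size.
        Thread worker = new Thread(null, () -> ran.set(true), "group1-worker", stackSizeBytes);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ran.get();
    }

    public static void main(String[] args) {
        System.out.println(runWithStackSize(256 * 1024)); // prints true
    }
}
```

With a fixed, small number of group threads, the total stack reservation is roughly the thread count times the stack size, which is easy to estimate up front.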

For your specific problem, I think the key is to actually measure performance, thread usage and so on in each scenario. If you are not running into constraints, I probably would not worry either way, other than to make sure that you can swap one implementation for the other without significant impact on your application. Remember that premature optimization is the root of all evil. Note that:

"Premature optimization" is a phrase used to describe a situation where a programmer lets performance considerations affect the design of a piece of code. This can result in a design that is not as clean as it could have been, or in code that is incorrect, because the code is complicated by the optimization and the programmer is distracted by optimizing.
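One way to keep the two implementations swappable, as suggested above, is to hide the threading choice behind the standard `ExecutorService` interface. The following is only a sketch under that assumption; `GroupDispatcher`, `dedicated`, `pooled`, and the counter standing in for real processing are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: the dispatcher does not know whether groups share threads.
class GroupDispatcher {
    private final Function<String, ExecutorService> executorForGroup;
    private final Map<String, ExecutorService> executors = new ConcurrentHashMap<>();
    private final AtomicInteger processed = new AtomicInteger();

    GroupDispatcher(Function<String, ExecutorService> executorForGroup) {
        this.executorForGroup = executorForGroup;
    }

    // One dedicated single-thread executor per group: per-group ordering is preserved.
    static GroupDispatcher dedicated() {
        return new GroupDispatcher(group -> Executors.newSingleThreadExecutor());
    }

    // One shared fixed-size pool for all groups: bounded threads, no per-group ordering.
    static GroupDispatcher pooled(int threads) {
        ExecutorService shared = Executors.newFixedThreadPool(threads);
        return new GroupDispatcher(group -> shared);
    }

    void dispatch(String group, String data) {
        executors.computeIfAbsent(group, executorForGroup)
                 .execute(() -> processed.incrementAndGet()); // stand-in for processing `data`
    }

    // Shuts everything down, waits for queued jobs, and returns how many were processed.
    int shutdownAndCount() {
        for (ExecutorService e : executors.values()) {
            e.shutdown(); // calling shutdown() twice on a shared pool is harmless
        }
        try {
            for (ExecutorService e : executors.values()) {
                if (!e.awaitTermination(10, TimeUnit.SECONDS)) {
                    throw new IllegalStateException("jobs did not finish in time");
                }
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        GroupDispatcher dispatcher = GroupDispatcher.dedicated(); // or GroupDispatcher.pooled(4)
        dispatcher.dispatch("group1", "a");
        dispatcher.dispatch("group2", "b");
        System.out.println(dispatcher.shutdownAndCount()); // prints 2
    }
}
```

Switching strategies then changes one line at construction time, which makes it cheap to measure both variants before committing to either.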

+3