CPU-intensive thread wisdom

I want to run a batch of, say, 20 CPU-intensive jobs (each is basically a very long nested for loop) on one machine.

None of these 20 jobs shares data with any of the other 19.

If the machine has N cores, should I spin off N-1 of these jobs? Or N? Or should I just launch all 20 and let Windows figure out how to schedule them?

+4
3 answers

Unfortunately, there is no simple answer. The only way to know for sure is to implement and then profile your application.

Generally, for maximum throughput, if the jobs are purely CPU-bound, you want one per core. Depending on the type of work, that means either one per hyper-thread or only one per "true" physical core. (If the work is identical for all 20 jobs, hyper-threading often slows down the overall work ...)

If the jobs have any non-CPU functionality (for example, reading a file or waiting on something), then more than one work item per core tends to be much better. In many situations, this will improve throughput.
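
As a rough illustration of the "one per core" rule, here is a minimal Python sketch. The function name `pick_worker_count` and the `reserve` parameter are made up for this example; `os.cpu_count()` reports logical CPUs, so on a hyper-threaded machine it may be twice the number of physical cores, and you would pass `n_cores` yourself to try the physical count.

```python
import os

def pick_worker_count(n_jobs, n_cores=None, reserve=0):
    """Return how many jobs to run at once: one per usable core, capped at n_jobs.

    n_cores -- cores to assume; defaults to the logical CPU count
    reserve -- cores to leave free (reserve=1 gives the "N-1" option)
    """
    if n_cores is None:
        n_cores = os.cpu_count() or 1  # logical CPUs; may include hyper-threads
    usable = max(1, n_cores - reserve)  # always keep at least one worker
    return min(n_jobs, usable)
```

With 20 jobs on an 8-core machine this gives 8 workers, or 7 with `reserve=1`. Which of the two is faster is exactly what profiling has to decide.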
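
One common rule of thumb for sizing a pool when jobs spend part of their time waiting is cores × (1 + wait time / compute time). It is not from this answer, and the ratio has to be measured for your own jobs, so treat the sketch below as a starting guess only. The function name is hypothetical.

```python
import math

def workers_for_mixed_load(n_cores, wait_time, compute_time):
    """Heuristic worker count: n_cores * (1 + wait_time / compute_time), rounded up.

    wait_time and compute_time are per-job averages in the same unit
    (an assumption: you have measured them by profiling one job).
    """
    if compute_time <= 0:
        raise ValueError("compute_time must be positive")
    return max(1, math.ceil(n_cores * (1 + wait_time / compute_time)))
```

For purely CPU-bound jobs (`wait_time` of 0) this falls back to one worker per core; jobs that wait as long as they compute would get two per core.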

+5

Generally, if you are not sharing data, not blocking on I/O, and using a lot of CPU, and nothing else is running on the box (plus probably a few more caveats), using all of the CPU (e.g. N threads) is probably the best idea.

The best choice is probably to tweak it and profile it and see what happens.

+3

You should use some type of thread pool, so it's (reasonably) easy to tune the number of threads without affecting the structure of the program.

Once you have done that, it is simple enough to test to find a sufficiently optimal number of threads relative to the number of available processors. Chances are that even if the jobs look like they should be purely CPU-bound, you will get better efficiency with more than N threads, but about the only way to be sure is to test.
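
A minimal sketch of that approach in Python, under a few assumptions: `cpu_job` is a made-up stand-in for one of the 20 jobs, and a process pool stands in for the thread pool, because in standard CPython the global interpreter lock generally keeps pure-Python threads from running CPU-bound code in parallel. The worker count is a single parameter, so trying N-1, N, or more than N does not change the structure of the program.

```python
import time
from concurrent.futures import ProcessPoolExecutor

def cpu_job(n):
    """Hypothetical stand-in for one independent job: a long nested loop."""
    total = 0
    for i in range(n):
        for j in range(100):
            total += i * j
    return total

def run_batch(sizes, workers, executor_cls=ProcessPoolExecutor):
    """Run one job per entry in `sizes` on a pool of `workers`.

    Returns (results in input order, elapsed wall-clock seconds).
    """
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(cpu_job, sizes))
    return results, time.perf_counter() - start

def compare_worker_counts(sizes, candidates, executor_cls=ProcessPoolExecutor):
    """Time the same batch once per candidate worker count."""
    timings = {}
    for workers in candidates:
        _, elapsed = run_batch(sizes, workers, executor_cls)
        timings[workers] = elapsed
    return timings
```

For example, `compare_worker_counts([200_000] * 20, [N - 1, N, N + 2, 20])` with N set to your core count times all four options from the question. On Windows, make that call under an `if __name__ == "__main__":` guard, since process pools start their workers by re-importing the main module. Repeat the runs a few times before trusting small differences.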

+2
