Can memset be parallelized across 4 cores?

I am not sure about this: can I split a large memset (e.g. 10 MB) across four cores to speed it up?

Do the RAM chips allow this kind of parallelization at all, and how long does it take to start the other threads: more than a millisecond, or less?

+6
1 answer

You are asking the right question, but it is hard to give a simple answer. Several factors are involved.

  • The overhead of launching new threads (or taking them from some pool);
  • Contention on the memory bus.

These two costs are independent of each other, and each varies from platform to platform.
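The first of these costs can be measured rather than guessed. Below is a minimal sketch, assuming a POSIX system with pthreads; the names `noop_worker` and `thread_spawn_join_seconds` are made up for illustration. It times one create-plus-join cycle of a thread that does nothing, which gives a rough upper bound on the startup cost the question asks about.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime */
#endif
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: the thread body does no work, so only the
 * thread machinery itself is being timed. */
static void *noop_worker(void *arg)
{
    return arg;
}

/* Seconds taken to create one thread and join it, or -1.0 if the
 * thread could not be created. */
double thread_spawn_join_seconds(void)
{
    struct timespec t0, t1;
    pthread_t tid;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (pthread_create(&tid, NULL, noop_worker, NULL) != 0)
        return -1.0;
    pthread_join(tid, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Call it several times and ignore the first result, since the first thread creation may include one-time setup. Comparing the number against 0.001 answers the "more or less than a millisecond" part for your own platform. Build with `-pthread`.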

Larger PCs have several memory buses (channels); smaller ones have only one. On a single-bus system, parallelizing makes no sense. If your system has several memory channels, your array may be distributed across the memory banks in an arbitrary way. If the whole array happens to sit in one memory bank, parallelization is useless. Finding out how your array is laid out is yet more overhead. In other words, before splitting the operation across cores, you first have to work out whether it is worth doing at all.

The simple answer is that these hard-to-predict overheads are likely to eat up any gain and make the overall result worse.

That said, for a really huge memory region on some architectures, it can make sense.
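If you want to try it on your own hardware, here is a minimal sketch of one way to split the fill, assuming POSIX pthreads. The names `parallel_memset`, `fill_worker`, `struct fill_job` and the cap `MAX_FILL_THREADS` are hypothetical, not part of any library. The buffer is cut into equal chunks, helper threads fill all but the last one, and the calling thread fills the last chunk plus any remainder.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILL_THREADS 16  /* arbitrary cap chosen for this sketch */

struct fill_job {
    unsigned char *start;  /* first byte of this thread's chunk */
    size_t len;            /* number of bytes in the chunk */
    int value;             /* fill byte, as passed to memset */
};

static void *fill_worker(void *arg)
{
    struct fill_job *job = arg;
    memset(job->start, job->value, job->len);
    return NULL;
}

/* Fill len bytes at dst with value using up to nthreads threads
 * (the caller counts as one of them). Returns 0 on success, or -1 if
 * a helper thread could not be created; the buffer is still filled
 * completely in that case. */
int parallel_memset(void *dst, int value, size_t len, unsigned nthreads)
{
    pthread_t tids[MAX_FILL_THREADS];
    struct fill_job jobs[MAX_FILL_THREADS];
    unsigned char *p = dst;
    unsigned started = 0;
    size_t offset = 0;
    int rc = 0;

    assert(dst != NULL);
    if (nthreads < 1) nthreads = 1;
    if (nthreads > MAX_FILL_THREADS) nthreads = MAX_FILL_THREADS;

    size_t chunk = len / nthreads;

    /* Helper threads take the first nthreads-1 chunks. */
    for (unsigned i = 0; i + 1 < nthreads; i++) {
        jobs[i].start = p + offset;
        jobs[i].len = chunk;
        jobs[i].value = value;
        if (pthread_create(&tids[i], NULL, fill_worker, &jobs[i]) != 0) {
            rc = -1;  /* fall back: the caller fills everything left */
            break;
        }
        started++;
        offset += chunk;
    }

    /* The calling thread fills the final chunk plus any remainder. */
    memset(p + offset, value, len - offset);

    for (unsigned i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return rc;
}
```

Time `parallel_memset(buf, 0, 10u * 1024 * 1024, 4)` against a plain `memset` of the same buffer. As the answer above says, whether the threaded version wins depends on thread startup cost and on how many memory channels the machine has, so this is an experiment to run, not a guaranteed speedup. Build with `-pthread`.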

+2

Source: https://habr.com/ru/post/927532/

