Consider an MPI application built around two steps, which we will call load and globalReduce. The software is described this way only for simplicity; much more is going on, so this is not purely a Map/Reduce problem.
During the load step, all ranks on a given node are queued, so that one and only one rank at a time has full access to all of the node's memory. The reason for this design is that the load step reads a set of large I/O blocks, and all of them must be in memory before the local reduction can take place. The result of this local reduction is a variable called myRankVector. Once myRankVector has been obtained, the I/O blocks are freed. myRankVector itself uses little memory, so while it is being built the rank may use all of the node's memory, but once the rank has finished it needs only 2-3 GB to store myRankVector.
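To make the pattern concrete, here is a minimal sketch of what one rank does during load. The names IOBlock, loadBlocks and loadStep are hypothetical stand-ins, the blocks are filled with dummy values instead of being read from disk, and the reduction is just a per-block sum; the real code is considerably more involved.

```cpp
#include <cstddef>
#include <memory>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for one large I/O block read during load.
struct IOBlock {
    std::vector<double> data;
};

// Hypothetical loader: the real code reads from disk, this one only
// fills block b with the value b + 1 so the result is easy to check.
std::vector<std::shared_ptr<IOBlock>> loadBlocks(std::size_t nBlocks,
                                                 std::size_t blockSize) {
    std::vector<std::shared_ptr<IOBlock>> blocks;
    blocks.reserve(nBlocks);
    for (std::size_t b = 0; b < nBlocks; ++b) {
        auto block = std::make_shared<IOBlock>();
        block->data.assign(blockSize, static_cast<double>(b + 1));
        blocks.push_back(std::move(block));
    }
    return blocks;
}

// Load step for one rank: hold every block in memory, reduce them
// locally, then let the blocks go out of scope.
std::vector<double> loadStep(std::size_t nBlocks, std::size_t blockSize) {
    std::vector<double> myRankVector;
    {
        auto blocks = loadBlocks(nBlocks, blockSize);
        myRankVector.reserve(blocks.size());
        for (const auto& block : blocks) {
            // Assumed reduction: one sum per block.
            myRankVector.push_back(
                std::accumulate(block->data.begin(), block->data.end(), 0.0));
        }
    } // All shared_ptr owners die here, so every IOBlock destructor runs.
    return myRankVector;
}
```

After loadStep returns, only the small myRankVector is still owned by the program, which is the point at which I would expect the rank's memory footprint to drop.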
During the globalReduce step, all ranks in a node are expected to have loaded their respective myRankVector.
So here is my problem: although I have made sure there are no memory leaks (I program with shared pointers, I double-checked with Valgrind, etc.), I am sure that the heap remains expanded even after all the destructors have freed the I/O blocks. When the next rank in the queue starts its work, it requests a lot of memory just as the previous rank did, and of course Linux kills the program with "Out of memory: Kill process xxx (xxxxxxxx) score xxxx or sacrifice a child". It is clear why: the second rank in the queue wants to use all the memory, but the first rank is still holding on to a large heap.
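For anyone who wants to observe the symptom, one way to check how much memory a rank still holds is to read its resident set size before and after the I/O blocks are destroyed. The helper below is only a sketch and is Linux-specific: residentBytes is a hypothetical name, and it relies on the second field of /proc/self/statm being the number of resident pages.

```cpp
#include <fstream>
#include <unistd.h>

// Resident set size of the calling process in bytes, read from
// /proc/self/statm (Linux only). Returns 0 if the file cannot be parsed.
long residentBytes() {
    std::ifstream statm("/proc/self/statm");
    long totalPages = 0;
    long residentPages = 0;
    // The first field is the total program size and the second is the
    // resident set size, both counted in pages.
    if (!(statm >> totalPages >> residentPages)) {
        return 0;
    }
    return residentPages * sysconf(_SC_PAGESIZE);
}
```

Calling it once while the blocks are alive and once after they have been destroyed shows whether the freed memory actually went back to the operating system.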
So, having set up the context, my question is: is there a way to manually shrink the heap in C++ so that memory that is no longer in use is really released?
Thanks.
c++ heap memory mpi
Jose Garcia