Some program I'm working on now consumes a lot more memory than I think. So I'm trying to understand how glibc malloc cropping works. I wrote the following test:
#include <malloc.h> #include <unistd.h> #define NUM_CHUNKS 1000000 #define CHUNCK_SIZE 100 int main() { // disable fast bins mallopt(M_MXFAST, 0); void** array = (void**)malloc(sizeof(void*) * NUM_CHUNKS); // allocating memory for(unsigned int i = 0; i < NUM_CHUNKS; i++) { array[i] = malloc(CHUNCK_SIZE); } // releasing memory ALMOST all memory for(unsigned int i = 0; i < NUM_CHUNKS - 1 ; i++) { free(array[i]); } // when enabled memory consumption reduces //int ret = malloc_trim(0); //printf("ret=%d\n", ret); malloc_stats(); sleep(100000); }
Test output (without calling malloc_trim):
Arena 0: system bytes = 112054272 in use bytes = 112 Total (incl. mmap): system bytes = 120057856 in use bytes = 8003696 max mmap regions = 1 max mmap bytes = 8003584
Although almost all of the memory has been released, this test code consumes much more resident memory than expected:
[ root@node0-b3 ]
Processes:
0245e000-08f3b000 rw-p 00000000 00:00 0 [heap] Size: 109428 kB Rss: 109376 kB Pss: 109376 kB Shared_Clean: 0 kB Shared_Dirty: 0 kB Private_Clean: 0 kB Private_Dirty: 109376 kB Referenced: 109376 kB Anonymous: 109376 kB AnonHugePages: 0 kB Swap: 0 kB KernelPageSize: 4 kB MMUPageSize: 4 kB Locked: 0 kB VmFlags: rd wr mr mw me ac 7f1c60720000-7f1c60ec2000 rw-p 00000000 00:00 0 Size: 7816 kB Rss: 7816 kB Pss: 7816 kB Shared_Clean: 0 kB Shared_Dirty: 0 kB Private_Clean: 0 kB Private_Dirty: 7816 kB Referenced: 7816 kB Anonymous: 7816 kB AnonHugePages: 0 kB Swap: 0 kB KernelPageSize: 4 kB MMUPageSize: 4 kB Locked: 0 kB
When I turn on the malloc_trim call, the test result remains almost the same:
ret=1 Arena 0: system bytes = 112001024 in use bytes = 112 Total (incl. mmap): system bytes = 120004608 in use bytes = 8003696 max mmap regions = 1 max mmap bytes = 8003584
However, RSS is significantly reduced:
[ root@node0-b3 ]
Process Processing (after malloc_trim):
01698000-08168000 rw-p 00000000 00:00 0 [heap] Size: 109376 kB Rss: 8 kB Pss: 8 kB Shared_Clean: 0 kB Shared_Dirty: 0 kB Private_Clean: 0 kB Private_Dirty: 8 kB Referenced: 8 kB Anonymous: 8 kB AnonHugePages: 0 kB Swap: 0 kB KernelPageSize: 4 kB MMUPageSize: 4 kB Locked: 0 kB VmFlags: rd wr mr mw me ac 7f508122a000-7f50819cc000 rw-p 00000000 00:00 0 Size: 7816 kB Rss: 7816 kB Pss: 7816 kB Shared_Clean: 0 kB Shared_Dirty: 0 kB Private_Clean: 0 kB Private_Dirty: 7816 kB Referenced: 7816 kB Anonymous: 7816 kB AnonHugePages: 0 kB Swap: 0 kB KernelPageSize: 4 kB MMUPageSize: 4 kB Locked: 0 kB
After calling malloc_trim, the heap is clamped. I assume that the 8MB mmap segment is still available due to the last piece of memory that has not been released.
Why is heap tuning not performed automatically by malloc? Is there a way to configure malloc so that cropping is done automatically (when it could save most of the memory)?
I am using glibc version 2.17.