Understanding glibc malloc trim

A program I'm working on consumes far more memory than I expect, so I'm trying to understand how glibc malloc trimming works. I wrote the following test:

    #include <malloc.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define NUM_CHUNKS 1000000
    #define CHUNK_SIZE 100

    int main()
    {
        // disable fast bins
        mallopt(M_MXFAST, 0);

        void** array = (void**)malloc(sizeof(void*) * NUM_CHUNKS);

        // allocating memory
        for(unsigned int i = 0; i < NUM_CHUNKS; i++)
        {
            array[i] = malloc(CHUNK_SIZE);
        }

        // releasing ALMOST all memory (the last chunk stays allocated)
        for(unsigned int i = 0; i < NUM_CHUNKS - 1; i++)
        {
            free(array[i]);
        }

        // when enabled, memory consumption is reduced
        //int ret = malloc_trim(0);
        //printf("ret=%d\n", ret);

        malloc_stats();

        sleep(100000);
    }

Test output (without calling malloc_trim):

    Arena 0:
    system bytes = 112054272
    in use bytes = 112
    Total (incl. mmap):
    system bytes = 120057856
    in use bytes = 8003696
    max mmap regions = 1
    max mmap bytes = 8003584

Although almost all of the memory has been released, this test code consumes much more resident memory than expected:

    [ root@node0-b3 ]# ps aux | grep test
    root 14662 1.8 0.4 129736 **118024** pts/10 S 20:19 0:00 ./test

Process smaps:

    0245e000-08f3b000 rw-p 00000000 00:00 0 [heap]
    Size: 109428 kB
    Rss: 109376 kB
    Pss: 109376 kB
    Shared_Clean: 0 kB
    Shared_Dirty: 0 kB
    Private_Clean: 0 kB
    Private_Dirty: 109376 kB
    Referenced: 109376 kB
    Anonymous: 109376 kB
    AnonHugePages: 0 kB
    Swap: 0 kB
    KernelPageSize: 4 kB
    MMUPageSize: 4 kB
    Locked: 0 kB
    VmFlags: rd wr mr mw me ac
    7f1c60720000-7f1c60ec2000 rw-p 00000000 00:00 0
    Size: 7816 kB
    Rss: 7816 kB
    Pss: 7816 kB
    Shared_Clean: 0 kB
    Shared_Dirty: 0 kB
    Private_Clean: 0 kB
    Private_Dirty: 7816 kB
    Referenced: 7816 kB
    Anonymous: 7816 kB
    AnonHugePages: 0 kB
    Swap: 0 kB
    KernelPageSize: 4 kB
    MMUPageSize: 4 kB
    Locked: 0 kB

When I enable the malloc_trim call, the malloc_stats output remains almost the same:

    ret=1
    Arena 0:
    system bytes = 112001024
    in use bytes = 112
    Total (incl. mmap):
    system bytes = 120004608
    in use bytes = 8003696
    max mmap regions = 1
    max mmap bytes = 8003584

However, RSS is significantly reduced:

    [ root@node0-b3 ]# ps aux | grep test
    root 15733 0.6 0.0 129688 **8804** pts/10 S 20:20 0:00 ./test

Process smaps (after malloc_trim):

    01698000-08168000 rw-p 00000000 00:00 0 [heap]
    Size: 109376 kB
    Rss: 8 kB
    Pss: 8 kB
    Shared_Clean: 0 kB
    Shared_Dirty: 0 kB
    Private_Clean: 0 kB
    Private_Dirty: 8 kB
    Referenced: 8 kB
    Anonymous: 8 kB
    AnonHugePages: 0 kB
    Swap: 0 kB
    KernelPageSize: 4 kB
    MMUPageSize: 4 kB
    Locked: 0 kB
    VmFlags: rd wr mr mw me ac
    7f508122a000-7f50819cc000 rw-p 00000000 00:00 0
    Size: 7816 kB
    Rss: 7816 kB
    Pss: 7816 kB
    Shared_Clean: 0 kB
    Shared_Dirty: 0 kB
    Private_Clean: 0 kB
    Private_Dirty: 7816 kB
    Referenced: 7816 kB
    Anonymous: 7816 kB
    AnonHugePages: 0 kB
    Swap: 0 kB
    KernelPageSize: 4 kB
    MMUPageSize: 4 kB
    Locked: 0 kB

After calling malloc_trim, the heap's resident memory shrinks to almost nothing. I assume the 8MB mmap segment is still there because of the last chunk of memory, which has not been released.

Why isn't heap trimming performed automatically by malloc? Is there a way to configure malloc so that trimming is done automatically (when it could save this much memory)?

I am using glibc version 2.17.

1 answer

Largely for historical reasons, memory for small allocations comes from a pool managed with the brk system call. This is a very old system call, at least as old as Version 6 Unix, and the only thing it can do is change the size of an arena whose location in memory is fixed. This means that the brk pool cannot shrink past a block that is still allocated.

Your program allocates N blocks of memory and then frees N-1 of them. The only block that it does not free is the one at the highest address. This is the worst-case scenario for brk: the size cannot be reduced at all, even though 99.99% of the pool is unused! If you change your program so that the block it does not free is array[0] instead of array[NUM_CHUNKS-1], you should see both RSS and address space shrink on the final call to free.

When you explicitly call malloc_trim, it tries to work around this limitation using a Linux extension, madvise(MADV_DONTNEED), which releases physical RAM but not address space (as you noticed). I do not know why this only happens when malloc_trim is called explicitly.

By the way, the 8MB mmap segment holds your initial allocation of array.
