Zeroing a large memory mapping with `madvise`

I have the following problem:

I allocate a large chunk of memory (multiple GiB) via mmap with MAP_ANONYMOUS. This chunk holds a large hash map that needs to be zeroed from time to time. Not all of the mapping may be used in each round (not every page is faulted in), so memset is not a good idea: it takes too long.

What is the best strategy to do this quickly?

Will

 madvise(ptr, length, MADV_DONTNEED); 

guarantee that any subsequent accesses get fresh, zero-filled pages?

From the Linux madvise man page:

This call does not influence the semantics of the application (except in the case of MADV_DONTNEED), but may influence its performance. The kernel is free to ignore the advice.

...

MADV_DONTNEED

Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents from the underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file.

...

The current Linux implementation (2.4.0) views this system call more as a command than as advice ...

Or do I have to munmap the region and map it again?

It should work on Linux and ideally have the same behavior on OS X.

+7
c virtual-memory mmap
3 answers

There is a much easier solution to your problem, which is quite portable:

 mmap(ptr, length, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); 

Since MAP_FIXED is allowed to fail for fairly arbitrary implementation-specific reasons, falling back to memset if mmap returns MAP_FAILED would be advisable.
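A minimal sketch of this approach with the memset fallback, assuming ptr is page-aligned and the range lies inside an existing private anonymous mapping. The helper name reset_region and its return convention are hypothetical, chosen only for illustration:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: reset [ptr, ptr + len) to zero.
 * Assumes ptr is page-aligned and the range belongs to an existing
 * private anonymous mapping owned by the caller.
 * Returns 0 if the range was remapped, 1 if it fell back to memset. */
static int reset_region(void *ptr, size_t len)
{
    assert(ptr != NULL && len > 0);

    /* MAP_FIXED replaces the old pages at the same address with a
     * fresh anonymous mapping, which reads as zero on first access. */
    void *p = mmap(ptr, len, PROT_READ | PROT_WRITE,
                   MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        /* Slow path: touches every page, but always works. */
        memset(ptr, 0, len);
        return 1;
    }
    return 0;
}
```

Calling reset_region(ptr, length) between rounds leaves the address unchanged, so pointers into the hash map stay valid.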

+7

This madvise behavior is certainly not standard, so it will not be portable.

If the part you want to reset is at the end of your mapping, you could get away with ftruncate. You would have to introduce one more step:

  • shm_open to get a "persistent" file descriptor for your data
  • ftruncate to the needed size
  • mmap this fd

Then you can always

  • munmap
  • ftruncate to something short
  • ftruncate to the actual length you need
  • mmap again

and then the part that you "remapped" will be initialized to zero.
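The sequence above could be sketched roughly as follows. The helper names shm_create and shm_reset_tail, the object name passed in, and the choice of MAP_SHARED are assumptions made for illustration and are not part of the answer. Some older glibc versions need -lrt at link time for shm_open.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a shared memory object of len bytes.
 * The name is unlinked right away; the descriptor stays valid.
 * Returns the descriptor, or -1 on failure. */
static int shm_create(const char *name, off_t len)
{
    int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    shm_unlink(name);
    if (ftruncate(fd, len) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: zero the bytes from keep up to len by
 * unmapping, shrinking the object, growing it back and mapping again.
 * Pass old == NULL for the first mapping.
 * Returns the new mapping, or MAP_FAILED on error. */
static void *shm_reset_tail(int fd, void *old, size_t keep, size_t len)
{
    assert(keep <= len);
    if (old != NULL && munmap(old, len) != 0)
        return MAP_FAILED;
    if (ftruncate(fd, (off_t)keep) != 0)   /* drop the tail */
        return MAP_FAILED;
    if (ftruncate(fd, (off_t)len) != 0)    /* regrow; new bytes read as zero */
        return MAP_FAILED;
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

Note that the new mapping may land at a different address, so any pointers into the old mapping have to be recomputed.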

But also keep in mind that the system still has to zero the pages. This may be a little more efficient than the inline code your compiler generates for memset, but that is not certain.

+1

On Linux, you can rely on MADV_DONTNEED on an anonymous mapping zeroing the mapping. This is not portable, though - madvise() itself is not standardized. posix_madvise() is standardized, but POSIX_MADV_DONTNEED does not have the same behavior as the Linux MADV_DONTNEED flag - posix_madvise() is always advisory and does not affect application semantics.
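A minimal Linux-only sketch of this, assuming a private anonymous mapping and a page-aligned ptr. The helper name zero_with_madvise is hypothetical:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* exposes madvise() and MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper, Linux only: drop the pages backing
 * [ptr, ptr + len) of a private anonymous mapping. The next access to
 * any page in the range faults in a fresh zero-filled page.
 * Assumes ptr is page-aligned.
 * Returns 0 on success, -1 with errno set on failure. */
static int zero_with_madvise(void *ptr, size_t len)
{
    assert(ptr != NULL && len > 0);
    return madvise(ptr, len, MADV_DONTNEED);
}
```

Pages that were never faulted in cost almost nothing here, which matches the case in the question where only part of the mapping is used in each round. If the call fails, falling back to memset keeps the result correct.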

+1
