Why does a Java memory-mapped buffer cause massive, unexpected disk I/O?

I have written several POSIX programs that use a memory-mapped file buffer. One simple scenario is to map a 1 GB file into memory and fill the entire file with content.

At run time there was barely any disk I/O until an msync or munmap call was made.

On the exact same system, I wrote an equivalent Java program running on Oracle JDK 7 and noticed a huge amount of disk I/O throughout the program's entire run.

How is a memory-mapped buffer implemented differently in the JVM? And is it still possible to postpone the bulk of the I/O?

The operating system is Linux 3.2 x64.

The code:

    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class Main {
        public static void main(String[] args) throws Exception {
            long size = 1024 * 1048576; // 1 GB
            RandomAccessFile raf = new RandomAccessFile("mmap1g", "rw");
            FileChannel fc = raf.getChannel();
            MappedByteBuffer buf = fc.map(FileChannel.MapMode.READ_WRITE, 0, size);
            // Fill the whole mapped region one byte at a time.
            for (long i = 0; i < size; ++i)
                buf.put((byte) 1);
        }
    }
1 answer

Memory mapping is handled entirely by the OS. The JVM has no say in how the mapping is flushed to disk, apart from calling force() or opening the file in "rws" mode.
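A minimal sketch of those two options, with a made-up file name and a smaller placeholder size: force() writes the mapping's dirty pages back to the file (roughly the JVM-level counterpart of msync), while opening the file with "rws" asks for synchronous writes of content and metadata.

    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class FlushControl {
        public static void main(String[] args) throws Exception {
            long size = 64 * 1048576; // 64 MB, a placeholder size
            // "rw"  -> the OS decides when dirty pages reach the disk.
            // "rws" -> every update is written synchronously to the device instead.
            RandomAccessFile raf = new RandomAccessFile("mmap-demo", "rw");
            FileChannel fc = raf.getChannel();
            MappedByteBuffer buf = fc.map(FileChannel.MapMode.READ_WRITE, 0, size);
            for (long i = 0; i < size; ++i)
                buf.put((byte) 1);
            // The one explicit flush the JVM offers for a mapping:
            buf.force();
            raf.close();
        }
    }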

Linux flushes dirty pages to disk based on kernel parameters you can inspect with sysctl.

    $ sysctl -a | grep dirty
    vm.dirty_background_bytes = 0
    vm.dirty_background_ratio = 10
    vm.dirty_bytes = 0
    vm.dirty_expire_centisecs = 3000
    vm.dirty_ratio = 20
    vm.dirty_writeback_centisecs = 500

These are the default settings on my laptop. A background ratio of 10 means the kernel starts writing data to disk in the background once 10% of main memory is dirty. A ratio of 20 means the writing process itself is stalled until the amount of dirty memory falls back below 20%. In any case, dirty data will be written to disk after 3000 centiseconds, i.e. 30 seconds.


An interesting comparison is to map the file on a tmpfs filesystem instead. I have /tmp mounted as tmpfs, but most systems have /dev/shm.
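As a sketch of that comparison, and assuming /dev/shm is a tmpfs mount on your system, the only change to the question's program is the path of the backing file:

    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class MainTmpfs {
        public static void main(String[] args) throws Exception {
            long size = 1024 * 1048576; // 1 GB
            // The backing file lives on tmpfs, so dirty pages stay in RAM and
            // filling the buffer should produce no block-device I/O.
            RandomAccessFile raf = new RandomAccessFile("/dev/shm/mmap1g", "rw");
            FileChannel fc = raf.getChannel();
            MappedByteBuffer buf = fc.map(FileChannel.MapMode.READ_WRITE, 0, size);
            for (long i = 0; i < size; ++i)
                buf.put((byte) 1);
        }
    }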


BTW: You may find this class interesting. MemoryStore lets you map in any amount of memory, i.e. well over 2 GB, and perform thread-safe operations on it. For example, you can share memory between processes. It supports off-heap locking, volatile read/write, ordered writes, and CAS.

I have a test where two processes lock, toggle, and unlock records, and the latency is 50 ns on average on my laptop.
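MemoryStore's own API is not reproduced here, but the mechanism it builds on is simple to sketch: two JVMs that map the same file see each other's writes. The file name below is arbitrary, and the plain increment is not atomic, which is exactly what the locking/CAS support is for.

    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class SharedCounter {
        public static void main(String[] args) throws Exception {
            // Run this class in two terminals at once; both map the same 8 bytes.
            RandomAccessFile raf = new RandomAccessFile("/tmp/shared-counter", "rw");
            MappedByteBuffer buf = raf.getChannel()
                                      .map(FileChannel.MapMode.READ_WRITE, 0, 8);
            for (int i = 0; i < 10; i++) {
                long v = buf.getLong(0) + 1; // picks up the other process's updates
                buf.putLong(0, v);           // plain write: not atomic, illustration only
                System.out.println("counter = " + v);
                Thread.sleep(500);
            }
            raf.close();
        }
    }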

BTW2: Linux supports sparse files, which means you can map in regions not only larger than your main memory but larger than your free disk space. For example, if you map in an 8 TB file and only touch random 4 GB pieces of it, it will use up to 4 GB of memory and 4 GB of disk. With du {file} you can see how much space is actually used. Note: lazy allocation of disk space can lead to highly fragmented files, which can be a performance problem on a spinning hard disk.
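A minimal sketch of this, with an invented file name and a deliberately smaller 1 TB logical size: setLength() creates a sparse file on the usual Linux filesystems, and only the pages you actually touch consume memory or disk.

    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class SparseMap {
        public static void main(String[] args) throws Exception {
            long oneTerabyte = 1L << 40;
            RandomAccessFile raf = new RandomAccessFile("sparse-demo", "rw");
            raf.setLength(oneTerabyte); // logical size 1 TB, almost nothing allocated

            // A single MappedByteBuffer is limited to 2 GB, so map a 1 MB window
            // near the end of the file rather than the whole thing.
            FileChannel fc = raf.getChannel();
            MappedByteBuffer window = fc.map(FileChannel.MapMode.READ_WRITE,
                                             oneTerabyte - (1 << 20), 1 << 20);
            window.put(0, (byte) 1); // touching one page allocates one block

            // "ls -l sparse-demo" reports 1 TB; "du -h sparse-demo" reports a few KB.
            raf.close();
        }
    }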
