Java NIO - memory mapped files

I recently came across this article, which provides a nice introduction to memory mapped files and how they can be shared between two processes. Here is the code for the process that reads the file:

    import java.io.File;
    import java.io.FileNotFoundException;
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class MemoryMapReader {

        /**
         * @param args
         * @throws IOException
         * @throws FileNotFoundException
         * @throws InterruptedException
         */
        public static void main(String[] args) throws FileNotFoundException, IOException, InterruptedException {
            FileChannel fc = new RandomAccessFile(new File("c:/tmp/mapped.txt"), "rw").getChannel();
            long bufferSize = 8 * 1000;
            MappedByteBuffer mem = fc.map(FileChannel.MapMode.READ_ONLY, 0, bufferSize);
            long oldSize = fc.size();
            long currentPos = 0;
            long xx = currentPos;
            long startTime = System.currentTimeMillis();
            long lastValue = -1;
            for (;;) {
                while (mem.hasRemaining()) {
                    lastValue = mem.getLong();
                    currentPos += 8;
                }
                if (currentPos < oldSize) {
                    xx = xx + mem.position();
                    mem = fc.map(FileChannel.MapMode.READ_ONLY, xx, bufferSize);
                    continue;
                } else {
                    long end = System.currentTimeMillis();
                    long tot = end - startTime;
                    System.out.println(String.format("Last Value Read %s , Time(ms) %s ", lastValue, tot));
                    System.out.println("Waiting for message");
                    while (true) {
                        long newSize = fc.size();
                        if (newSize > oldSize) {
                            oldSize = newSize;
                            xx = xx + mem.position();
                            mem = fc.map(FileChannel.MapMode.READ_ONLY, xx, oldSize - xx);
                            System.out.println("Got some data");
                            break;
                        }
                    }
                }
            }
        }
    }

However, I have a few comments / questions regarding this approach:

If we run only the reader, against an empty file, i.e. run

    long bufferSize = 8 * 1000;
    MappedByteBuffer mem = fc.map(FileChannel.MapMode.READ_ONLY, 0, bufferSize);
    long oldSize = fc.size();

then 8000 bytes are mapped, which extends the file to that size. The returned buffer has a limit of 8000 and a position of 0, so the reader goes ahead and reads empty data. After that, the reader stops, because currentPos == oldSize.
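The extension described above can be checked directly. The following is a minimal sketch, not part of the article's code: the class name MapExtendsFileDemo, the helper mapAndInspect and the temporary file are made up for illustration. It maps 8000 bytes of an empty file through a channel opened in "rw" mode and reports the file size before and after the map call, plus the first long read from the buffer. The extension on map is what I would expect from OpenJDK-based JVMs; treat it as an assumption to verify rather than a documented guarantee.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

class MapExtendsFileDemo {

    /** Maps the first `bytes` bytes read-only and returns {sizeBefore, sizeAfter, firstLong}. */
    static long[] mapAndInspect(Path file, long bytes) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw");
             FileChannel fc = raf.getChannel()) {
            long sizeBefore = fc.size();
            // Same call as in the reader: READ_ONLY mapping on a channel opened "rw".
            MappedByteBuffer mem = fc.map(FileChannel.MapMode.READ_ONLY, 0, bytes);
            long sizeAfter = fc.size();
            long firstLong = mem.getLong(0); // absolute read, does not move the position
            return new long[] {sizeBefore, sizeAfter, firstLong};
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mapped", ".bin"); // hypothetical scratch file
        tmp.toFile().deleteOnExit();
        long[] r = mapAndInspect(tmp, 8 * 1000);
        System.out.println("size before=" + r[0] + ", size after=" + r[1] + ", first long=" + r[2]);
    }
}
```

If the size after the call equals the requested length, the reader has no way to tell from the file size alone whether the bytes it sees were written by anyone.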

Suppose the writer now comes in (its code is omitted, since most of it is straightforward and can be found on the site). It uses the same buffer size, so it writes the first 8000 bytes and then maps another 8000, extending the file. If the writer pauses at this point and we go back to the reader, the reader sees the new file size, maps the remainder (from position 8000 to 16000) and starts reading again, reading in more garbage ...

I am a little confused about whether there is a way to synchronize these two operations. As far as I can see, any call to map may extend the file with what is really an empty buffer (filled with zeros), or the writer may have just extended the file without having written anything to it yet ...

3 answers

There are several ways.

  • Let the writer acquire an exclusive Lock on the region that has not been written yet, and release the lock when everything has been written. This is compatible with any other application running on the system, but it requires the reader to be smart enough to retry on failed reads, unless you combine it with one of the other methods.

  • Use another communication channel, for example a pipe, a socket, or the file's metadata channel, to let the writer tell the reader about the completed write.

  • Write a special marker at a known position in the file (as part of the protocol) that tells the reader about the written data, for example

     MappedByteBuffer bb;
     …
     // write your data
     bb.force(); // ensure completion of all writes
     bb.put(specialPosition, specialMarkerValue);
     bb.force(); // ensure visibility of the marker
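For the first option in the list, a rough sketch of the locking might look like the following. The class LockedRegionDemo and its two helpers are invented for illustration, and the 8-byte region is an arbitrary choice. The locks are only advisory between cooperating processes on most systems, and Java's FileLock is held on behalf of the whole JVM, so this does not coordinate threads inside one process.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

class LockedRegionDemo {

    /** Writer side: hold an exclusive lock on the region until it is fully written. */
    static void writeLocked(FileChannel fc, long offset, long value) throws IOException {
        try (FileLock lock = fc.lock(offset, Long.BYTES, false)) { // false = exclusive, blocks
            MappedByteBuffer mem = fc.map(FileChannel.MapMode.READ_WRITE, offset, Long.BYTES);
            mem.putLong(0, value);
            mem.force(); // push the write out before the lock is released
        }
    }

    /** Reader side: returns null if the region is currently locked by another process. */
    static Long tryReadLocked(FileChannel fc, long offset) throws IOException {
        try (FileLock lock = fc.tryLock(offset, Long.BYTES, true)) { // true = shared
            if (lock == null) {
                return null; // writer still busy: the caller is expected to retry
            }
            MappedByteBuffer mem = fc.map(FileChannel.MapMode.READ_ONLY, offset, Long.BYTES);
            return mem.getLong(0);
        }
    }
}
```

A reader would call tryReadLocked in a loop and back off while it returns null. Both helpers assume a channel opened for reading and writing.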

I work a lot with memory mapped files for interprocess communication. I would not recommend Holger's #1 or #2, but his #3 is what I do. A key point, though, is that I only ever work with a single writer; things get more complicated if you have multiple writers.

The beginning of the file is a header section with whatever header variables you need and, most importantly, a pointer to the end of the written data. The writer should always update this header variable after writing a piece of data, and the reader should never read beyond this variable. A thing called "cache coherency", which all mainstream processors implement, guarantees that the reader sees memory writes in the same order in which they were made, so the reader will never read uninitialized memory if you follow these rules. (An exception is when the reader and the writer are on different servers; cache coherency does not work there. Do not try to implement shared memory across different servers!)

There is no limit on how often you can update the end-of-file pointer. It all happens in memory and no I/O is involved, so you can update it after every record or every message you write.

ByteBuffer has versions of the getInt() and putInt() methods that take an absolute byte offset, so that is what I use to read and write the end-of-file marker. I never use the relative versions when working with memory mapped files.
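To illustrate the header scheme, here is a minimal single-writer sketch. The layout (an int end pointer at offset 0, data starting at offset 8, fixed 8-byte records) and the names HeaderPointerLog, append and readFrom are assumptions made for the example, not anything prescribed above. It uses plain absolute accesses as described; the Java language itself makes no ordering promise for those, so a stricter version might publish the pointer with a volatile or release write.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.function.LongConsumer;

class HeaderPointerLog {
    static final int END_POS = 0;     // header variable: byte offset of the end of written data
    static final int DATA_START = 8;  // hypothetical header size

    /** Maps `size` bytes of the file read-write; the mapping stays valid after the channel closes. */
    static MappedByteBuffer open(Path file, int size) throws IOException {
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            return fc.map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    /** Single writer only: write the data first, then move the end pointer past it. */
    static void append(ByteBuffer mem, long value) {
        int end = Math.max(mem.getInt(END_POS), DATA_START); // a fresh file holds 0
        mem.putLong(end, value);                 // 1. the data
        mem.putInt(END_POS, end + Long.BYTES);   // 2. only then the pointer
    }

    /** Reader: consumes records from `from` up to the published end; returns the next position. */
    static int readFrom(ByteBuffer mem, int from, LongConsumer sink) {
        int end = mem.getInt(END_POS);
        int pos = Math.max(from, DATA_START);
        while (pos + Long.BYTES <= end) {        // never read beyond the pointer
            sink.accept(mem.getLong(pos));
            pos += Long.BYTES;
        }
        return pos;
    }
}
```

The reader keeps the value returned by readFrom and passes it back in on the next poll, so it never depends on the file size or on the buffer's relative position.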

You should not use the file size, or yet another interprocess communication method, to communicate the end-of-file marker; there is no need for it and no benefit when you already have shared memory.


Check out my Mappedbus library ( http://github.com/caplogic/mappedbus ), which allows multiple Java processes (JVMs) to write records in order to the same memory mapped file.

Here is how Mappedbus solves the synchronization problem between multiple writers:

  • The first eight bytes of the file make up a field called the limit. This field indicates how much data has actually been written to the file. Readers poll the limit field (using a volatile read) to see whether there is a new record to read.

  • When a writer wants to add a record to the file, it uses the fetch-and-add instruction to atomically update the limit field.

  • When the limit field has increased, a reader knows there is new data to read, but the writer that updated the limit field may not have written any data to the record yet. To avoid this problem, each record starts with an initial byte that makes up a commit field.

  • When the writer has finished writing the record, it sets the commit field (using a volatile write), and the reader only starts reading the record once it has seen that the commit field is set.
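The steps above can be sketched with the standard library alone. This is not Mappedbus's code or API: the class LimitCommitLog, the record layout (a 4-byte commit flag followed by a 4-byte payload, where the description above uses a single byte) and the choice to count the limit from the end of the header are all assumptions for illustration. It relies on MethodHandles.byteBufferViewVarHandle, so it needs Java 9 or later, and the atomic operations require naturally aligned offsets.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LimitCommitLog {
    static final VarHandle LONGS =
            MethodHandles.byteBufferViewVarHandle(long[].class, ByteOrder.nativeOrder());
    static final VarHandle INTS =
            MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.nativeOrder());

    static final int LIMIT_POS = 0;      // first eight bytes: the limit field
    static final int FIRST_RECORD = 8;   // records start right after the limit
    static final int RECORD_BYTES = 8;   // hypothetical: 4-byte commit flag + 4-byte payload

    /** Writer: reserve a slot with fetch-and-add, fill it in, then commit it. */
    static int write(ByteBuffer mem, int payload) {
        // The limit counts bytes after the header, so a zero-filled file needs no setup.
        long reserved = (long) LONGS.getAndAdd(mem, LIMIT_POS, (long) RECORD_BYTES);
        int pos = FIRST_RECORD + (int) reserved;
        mem.putInt(pos + 4, payload);    // payload first
        INTS.setVolatile(mem, pos, 1);   // commit flag last, with a volatile write
        return pos;
    }

    /** Reader: returns the payload at `pos`, or null if nothing is committed there yet. */
    static Integer tryRead(ByteBuffer mem, int pos) {
        long limit = (long) LONGS.getVolatile(mem, LIMIT_POS);
        if (pos - FIRST_RECORD >= limit) {
            return null;                 // no writer has reserved this slot
        }
        if ((int) INTS.getVolatile(mem, pos) != 1) {
            return null;                 // reserved, but the writer has not committed
        }
        return mem.getInt(pos + 4);
    }
}
```

A reader would start at FIRST_RECORD, call tryRead until it returns a value, and then advance by RECORD_BYTES. Each process passes its own mapping of the shared file as mem.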

(BTW, the solution has only been verified to work on Linux x86 with Oracle's JVM. It most likely will not work on all platforms.)
