Let's say you do some calculations on a large set of large float vectors, for example calculating the average value of each one:
    public static float avg(float[] data, int offset, int length) {
        float sum = 0;
        for (int i = offset; i < offset + length; i++) {
            sum += data[i];
        }
        return sum / length;
    }
If all of your vectors are stored in a single float[], you can implement the loop as follows:
    float[] data; // <-- vectors here
    float sum = 0;
    for (int i = 0; i < nVectors; i++) {
        sum += avg(data, i * vectorSize, vectorSize);
    }
If your vectors are stored in a file instead, memory mapping should, in theory, be as fast as the first solution once the OS has cached the whole file:
    RandomAccessFile file; // <-- vectors here
    MappedByteBuffer buffer = file.getChannel().map(READ_WRITE, 0, 4 * data.length);
    FloatBuffer floatBuffer = buffer.asFloatBuffer();
    buffer.load(); // <-- this forces the OS to cache the file
    float[] vector = new float[vectorSize];
    float sum = 0;
    for (int i = 0; i < nVectors; i++) {
        floatBuffer.get(vector);
        sum += avg(vector, 0, vector.length);
    }
However, my tests show that the memory-mapped version is ~5 times slower than the in-memory one. I know that FloatBuffer.get(float[]) copies memory, and I think that is the reason for the slowdown. Can it be made faster? Is there a way to avoid copying any memory and just read my data directly from the OS buffer?
I have uploaded my full test; if you want to try it, just run:
$ java -Xmx1024m ArrayVsMMap 100 100000 100
Edit:
In the end, the best I could get from MappedByteBuffer in this scenario is still ~35% slower than using a regular float[]. Tricks so far:
- use the native byte order to avoid conversion: buffer.order(ByteOrder.nativeOrder())
- wrap the MappedByteBuffer with a FloatBuffer using buffer.asFloatBuffer()
- use the simple floatBuffer.get(int index) instead of the bulk version; this avoids copying memory.
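Put together, the three tricks might look like the following minimal sketch. It is not the benchmark from the question: the class name MappedAvgSketch and the helpers mapFloats, sumOfAverages and demo are hypothetical names introduced for illustration, and the temporary file with six floats is an assumed stand-in for the real vector file. It only shows the access pattern and makes no claim about the resulting speed.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical sketch class; all names here are illustrative assumptions.
final class MappedAvgSketch {

    // Same as avg(float[], ...) but reads through the buffer with the
    // absolute get(int index), so no intermediate float[] is filled.
    static float avg(FloatBuffer data, int offset, int length) {
        float sum = 0;
        for (int i = offset; i < offset + length; i++) {
            sum += data.get(i);
        }
        return sum / length;
    }

    // Maps nFloats floats from the file and returns a native-order float view.
    static FloatBuffer mapFloats(RandomAccessFile file, int nFloats) throws IOException {
        MappedByteBuffer buffer = file.getChannel()
                .map(FileChannel.MapMode.READ_WRITE, 0, 4L * nFloats);
        // The byte order must be set before asFloatBuffer(): the view keeps
        // the order the byte buffer had when the view was created.
        buffer.order(ByteOrder.nativeOrder());
        buffer.load(); // ask the OS to bring the mapped pages into memory
        return buffer.asFloatBuffer();
    }

    // Sums the per-vector averages, mirroring the loop from the question.
    static float sumOfAverages(FloatBuffer data, int nVectors, int vectorSize) {
        float sum = 0;
        for (int i = 0; i < nVectors; i++) {
            sum += avg(data, i * vectorSize, vectorSize);
        }
        return sum;
    }

    // Self-contained demo on an assumed temp file holding two 3-float vectors.
    static float demo() {
        try {
            File tmp = File.createTempFile("vectors", ".bin");
            tmp.deleteOnExit();
            try (RandomAccessFile file = new RandomAccessFile(tmp, "rw")) {
                FloatBuffer floats = mapFloats(file, 6);
                for (int i = 0; i < 6; i++) {
                    floats.put(i, i + 1); // vectors [1,2,3] and [4,5,6]
                }
                return sumOfAverages(floats, 2, 3);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The design choice here is to pass the FloatBuffer itself into avg rather than copying each vector out first, which is what the indexed get is meant to enable.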
The new test and its results are also available.
A slowdown of 1.35x is much better than one of 5x, but it is still far from 1. I am probably still missing something; otherwise, it is something in the JVM that needs to be improved.