I am trying to find out how fast Java can get at a simple task: read a huge file into memory and then perform some meaningless calculations on the data. All kinds of optimizations are allowed, no matter how ugly: rewriting the code differently, using a different JVM, tricking the JIT, and so on.
The input file is a 500-million-line list of 32-bit integer pairs separated by commas, like this:
44,4395023
33140,22257
...
This file takes up 5.5 GB on my machine. The program may not use more than 8 GB of RAM and may use only a single thread.
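(For anyone who wants to reproduce this: the exact values don't matter, only the format. Something along the lines of the throwaway generator below, with a hypothetical GenInput class and an arbitrary value bound chosen so the file lands near 5.5 GB, produces suitable input. Note the parser only handles non-negative numbers.)

    import java.io.BufferedWriter;
    import java.io.FileWriter;
    import java.util.Random;

    public class GenInput {
        public static void main(String[] args) throws Exception {
            Random rnd = new Random(42);
            BufferedWriter out = new BufferedWriter(new FileWriter("in.txt"));
            // 500 million lines, two non-negative ints per line, comma-separated;
            // the 100000 bound is illustrative and keeps the file near 5.5 GB
            for (long line = 0; line < 500000000L; line++) {
                out.write(Integer.toString(rnd.nextInt(100000)));
                out.write(',');
                out.write(Integer.toString(rnd.nextInt(100000)));
                out.write('\n');
            }
            out.close();
        }
    }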
package speedracer;

import java.io.FileInputStream;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class Main {
    public static void main(String[] args) {
        // 500 million pairs = 1 billion ints = 4 GB on the heap
        int[] list = new int[1000000000];

        long start1 = System.nanoTime();
        parse(list);
        long end1 = System.nanoTime();
        System.out.println("Parsing took: " + (end1 - start1) / 1000000000.0);

        int rs = 0;
        long start2 = System.nanoTime();
        // consumes four ints (two pairs) per iteration: k advances three times
        // in the argument list and once more in the loop update
        for (int k = 0; k < list.length; k++) {
            rs = calc(list[k++], list[k++], list[k++], list[k]);
        }
        long end2 = System.nanoTime();
        System.out.println(rs);
        System.out.println("Calculations took: " + (end2 - start2) / 1000000000.0);
    }

    public static int calc(final int a1, final int a2, final int b1, final int b2) {
        int c1 = (a1 + a2) ^ a2;
        int c2 = (b1 - b2) << 4;
        for (int z = 0; z < 100; z++) {
            c1 ^= z + c2;
        }
        return c1;
    }

    public static void parse(int[] list) {
        FileChannel fc = null;
        int i = 0;
        MappedByteBuffer byteBuffer;
        // 'number' lives outside the chunk loop so a value split across
        // a 2 GB mapping boundary is carried over instead of being lost
        int number = 0;
        try {
            fc = new FileInputStream("in.txt").getChannel();
            long size = fc.size();
            long allocated = 0;
            long allocate = 0;
            // map the file in chunks of at most Integer.MAX_VALUE bytes,
            // since a single MappedByteBuffer cannot span more than 2 GB
            while (size > allocated) {
                if ((size - allocated) > Integer.MAX_VALUE) {
                    allocate = Integer.MAX_VALUE;
                } else {
                    allocate = size - allocated;
                }
                byteBuffer = fc.map(FileChannel.MapMode.READ_ONLY, allocated, allocate);
                byteBuffer.clear();
                allocated += allocate;
                while (byteBuffer.hasRemaining()) {
                    char val = (char) byteBuffer.get();
                    if (val == '\n' || val == ',') {
                        list[i] = number;
                        number = 0;
                        i++;
                    } else {
                        number = number * 10 + (val - '0');
                    }
                }
            }
            fc.close();
        } catch (Exception e) {
            System.err.println("Parsing error: " + e);
        }
    }
}
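For completeness, I compile and run it roughly like this (the exact heap flag is just whatever fits under the 8 GB limit; the int array alone is 4 GB, so the default heap is nowhere near enough):

    javac speedracer/Main.java
    java -Xmx6g speedracer.Main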
I have tried everything I could think of: different readers, and openjdk6, sunjdk6, and sunjdk7. The ugly chunked mapping is needed because a MappedByteBuffer cannot map more than 2 GB of memory at once. I'm running:
Linux AS292 2.6.38-11-generic
Currently my results are: parsing 26.50 s, calculations 11.27 s. I am competing with a similar C++ benchmark that does the I/O in about the same time, but whose calculations take only 4.5 s. My main goal is to reduce the computation time in any way possible. Any ideas?
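(One caveat about my numbers: the timed loop also includes JIT compilation time. A short warm-up pass before the measured section, something like the sketch below, should isolate steady-state performance; the iteration count is a guess, chosen to be comfortably above the usual compile threshold.)

    // hypothetical warm-up so the timed loop measures JIT-compiled steady state;
    // accumulating into 'sink' and printing it keeps the calls from being
    // eliminated as dead code
    int sink = 0;
    for (int w = 0; w < 100000; w++) {
        sink += calc(w, w + 1, w + 2, w + 3);
    }
    System.out.println("warm-up sink: " + sink);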
Update: It seems that a major speed improvement could come from what is called auto-vectorization. I found some hints that the current Sun JIT does only limited vectorization, but I cannot confirm this. It would be great to find a JVM or JIT with better support for auto-vectorization.
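For illustration, one shape the kernel could take that is generally friendlier to an auto-vectorizer is sketched below: the sequential c1 ^= z + c2 loop carries a dependency from one iteration to the next, but since XOR is associative and commutative, the terms can be folded into independent accumulators and combined at the end. This computes exactly the same result; whether any of the JVMs above actually emits SIMD for it is something I have not verified.

    public static int calcUnrolled(final int a1, final int a2, final int b1, final int b2) {
        int c2 = (b1 - b2) << 4;
        // four independent partial XOR accumulators break the loop-carried
        // dependency chain of the original c1 ^= z + c2 loop
        int p0 = 0, p1 = 0, p2 = 0, p3 = 0;
        for (int z = 0; z < 100; z += 4) {
            p0 ^= z + c2;
            p1 ^= (z + 1) + c2;
            p2 ^= (z + 2) + c2;
            p3 ^= (z + 3) + c2;
        }
        // recombining the partials yields the same XOR of all 100 terms
        return ((a1 + a2) ^ a2) ^ p0 ^ p1 ^ p2 ^ p3;
    }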