50 million are not particularly large. I would just read them in my memory. Sort them and write. It only takes a few seconds. How fast do you need it? How much do you need it?
On my old lab, it took 28 seconds. If I had more processors, it could be a little faster, but most of the time is spent reading and writing a file (15 seconds), which will not be faster.
One important factor is the size of your cache. The comparison itself is very cheap if the data is in the cache. Since the L3 cache is shared, a single thread is all you need to make full use of it.
public static void main(String...args) throws IOException { generateFile(); long start = System.currentTimeMillis(); int[] nums = readFile("numbers.bin"); Arrays.sort(nums); writeFile("numbers2.bin", nums); long time = System.currentTimeMillis() - start; System.out.println("Took "+time+" secs to sort "+nums.length+" numbers."); } private static void generateFile() throws IOException { Random rand = new Random(); int[] ints = new int[50*1000*1000]; for(int i= 0;i<ints.length;i++) ints[i] = rand.nextInt(); writeFile("numbers.bin", ints); } private static int[] readFile(String filename) throws IOException { DataInputStream dis = new DataInputStream(new BufferedInputStream(new FileInputStream(filename), 64*1024)); int len = dis.readInt(); int[] ints = new int[len]; for(int i=0;i<len;i++) ints[i] = dis.readInt(); return ints; } private static void writeFile(String name, int[] numbers) throws IOException { DataOutputStream dos = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(name), 64*1024)); dos.writeInt(numbers.length); for (int number : numbers) dos.writeInt(number); dos.close(); }
Peter Lawrey
source share