I am experimenting with parallelizing algorithms in Java. I started by merging sorting and posted my attempt at this question . My revised attempt is in the code below, where now I am trying to parallelize a quick sort.
Are there any newbie bugs in my multi-threaded implementation or approach to this problem? If not, should you expect more than a 32% increase in speed between the serial and parallelized duel algorithm (see Timings below)?
Here is the multithreading algorithm:
public class ThreadedQuick extends Thread { final int MAX_THREADS = Runtime.getRuntime().availableProcessors(); CountDownLatch doneSignal; static int num_threads = 1; int[] my_array; int start, end; public ThreadedQuick(CountDownLatch doneSignal, int[] array, int start, int end) { this.my_array = array; this.start = start; this.end = end; this.doneSignal = doneSignal; } public static void reset() { num_threads = 1; } public void run() { quicksort(my_array, start, end); doneSignal.countDown(); num_threads--; } public void quicksort(int[] array, int start, int end) { int len = end-start+1; if (len <= 1) return; int pivot_index = medianOfThree(array, start, end); int pivotValue = array[pivot_index]; swap(array, pivot_index, end); int storeIndex = start; for (int i = start; i < end; i++) { if (array[i] <= pivotValue) { swap(array, i, storeIndex); storeIndex++; } } swap(array, storeIndex, end); if (num_threads < MAX_THREADS) { num_threads++; CountDownLatch completionSignal = new CountDownLatch(1); new ThreadedQuick(completionSignal, array, start, storeIndex - 1).start(); quicksort(array, storeIndex + 1, end); try { completionSignal.await(1000, TimeUnit.SECONDS); } catch(Exception ex) { ex.printStackTrace(); } } else { quicksort(array, start, storeIndex - 1); quicksort(array, storeIndex + 1, end); } } }
Here's how I get started:
ThreadedQuick.reset(); CountDownLatch completionSignal = new CountDownLatch(1); new ThreadedQuick(completionSignal, array, 0, array.length-1).start(); try { completionSignal.await(1000, TimeUnit.SECONDS); } catch(Exception ex){ ex.printStackTrace(); }
I tested this against an Arrays.sort array and a similar quick sort algorithm. Below are the results of synchronization on an Intel duel-core dell laptop, in seconds:
Elements: 500,000, serial: 0,068592, thread: 0.046871, Arrays. Grade: 0.079677
Elements: 1,000,000, serial: 0.144416, thread: 0.095492, Arrays. Grade: 0.167155
Elements: 2,000,000, serial: 0,301666, thread: 0,205719, Arrays. Grade: 0.350982
Elements: 4,000,000, serial: 0.623291, thread: 0.424119, Arrays. Grade: 0.712698
Elements: 8,000,000, serial: 1.279374, thread: 0.859363, Arrays.sort: 1.487671
Each number above is the average time of 100 tests, throwing out the 3 lowest and 3 highest. I used Random.nextInt (Integer.MAX_VALUE) to create an array for each test that was initialized every 10 tests with the same seed. Each test consisted of synchronizing this algorithm with System.nanoTime. After averaging, I rounded to six decimal places. And, obviously, I checked if each view works.
As you can see, on each test suite, the speed between consecutive and multi-threaded cases increases by about 32%. As I said above, should one expect more?