Java: parallelizing fast sort through multithreading

I am experimenting with parallelizing algorithms in Java. I started by merging sorting and posted my attempt at this question . My revised attempt is in the code below, where now I am trying to parallelize a quick sort.

Are there any newbie bugs in my multi-threaded implementation or approach to this problem? If not, should you expect more than a 32% increase in speed between the serial and parallelized duel algorithm (see Timings below)?

Here is the multithreading algorithm:

public class ThreadedQuick extends Thread { final int MAX_THREADS = Runtime.getRuntime().availableProcessors(); CountDownLatch doneSignal; static int num_threads = 1; int[] my_array; int start, end; public ThreadedQuick(CountDownLatch doneSignal, int[] array, int start, int end) { this.my_array = array; this.start = start; this.end = end; this.doneSignal = doneSignal; } public static void reset() { num_threads = 1; } public void run() { quicksort(my_array, start, end); doneSignal.countDown(); num_threads--; } public void quicksort(int[] array, int start, int end) { int len = end-start+1; if (len <= 1) return; int pivot_index = medianOfThree(array, start, end); int pivotValue = array[pivot_index]; swap(array, pivot_index, end); int storeIndex = start; for (int i = start; i < end; i++) { if (array[i] <= pivotValue) { swap(array, i, storeIndex); storeIndex++; } } swap(array, storeIndex, end); if (num_threads < MAX_THREADS) { num_threads++; CountDownLatch completionSignal = new CountDownLatch(1); new ThreadedQuick(completionSignal, array, start, storeIndex - 1).start(); quicksort(array, storeIndex + 1, end); try { completionSignal.await(1000, TimeUnit.SECONDS); } catch(Exception ex) { ex.printStackTrace(); } } else { quicksort(array, start, storeIndex - 1); quicksort(array, storeIndex + 1, end); } } } 

Here's how I get started:

 ThreadedQuick.reset(); CountDownLatch completionSignal = new CountDownLatch(1); new ThreadedQuick(completionSignal, array, 0, array.length-1).start(); try { completionSignal.await(1000, TimeUnit.SECONDS); } catch(Exception ex){ ex.printStackTrace(); } 

I tested this against an Arrays.sort array and a similar quick sort algorithm. Below are the results of synchronization on an Intel duel-core dell laptop, in seconds:

Elements: 500,000, serial: 0,068592, thread: 0.046871, Arrays. Grade: 0.079677

Elements: 1,000,000, serial: 0.144416, thread: 0.095492, Arrays. Grade: 0.167155

Elements: 2,000,000, serial: 0,301666, thread: 0,205719, Arrays. Grade: 0.350982

Elements: 4,000,000, serial: 0.623291, thread: 0.424119, Arrays. Grade: 0.712698

Elements: 8,000,000, serial: 1.279374, thread: 0.859363, Arrays.sort: 1.487671

Each number above is the average time of 100 tests, throwing out the 3 lowest and 3 highest. I used Random.nextInt (Integer.MAX_VALUE) to create an array for each test that was initialized every 10 tests with the same seed. Each test consisted of synchronizing this algorithm with System.nanoTime. After averaging, I rounded to six decimal places. And, obviously, I checked if each view works.

As you can see, on each test suite, the speed between consecutive and multi-threaded cases increases by about 32%. As I said above, should one expect more?

+6
java multithreading parallel-processing quicksort
source share
3 answers

Creating static numThreads can cause problems, it is very likely that at some point you will have more MAX_THREADS.

Probably the reason you are not getting full doubled performance is because your quick sort cannot be completely parallelized. Note that the first quicksort call will go through the entire array in the initial thread before it actually starts running in parallel. There is also the overhead of parallelizing the algorithm in the form of context switching and transitions in the transition mode to individual threads.

Look at the Fork / Join structure, this problem will probably be pretty neat there.

A few points to implement. Deploy Runnable, Do Not Extend Thread. The thread extension should only be used when creating a new version of the Thread class. When you just want to do some work that will be done in parallel, you will be better off working with Runnable. When using Runnable, you can also extend another class that gives you more flexibility in OO design. Use a thread pool that is limited by the number of threads available on the system. Also, do not use numThreads to decide whether to disable the new thread or not. You can calculate this front. Use the minimum partition size, which is the size of the total array divided by the number of processors available. Something like:

 public class ThreadedQuick implements Runnable { public static final int MAX_THREADS = Runtime.getRuntime().availableProcessors(); static final ExecutorService executor = Executors.newFixedThreadPool(MAX_THREADS); final int[] my_array; final int start, end; private final int minParitionSize; public ThreadedQuick(int minParitionSize, int[] array, int start, int end) { this.minParitionSize = minParitionSize; this.my_array = array; this.start = start; this.end = end; } public void run() { quicksort(my_array, start, end); } public void quicksort(int[] array, int start, int end) { int len = end - start + 1; if (len <= 1) return; int pivot_index = medianOfThree(array, start, end); int pivotValue = array[pivot_index]; swap(array, pivot_index, end); int storeIndex = start; for (int i = start; i < end; i++) { if (array[i] <= pivotValue) { swap(array, i, storeIndex); storeIndex++; } } swap(array, storeIndex, end); if (len > minParitionSize) { ThreadedQuick quick = new ThreadedQuick(minParitionSize, array, start, storeIndex - 1); Future<?> future = executor.submit(quick); quicksort(array, storeIndex + 1, end); try { future.get(1000, TimeUnit.SECONDS); } catch (Exception ex) { ex.printStackTrace(); } } else { quicksort(array, start, storeIndex - 1); quicksort(array, storeIndex + 1, end); } } } 

You can execute it:

 ThreadedQuick quick = new ThreadedQuick(array / ThreadedQuick.MAX_THREADS, array, 0, array.length - 1); quick.run(); 

This will start sorting in the same stream, which avoids the unnecessary host of the stream at startup.

Caution: not sure if the above implementation will actually be faster since I did not compare it.

+10
source share

A combination of quicksort and merge sort is used.

 import java.util.Arrays; import java.util.Random; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import java.util.concurrent.TimeUnit; public class ParallelSortMain { public static void main(String... args) throws InterruptedException { Random rand = new Random(); final int[] values = new int[100*1024*1024]; for (int i = 0; i < values.length; i++) values[i] = rand.nextInt(); int threads = Runtime.getRuntime().availableProcessors(); ExecutorService es = Executors.newFixedThreadPool(threads); int blockSize = (values.length + threads - 1) / threads; for (int i = 0; i < values.length; i += blockSize) { final int min = i; final int max = Math.min(min + blockSize, values.length); es.submit(new Runnable() { @Override public void run() { Arrays.sort(values, min, max); } }); } es.shutdown(); es.awaitTermination(10, TimeUnit.MINUTES); for (int blockSize2 = blockSize; blockSize2 < values.length / 2; blockSize2 *= 2) { for (int i = 0; i < values.length; i += blockSize2) { final int min = i; final int mid = Math.min(min + blockSize2, values.length); final int max = Math.min(min + blockSize2 * 2, values.length); mergeSort(values, min, mid, max); } } } private static boolean mergeSort(int[] values, int left, int mid, int end) { int[] results = new int[end - left]; int l = left, r = mid, m = 0; for (; l < left && r < mid; m++) { int lv = values[l]; int rv = values[r]; if (lv < rv) { results[m] = lv; l++; } else { results[m] = rv; r++; } } while (l < mid) results[m++] = values[l++]; while (r < end) results[m++] = values[r++]; System.arraycopy(results, 0, values, left, results.length); return false; } } 
+3
source share

A couple of comments, if I understand your code correctly:

  • I do not see locks around the numthreads object, although it can be accessed through multiple threads. Perhaps you should do this AtomicInteger.

  • Use a thread pool and organize tasks, i.e. one quicksort call, to take advantage of the thread pool. Use futures.

Your current method of dividing things the way you do may leave a smaller division with stream and larger division without stream. In other words, he does not prioritize larger segments with his flows.

+1
source share

All Articles