Does ExecutorService.submit (<callable>) take longer?
I try to understand the utilities in the java.util.concurrent package and find out that we can send callable objects to an ExecutorService that returns a Future that is populated with the value returned by callable after the task completes successfully within the call() method.
I understand that all called calls are executed simultaneously using multiple threads.
When I wanted to see how many enhancements the ExecutorService provides for performing a batch task, I thought about the capture time.
Below is the code I tried to execute -
package concurrency; import java.util.ArrayList; import java.util.List; import java.util.concurrent.Callable; import java.util.concurrent.ExecutionException; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import java.util.concurrent.Future; public class ExecutorExample { private static Callable<String> callable = new Callable<String>() { @Override public String call() throws Exception { StringBuilder builder = new StringBuilder(); for(int i=0; i<5; i++) { builder.append(i); } return builder.toString(); } }; public static void main(String [] args) { long start = System.currentTimeMillis(); ExecutorService service = Executors.newFixedThreadPool(5); List<Future<String>> futures = new ArrayList<Future<String>>(); for(int i=0; i<5; i++) { Future<String> value = service.submit(callable); futures.add(value); } for(Future<String> f : futures) { try { System.out.println(f.isDone() + " " + f.get()); } catch (InterruptedException e) { // TODO Auto-generated catch block e.printStackTrace(); } catch (ExecutionException e) { // TODO Auto-generated catch block e.printStackTrace(); } } long end = System.currentTimeMillis(); System.out.println("Executer callable time - " + (end - start)); service.shutdown(); start = System.currentTimeMillis(); for(int i=0; i<5; i++) { StringBuilder builder = new StringBuilder(); for(int j=0; j<5; j++) { builder.append(j); } System.out.println(builder.toString()); } end = System.currentTimeMillis(); System.out.println("Normal time - " + (end - start)); } } and here is the result of this -
true 01234 true 01234 true 01234 true 01234 true 01234 Executer callable time - 5 01234 01234 01234 01234 01234 Normal time - 0 Please let me know if I am missing something or misunderstanding something.
Thank you in advance for your time and help in this topic.
When you run any esp code for the first time, it takes time. If you transfer the task to another thread, it can take 1-10 microseconds, and if your task takes less time, the overhead can be more than good. that is, using multiple threads can be much slower than using a single thread if your overhead is high enough.
I suggest you
- increase the cost of the task to 1000 iterations.
- make sure the result is not discarded in a single-threaded example
- run both tests for at least a couple of seconds to ensure that the code warms up.
Not an answer (but I'm not sure if the code will match the comment). To talk a little about what Peter said, there is usually a nice place for the size of your assignments (measured at runtime) to balance the pool / queue overhead with the distribution of fair work among workers. Sample code helps you find the mark for this sweet spot. Run on your target hardware.
import java.util.concurrent.*; import java.util.concurrent.atomic.*; public class FibonacciFork extends RecursiveTask<Long> { private static final long serialVersionUID = 1L; public FibonacciFork( long n) { super(); this.n = n; } static ForkJoinPool fjp = new ForkJoinPool( Runtime.getRuntime().availableProcessors()); static long fibonacci0( long n) { if ( n < 2) { return n; } return fibonacci0( n - 1) + fibonacci0( n - 2); } static int rekLimit = 8; private static long stealCount; long n; private long forkCount; private static AtomicLong forks = new AtomicLong( 0); public static void main( String[] args) { int n = 45; long times[] = getSingleThreadNanos( n); System.out.println( "Single Thread Times complete"); for ( int r = 2; r <= n; r++) { runWithRecursionLimit( r, n, times[ r]); } } private static long[] getSingleThreadNanos( int n) { final long times[] = new long[ n + 1]; ExecutorService es = Executors.newFixedThreadPool( Math.max( 1, Runtime.getRuntime().availableProcessors() / 2)); for ( int i = 2; i <= n; i++) { final int arg = i; Runnable runner = new Runnable() { @Override public void run() { long start = System.nanoTime(); final int minRuntime = 1000000000; long runUntil = start + minRuntime; long result = fibonacci0( arg); long end = System.nanoTime(); int ntimes = Math.max( 1, ( int) ( minRuntime / ( end - start))); if ( ntimes > 1) { start = System.nanoTime(); for ( int i = 0; i < ntimes; i++) { result = fibonacci0( arg); } end = System.nanoTime(); } times[ arg] = ( end - start) / ntimes; } }; es.execute( runner); } es.shutdown(); try { es.awaitTermination( 1, TimeUnit.HOURS); } catch ( InterruptedException e) { System.out.println( "Single Timeout"); } return times; } private static void runWithRecursionLimit( int r, int arg, long singleThreadNanos) { rekLimit = r; long start = System.currentTimeMillis(); long result = fibonacci( arg); long end = System.currentTimeMillis(); // Steals zΓ€hlen long currentSteals = fjp.getStealCount(); long newSteals = currentSteals - stealCount; stealCount = currentSteals; long forksCount = forks.getAndSet( 0); System.out.println( "Fib(" + arg + ")=" + result + " in " + ( end-start) + "ms, recursion limit: " + r + " at " + ( singleThreadNanos / 1e6) + "ms, steals: " + newSteals + " forks " + forksCount); } static long fibonacci( final long arg) { FibonacciFork task = new FibonacciFork( arg); long result = fjp.invoke( task); forks.set( task.forkCount); return result; } @Override protected Long compute() { if ( n <= rekLimit) { return fibonacci0( n); } FibonacciFork ff1 = new FibonacciFork( n-1); FibonacciFork ff2 = new FibonacciFork( n-2); ff1.fork(); long r2 = ff2.compute(); long r1 = ff1.join(); forkCount = ff2.forkCount + ff1.forkCount + 1; return r1 + r2; } }