Java program runs slower when code that never runs is commented out

I observed some strange behavior in one of my Java programs. I tried to remove the code as much as possible while maintaining the ability to repeat the behavior. The full code is below.

public class StrangeBehaviour { static boolean recursionFlag = true; public static void main(String[] args) { long startTime = System.nanoTime(); for (int i = 0; i < 10000; i ++) { functionA(6, 0); } long endTime = System.nanoTime(); System.out.format("%.2f seconds elapsed.\n", (endTime - startTime) / 1000.0 / 1000 / 1000); } static boolean functionA(int recursionDepth, int recursionSwitch) { if (recursionDepth == 0) { return true; } return functionB(recursionDepth, recursionSwitch); } static boolean functionB(int recursionDepth, int recursionSwitch) { for (int i = 0; i < 16; i++) { if (StrangeBehaviour.recursionFlag) { if (recursionSwitch == 0) { if (functionA(recursionDepth - 1, 1 - recursionSwitch)) return true; } else { if (!functionA(recursionDepth - 1, 1 - recursionSwitch)) return false; } } else { // This block is never entered into. // Yet commenting out one of the lines below makes the program run slower! System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); } } return false; } } 

I have two functions: functionA() and functionB() , which access each other recursively. Both functions accept the recursionDepth parameter, which controls the completion of the recursion. functionA() calls functionB() at most once with recursionDepth unchanged. functionB() calls functionA() 16 times with recursionDepth - 1 . Recursion completes when functionA() is called with recursionDepth of 0 .

functionB() has a code block with the number of calls to System.out.println() . This block is never entered because the record is controlled by the variable boolean recursionFlag , which is set to true and never changes during program execution. However, commenting on even one of the println() calls makes the program run slower. On my machine, the runtime is <0.2 with all println() and> 2s calls present when one of the calls is commented out.

What could be causing this behavior? My only assumption is that there is some naive compiler optimization that is triggered by a parameter related to the length of the code block (or the number of function calls, etc.). Any further understanding of this would be greatly appreciated!

Edit: I am using JDK 1.8.

+56
java compiler-optimization recursion
Dec 31 '16 at 18:21
source share
3 answers

The complete answer is a combination of k5_ and Tony answers.

The code that the OP published does not contain a warm-up cycle to start compiling HotSpot before executing the reference; therefore, acceleration of 10 times (on my computer) when I turn on print statements combines both the time spent on HotSpot and the compilation of bytecode for CPU commands, as well as the actual launch of CPU instructions.

If I add a separate warm-up cycle before the synchronization cycle, there will only be 2.5 times acceleration in the print statement.

This indicates that HotSpot / JIT compilation takes longer when the method is built-in (as Tony explained), and that code execution takes longer, probably due to degraded cache performance or branch-prediction / pipelining, as k5_.

 public static void main(String[] args) { // Added the following warmup loop before the timing loop for (int i = 0; i < 50000; i++) { functionA(6, 0); } long startTime = System.nanoTime(); for (int i = 0; i < 50000; i++) { functionA(6, 0); } long endTime = System.nanoTime(); System.out.format("%.2f seconds elapsed.\n", (endTime - startTime) / 1000.0 / 1000 / 1000); } 
+21
Jan 02 '17 at 6:09 on
source share

Commented code affects processing processing. If function B becomes longer / larger (more bytecode instructions), it will not be included in the function.

So, @ J3D1 was able to use VMOptions to manually disable insertion for function B (): -XX:CompileCommand=dontinline,com.jd.benchmarking.StrangeBeh‌​aviour::functionB This seems to eliminate the delay with a shorter function.

with vm options that you can display the attachment -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining

Big version, no built-in function B

 @ 8 StrangeBehaviour::functionB (326 bytes) callee is too large @ 21 StrangeBehaviour::functionA (12 bytes) @ 8 StrangeBehaviour::functionB (326 bytes) callee is too large @ 35 StrangeBehaviour::functionA (12 bytes) @ 8 StrangeBehaviour::functionB (326 bytes) callee is too large 

The shorter version will attempt to embed function B, causing several further attempts.

 @ 8 StrangeBehaviour::functionB (318 bytes) inline (hot) @ 21 StrangeBehaviour::functionA (12 bytes) inline (hot) @ 8 StrangeBehaviour::functionB (318 bytes) inline (hot) @ 35 StrangeBehaviour::functionA (12 bytes) recursive inlining is too deep @ 35 StrangeBehaviour::functionA (12 bytes) inline (hot) @ 8 StrangeBehaviour::functionB (318 bytes) inline (hot) @ 21 StrangeBehaviour::functionA (12 bytes) recursive inlining is too deep @ 35 StrangeBehaviour::functionA (12 bytes) recursive inlining is too deep @ 21 StrangeBehaviour::functionA (12 bytes) inline (hot) @ 8 StrangeBehaviour::functionB (318 bytes) inline (hot) @ 35 StrangeBehaviour::functionA (12 bytes) inline (hot) @ 8 StrangeBehaviour::functionB (318 bytes) recursive inlining is too deep @ 35 StrangeBehaviour::functionA (12 bytes) inline (hot) @ 8 StrangeBehaviour::functionB (318 bytes) inline (hot) @ 21 StrangeBehaviour::functionA (12 bytes) inline (hot) @ 8 StrangeBehaviour::functionB (318 bytes) recursive inlining is too deep @ 35 StrangeBehaviour::functionA (12 bytes) inline (hot) @ 8 StrangeBehaviour::functionB (318 bytes) recursive inlining is too deep 

Mostly guessing, but the big / inline bytecode will cause branch prediction and caching problems

+41
Dec 31 '16 at 19:09
source share

As of @ k5_, it seems like there is a threshold to determine whether to embed a function. And if the JIT compiler decided to embed it, it would take a lot of work and time to do this, since -XX:+PrintCompilation shows:

  task-id 158 32 3 so_test.StrangeBehaviour::functionB (326 bytes) made not entrant 159 35 3 java.lang.String::<init> (82 bytes) 160 36 s 1 java.util.Vector::size (5 bytes) 1878 37 % 3 so_test.StrangeBehaviour::main @ 6 (65 bytes) 1898 38 3 so_test.StrangeBehaviour::main (65 bytes) 2665 39 3 java.util.regex.Pattern::has (15 bytes) 2667 40 3 sun.misc.FDBigInteger::mult (64 bytes) 2668 41 3 sun.misc.FDBigInteger::<init> (30 bytes) 2668 42 3 sun.misc.FDBigInteger::trimLeadingZeros (57 bytes) 2.51 seconds elapsed. 

The top is non-comment information, followed by a comment that reduces the size of the method from 326 bytes to 318 bytes. And you may notice that the task ID in column 1 of the output is much larger in the latter, which causes more time.

  task-id 126 35 4 so_test.StrangeBehaviour::functionA (12 bytes) 130 33 3 so_test.StrangeBehaviour::functionA (12 bytes) made not entrant 131 36 s 1 java.util.Vector::size (5 bytes) 14078 37 % 3 so_test.StrangeBehaviour::main @ 6 (65 bytes) 14296 38 3 so_test.StrangeBehaviour::main (65 bytes) 14296 39 % 4 so_test.StrangeBehaviour::functionB @ 2 (318 bytes) 14300 40 4 so_test.StrangeBehaviour::functionB (318 bytes) 14304 34 3 so_test.StrangeBehaviour::functionB (318 bytes) made not entrant 14628 41 3 java.util.regex.Pattern::has (15 bytes) 14631 42 3 sun.misc.FDBigInteger::mult (64 bytes) 14632 43 3 sun.misc.FDBigInteger::<init> (30 bytes) 14632 44 3 sun.misc.FDBigInteger::trimLeadingZeros (57 bytes) 14.50 seconds elapsed. 

And if you change the code to the following (add two lines and copy the print line), you will see that the code size is changed to 326 bytes and it runs faster:

  if (StrangeBehaviour.recursionFlag) { int a = 1; int b = 1; if (recursionSwitch == 0) { if (functionA(recursionDepth - 1, 1 - recursionSwitch)) return true; } else { if (!functionA(recursionDepth - 1, 1 - recursionSwitch)) return false; } } else { // This block is never entered into. // Yet commenting out one of the lines below makes the program run slower! System.out.println("..."); System.out.println("..."); System.out.println("..."); //System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); System.out.println("..."); } 

New time and JIT compiler info:

  140 34 3 so_test.StrangeBehaviour::functionB (326 bytes) made not entrant 145 36 3 java.lang.String::<init> (82 bytes) 148 37 s 1 java.util.Vector::size (5 bytes) 162 38 4 so_test.StrangeBehaviour::functionA (12 bytes) 163 33 3 so_test.StrangeBehaviour::functionA (12 bytes) made not entrant 1916 39 % 3 so_test.StrangeBehaviour::main @ 6 (65 bytes) 1936 40 3 so_test.StrangeBehaviour::main (65 bytes) 2686 41 3 java.util.regex.Pattern::has (15 bytes) 2689 42 3 sun.misc.FDBigInteger::mult (64 bytes) 2690 43 3 sun.misc.FDBigInteger::<init> (30 bytes) 2690 44 3 sun.misc.FDBigInteger::trimLeadingZeros (57 bytes) 2.55 seconds elapsed. 

In conclusion :

  • When the method size exceeds some restrictions, JIT will not inline this function.
  • And if we comment out a line that shrinks to a size below the threshold, JIT decides to embed it;
  • Embedding this function leads to many JIT tasks that slow down the program.



Update

Confirming my last test , the answer to this question is not so simple:

As my code example shows, the usual inline optimization will be

  • speeds up the program
  • and it doesn't cost much compiler work (in my test, it even costs less work when inline happens).

But in this problem, the code causes a lot of JIT work and slows down the program, which seems like a JIT error. And it is still unclear why this causes so much JIT work.

+18
Jan 01 '16 at 5:26
source share



All Articles