Why is branch prediction faster than no branch?

Inspired by this question: Why is a sorted array faster to process than an unsorted array?

I wrote my own branch prediction experiment:

    public class BranchPrediction {
        public static void main(final String[] args) {
            long start;
            long sum = 0;

            /* No branch */
            start = System.nanoTime();
            sum = 0;
            for (long i = 0; i < 10000000000L; ++i)
                sum += i;
            System.out.println(System.nanoTime() - start);
            System.out.println(sum);

            /* With branch */
            start = System.nanoTime();
            sum = 0;
            for (long i = 0; i < 10000000000L; ++i)
                if (i >= 0)
                    sum += i;
            System.out.println(System.nanoTime() - start);
            System.out.println(sum);

            /* No branch (again) */
            start = System.nanoTime();
            sum = 0;
            for (long i = 0; i < 10000000000L; ++i)
                sum += i;
            System.out.println(System.nanoTime() - start);
            System.out.println(sum);

            /* With branch (again) */
            start = System.nanoTime();
            sum = 0;
            for (long i = 0; i < 10000000000L; ++i)
                if (i >= 0)
                    sum += i;
            System.out.println(System.nanoTime() - start);
            System.out.println(sum);
        }
    }

The result confuses me: according to the program's output, the loop with a branch is reliably faster than the loop without one.

Example output:

    7949691477
    -5340232226128654848
    6947699555
    -5340232226128654848
    7920972795
    -5340232226128654848
    7055459799
    -5340232226128654848

Why is this so?


2 answers

After running the same experiment on other computers (servers and Intel workstations), I can conclude that the phenomenon I observed is specific to this notebook processor (Intel i7 Q740M).

==== Edit, 6 months later ====

See: http://eli.thegreenplace.net/2013/12/03/intel-i7-loop-performance-anomaly/


Keep in mind that the JVM optimizes execution internally, and that your computer has caches that speed up computation. Since you have such a powerful processor (many independent cores), this is not strange. Also note that there is a layer beneath your Java code that maps it to your computer's machine code. Just write the code as well as you can and let the JVM worry about the rest.

EDIT: Machines and hardware tend to work more efficiently under heavy load. This is especially true of the cache.
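
Because the JVM compiles and optimizes code while it runs, one way to reduce that effect on the measurement is to put each loop in its own method, run both a few times as a warm-up, and only then time several rounds. The following is only a rough sketch under those assumptions: the class name BranchBench, the method names, the iteration count n, and the number of rounds are all hypothetical and not taken from the question. A dedicated harness such as JMH handles these concerns more rigorously.

```java
public class BranchBench {
    // Loop without a branch in the body (same shape as in the question).
    static long sumNoBranch(long n) {
        long sum = 0;
        for (long i = 0; i < n; ++i)
            sum += i;
        return sum;
    }

    // Same loop with an always-true branch in the body.
    static long sumWithBranch(long n) {
        long sum = 0;
        for (long i = 0; i < n; ++i)
            if (i >= 0)
                sum += i;
        return sum;
    }

    public static void main(String[] args) {
        // Assumption: a much smaller count than in the question, to keep runs short.
        final long n = 20_000_000L;
        long sink = 0; // accumulate results so the loops cannot be discarded as dead code

        // Warm-up rounds: give the JIT a chance to compile both methods before timing.
        for (int r = 0; r < 5; ++r) {
            sink += sumNoBranch(n);
            sink += sumWithBranch(n);
        }

        // Timed rounds: print each round so run-to-run variation is visible.
        for (int r = 0; r < 5; ++r) {
            long t0 = System.nanoTime();
            sink += sumNoBranch(n);
            long t1 = System.nanoTime();
            sink += sumWithBranch(n);
            long t2 = System.nanoTime();
            System.out.println("no branch: " + (t1 - t0) + " ns, with branch: " + (t2 - t1) + " ns");
        }
        System.out.println(sink);
    }
}
```

If the difference between the two loops persists across rounds after warm-up, it is less likely to be a JIT start-up artifact and more likely to come from the hardware, which is what the other answer reports.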
