Android: why native code is much faster than Java code

In the following SO answer: https://stackoverflow.com/a/312616/2128 it is claimed that the C port of the Java blur algorithm is 40 times faster.

Given that the main part of the code consists only of calculations, and all allocations are performed only once before the actual number crunching, can anyone explain why this code runs 40 times faster? Shouldn't the Dalvik JIT translate the bytecode and significantly reduce the gap to natively compiled code?

Note: I did not confirm the 40x performance gain myself with this algorithm, but all the serious image-manipulation algorithms I encounter for Android use the NDK, which supports the idea that NDK code runs much faster.

+8
performance android android-ndk dalvik jit
2 answers

For algorithms that operate over arrays of data, two things significantly change performance between languages like Java and C:

  • array bounds checking. Java checks every access, bmap[i], and confirms that i is within the array. If the code tries to access out of bounds, you get a helpful exception. C and C++ check nothing and trust your code. The best outcome of an out-of-bounds access is a page fault. A more likely outcome is "unexpected behavior."

  • pointers. With pointers, you can significantly reduce the number of operations.

Take this innocent example of a common filter (similar to blur, but 1D):

for (i = 0; i < ndata - ncoef; ++i) {
    z[i] = 0;
    for (k = 0; k < ncoef; ++k) {
        z[i] += coef[k] * d[i + k];
    }
}

When you access an array element, coef[k], the processor must:

  • load the address of the coef array into a register
  • load the value of k into a register
  • add them together
  • fetch the memory at that address

Each of these array accesses can be improved because you know the indexes are sequential. Neither the compiler nor the JIT can know that the indexes are sequential, so they cannot fully optimize this (although they keep trying).

In C++, you would write code something like this:

int d[10000];
int z[10000];
int coef[10];
int *zptr;
int *dptr;
int *cptr;

dptr = &(d[0]);  // Just being overly explicit here; more likely you would write dptr = d;
zptr = &(z[0]);  // or zptr = z;
for (i = 0; i < ndata - ncoef; ++i) {
    *zptr = 0;
    cptr = coef;
    dptr = d + i;
    for (k = 0; k < ncoef; ++k) {
        *zptr += *cptr * *dptr;
        cptr++;
        dptr++;
    }
    zptr++;
}

The first time you do something like this (and get it right), you will be surprised how much faster it can be. All the array address calculations that fetch the index and add the index to the base address are replaced by pointer increment instructions.

For 2D array operations such as image blur, the indexing in innocent code like data[r][c] involves two index values, a multiply, and an add. So with 2D arrays, the pointer tricks also let you remove the multiply operations.

Thus, the language really can reduce the number of operations the processor must perform. The cost is that the C++ code is miserable to read and debug. Errors with pointers and buffer overflows are food for hackers. But when it comes to raw number-crunching algorithms, the speed improvement is too tempting to ignore.

+13

Below is a list of programming languages by level:

  • Assembly language (machine language, lowest level)
  • C language (intermediate level)
  • C++, Java, .NET (higher level)

Here, the lower-level language has more direct access to the hardware. As the level increases, access to the hardware decreases. Thus assembly code runs at maximum speed, and code in the other languages runs according to their levels.

For this reason, C code is much faster than Java code.

-9
