Python is very slow compared to Java for this algorithm

I study algorithms and decided to port Java programs from the tutorial to Python, since I don't like the Java overhead, especially for small programs and as an exercise.

The algorithm itself is very simple, it simply takes all the triplets from the array using the bruteforce method and calculates how many triplets are summed to zero (for example: [-2,7, -5])

public static int count(int[] a) { int N = a.length; int cnt = 0; for (int i = 0; i < N; i++) { for (int j = i+1; j < N; j++) { for (int k = j+1; k < N; k++) { if (a[i] + a[j] + a[k] == 0) { cnt++; } } } } return cnt; } 

I ported it:

 def count(a): cnt = 0 ln = len(a) for i in xrange(0,ln): for j in xrange(i + 1,ln): for k in xrange(j + 1,ln): if a[i] + a[j] + a[k] == 0: cnt+=1 return cnt 

Now measurements of only these functions take:

 java : array of 2000 elements --> 3 seconds python : array of 2000 elements --> 2 minutes, 19 seconds UPDATE python (pypy) : array of 2000 elements --> 4 seconds ( :-) ) 

Of course, this is not a good algorithm, it just shows both here and in the tutorial. I used to program in both Java and Python, but I did not know about this huge difference.

The question boils down to: how to present it? More specific:

  • Is this code a good port, or am I missing something trivial?
  • Switching to another Jython execution, for example, a solution? Is it easy to save my codebase in eclipse and just add an interpreter (compiler?)? Or will switching to another interpreter / compiler only improve the situation?

I am currently using python 2.7.3 and Java 1.7 32bts on Windows 7.

I know that there are similar questions about SO about java / python performance, but the answers, as they are, in the python environment for different applications, are not very useful for me now.

What do I want to know if some of these work times can close this huge gap and deserve attention?

UPDATE:

I installed pypy and now the differences are huge ...

UPDATE 2:

Some very interesting things that I noticed: the islice method in the answer here is faster on "regular" python, but much slower on pypy. Despite this, pypy still remains much faster, regardless of whether it uses regular loops or poems in this algorithm

As Bakuriu notes in comment runtimes, this can make a big difference, but a faster runtime for this algorithm is not necessarily faster for any algorithm ...

+7
source share
3 answers

Try running it with PyPy instead of CPython. Most likely, this will happen much faster.

+7
source

I implemented the function in C and PHP. Here is the result:

PHP : 23.977946043015 sec
Python : 19.31 seconds

C : 0.4 sec
Java : 0.42 s

We learn a language with a system of various types. PHP and Python are dynamically typed, while C and Java are statically typed.

So, the PHP and Python interpreter spends a lot of time guessing the type of variables used and, therefore, is very slow. While in C and Java the type of variables (and array elements) is a static integer ie, and therefore the guessing time is preserved. And, apparently, this guessing time is too long, as you can see from the above numbers.

With PyPY, guessing time is significantly reduced since PyPY uses Just In Time (JIT) compilation. This method is very good at guessing the type of variable used and, therefore, you get a performance score.

+3
source

As mentioned in the comments on your starter post, there is no good way to do this much faster (besides PyPy). You can try islice, which will iterate over "a", and not create new lists or ranges, this should be faster.

 from itertools import islice def count(a): cnt = 0 for x, i in enumerate(islice(a,0, None)): for y, j in enumerate(islice(a, x + 1, None)): for k in islice(a, y + x + 2, None): if i + j + k == 0: cnt+=1 return cnt 
+2
source

All Articles