Benchmarking in Python: Why is my code running slower with repetition?

I have a simple implementation of the Sieve of Eratosthenes:

    from math import sqrt

    # Generate all primes less than k
    def sieve(k):
        s = [True] * k
        s[0] = s[1] = False
        for i in range(4, k, 2):
            s[i] = False
        for i in range(3, int(sqrt(k)) + 2, 2):
            if s[i]:
                for j in range(i ** 2, k, i * 2):
                    s[j] = False
        return [2] + [i for i in range(3, k, 2) if s[i]]

I benchmark this code by repeatedly generating primes up to 10M:

    from time import time

    st = time()
    for x in range(1000):
        rt = time()
        sieve(10000000)
        print "%3d %.2f %.2f" % (x, time() - rt, (time() - st) / (x + 1))

I was confused to see that the time spent on each run increases noticeably:

    run    t    avg
      0   1.49  1.49
      1   1.79  1.66
      2   2.23  1.85
      3   2.72  2.07
      4   2.67  2.20
      5   2.87  2.31
      6   3.05  2.42
      7   3.57  2.56
      8   3.38  2.65
      9   3.48  2.74
     10   3.81  2.84
     11   3.75  2.92
     12   3.85  2.99
     13   4.14  3.07
     14   4.02  3.14
     15   4.05  3.20
     16   4.48  3.28
     17   4.41  3.34
     18   4.19  3.39
     19   4.22  3.43
     20   4.65  3.49

However, changing each instance of range to xrange fixes the problem:

    run    t    avg
      0   1.26  1.26
      1   1.23  1.28
      2   1.24  1.27
      3   1.25  1.26
      4   1.23  1.26
      5   1.23  1.25
      6   1.25  1.25
      7   1.25  1.25
      8   1.23  1.25
      9   1.25  1.25
     10   1.24  1.25
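For reference, this is roughly what the xrange version looks like (simply the sieve above with every range replaced by xrange):

    # Sketch: the original sieve with every range replaced by xrange (Python 2),
    # so no temporary list is built for the loops.
    from math import sqrt

    def sieve(k):
        s = [True] * k
        s[0] = s[1] = False
        for i in xrange(4, k, 2):
            s[i] = False
        for i in xrange(3, int(sqrt(k)) + 2, 2):
            if s[i]:
                for j in xrange(i ** 2, k, i * 2):
                    s[j] = False
        return [2] + [i for i in xrange(3, k, 2) if s[i]]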

Why is this? Is this really all GC overhead? A 3x slowdown after 20 runs seems like a lot...
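One quick way to check the GC hypothesis is to disable the collector and see whether the per-run times still climb (a sketch, reusing the sieve defined above):

    # Sketch: if the slowdown persists with the collector disabled,
    # garbage collection is not the (only) culprit.
    import gc
    from time import time

    gc.disable()
    st = time()
    for x in range(20):
        rt = time()
        sieve(10000000)
        print "%3d %.2f %.2f" % (x, time() - rt, (time() - st) / (x + 1))
    gc.enable()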

1 answer

This is not an answer, but just a collection of organized experiments.

This is really intriguing. There seems to be something dubious going on with Python's memory allocation.

Here is my attempt to reduce the test case:

    from math import sqrt
    from time import time

    def sieve(k):
        s = [True] * k
        for i in xrange(3, int(sqrt(k)) + 2, 2):
            for j in range(i ** 2, k, i * 2):
                s[j] = False
        return [i for i in range(3, k, 2) if s[i]]

    st = time()
    for x in range(1000):
        rt = time()
        sieve(10000000)
        print "%3d %.2f %.2f" % (x, time() - rt, (time() - st) / (x + 1))

Note that if I remove the if s[i] filter, make the inner range an xrange, make the return value a generator, or put pass in the inner for loop (or make it s[j] = True), the behaviour disappears and the times are flat.
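For concreteness, here is roughly what two of those variants look like (sketches based on the reduced test case above; the function names are just illustrative):

    from math import sqrt

    # Variant: the inner loop uses xrange, so no temporary list per prime.
    # Times stay flat with this change.
    def sieve_inner_xrange(k):
        s = [True] * k
        for i in xrange(3, int(sqrt(k)) + 2, 2):
            for j in xrange(i ** 2, k, i * 2):
                s[j] = False
        return [i for i in range(3, k, 2) if s[i]]

    # Variant: the return value is a generator instead of a list,
    # so the large output list is never materialised. Times also stay flat.
    def sieve_generator(k):
        s = [True] * k
        for i in xrange(3, int(sqrt(k)) + 2, 2):
            for j in range(i ** 2, k, i * 2):
                s[j] = False
        return (i for i in xrange(3, k, 2) if s[i])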

Python's memory usage grows steadily as the runs progress, eventually reaching a plateau (at which point the run times also start to plateau, at roughly 250% of their initial values).
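A simple way to watch this is to print the process's peak RSS alongside each run (a sketch using the standard resource module, which is Unix-only; the units of ru_maxrss vary by platform):

    import resource
    from time import time

    # Sketch: print peak resident set size after each run to see it
    # grow and then plateau (ru_maxrss is in KB on Linux, bytes on OS X).
    for x in range(20):
        rt = time()
        sieve(10000000)
        rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print "%3d %.2f %d" % (x, time() - rt, rss)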

My hypothesis is that the large number of inner range lists (of decreasing size), plus the resulting output list, cause some worst-case heap fragmentation that makes it very hard to keep allocating objects.
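To get a feel for how much the inner range calls allocate, one can count the temporary lists and their total size per call (a rough estimate, not a measurement of fragmentation itself; allocation_estimate is just an illustrative helper):

    from math import sqrt

    # Sketch: count how many temporary lists the inner range() would build
    # per sieve(10M) call, and how many elements they hold in total.
    def allocation_estimate(k):
        lists, elements = 0, 0
        for i in xrange(3, int(sqrt(k)) + 2, 2):
            lists += 1
            elements += len(xrange(i ** 2, k, i * 2))
        return lists, elements

    print allocation_estimate(10000000)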

My recommendation would be to produce a reduced test case and file it as a bug with the Python developers (bugs.python.org).
