Edit: I realized that there are a lot of optimizations on SO, but they are rarely ever explained by others for a simple sieve algorithm, so they make it difficult for beginners or creators of the first algorithm to approach. All solutions will be in python to be on the same page for speed and optimization. These decisions will gradually become faster and more difficult. :)
Vanilla solution
def primesVanilla(n): r = [True] * n r[0] = r[1] = False for i in xrange(n): if r[i]: for j in xrange(i+i,n,i): r[j]=False return r
This is a very simple sieve implementation. Please make sure that you understand what is happening above before continuing. The only thing to note is that you start to mark non-prime numbers in i + i instead of i, but this is pretty obvious. (Because you assume that I myself am simple). To make the tests honest, all numbers will be up to 25 million for the list.
real 0m7.663s user 0m7.624s sys 0m0.036s
Small improvement 1 (square roots):
I will try to sort them in terms of a direct transition to less direct changes. Please note that we do not need to iterate over n, but only go to the square root of n. The reason is that any composite number under n must have a prime coefficient equal to or equal to the square root of n. When you sit by hand, you will notice that all “invalid” numbers above the square root of n are prime by default.
One more note: you have to be a little careful when the square root is integer, so you have to add it in this case so that it covers it. IE, with n = 49 you want to loop up to 7 inclusive, or you can conclude that 49 is simple.
def primes1(n): r = [True] * n r[0] = r[1] = False for i in xrange(int(n**0.5+1)): if r[i]: for j in xrange(i+i,n,i): r[j]=False return r real 0m4.615s user 0m4.572s sys 0m0.040s
Please note that this is pretty fast. When you think about it, you only loop to the square root, so now it will take 25 million top-level iterations to only 5,000 upper levels.
Minor improvement 2 (Skip in the inner loop):
Note that in the inner loop, instead of the beginning of i + i, we can start with i * i. This follows from a very similar argument with respect to the square root, but the big idea is that any composites between me and I * I were already marked with lesser touches.
def primes2(n): r = [True] * n r[0] = r[1] = False for i in xrange(int(n**0.5+1)): if r[i]: for j in xrange(i*i,n,i): r[j]=False return r real 0m4.559s user 0m4.500s sys 0m0.056s
Good thing a little disappointing. But hey, it's even faster.
Some significant improvement 3 (even skipping):
The idea here is that we can present all even indices and then skip iterations on 2 in the main loop. After that, we can start the outer loop at 3, and the inner loop can skip instead of 2 * i. (Since switching to i instead means that it will be even, (i + i) (i + i + i + i), etc.)
def primes3(n): r = [True] * n r[0] = r[1] = False for i in xrange(4,n,2):r[i]=False for i in xrange(3,int(n**0.5+1),2): if r[i]: for j in xrange(i*i,n,2*i): r[j]=False return r real 0m2.916s user 0m2.872s sys 0m0.040s
Cool improvements 4 (wim idea):
This solution is a pretty advanced trick. The purpose of the slice is faster than the cycle, so in this case the notation of the python fragment is used. r [start: end: skip]
def primes4(n): r = [True] * n r[0] = r[1] = False r[4::2] = [False] * len(r[4::2]) for i in xrange(3,int(1 + n**0.5),2): if r[i]: r[i*i::2*i] = [False] * len(r[i*i::2*i]) return r 10 loops, best of 3: 1.1 sec per loop
Minor improvement 5
Note that python resells r [4 :: 2] when calculating the length, so it takes quite a bit of extra time, since all we need is a length calculation. To do this, we use some nasty math.
def primes5(n): r = [True] * n r[0] = r[1] = False r[4::2] = [False] * ((n+1)/2-2) for i in xrange(3,int(1 + n**0.5),2): if r[i]: r[i*i::2*i] = [False] * ((n+2*i-1-i*i)/(2*i)) return r 10 loops, best of 3: 767 msec per loop
Assign acceleration (Padraic Cunningham) Notice that we assign an array with all True, and then set half (evens) to False. In fact, we can start with a boolean array that alternates.
def primes6(n): r = [False,True] * (n//2)+[True] r[1],r[2]=False, True for i in xrange(3,int(1 + n**0.5),2): if r[i]: r[i*i::2*i] = [False] * ((n+2*i-1-i*i)/(2*i)) return r 10 loops, best of 3: 717 msec per loop
Don't quote me on this, but I think that without any nasty mathematical methods there are no obvious improvements to this latest version. One of the nice features that I tried, but did not turn out to be faster, notes that primes other than 2,3 should look like 6k + 1 or 6k-1. (Note that if it is 6k, then divisible by 6, 6k + 2 | 2, 6k + 3 | 3, 6k + 4 | 2, 6k + 5, compare with -1 mod 6. This suggests that we can skip 6 each time and check both sides, either due to a poor implementation on my side or from inside python elements I couldn’t find any significant speed increase .:(