Sieve of Eratosthenes - Python Prime Search

Just to clarify, this is not a homework problem :)

I wanted to find prime numbers for the mathematical application that I am creating and came across a Sieve of Eratosthenes .

I wrote an implementation in Python. But it is terribly slow. For example, if I want to find all primes less than 2 million. It takes> 20 minutes. (I stopped him at that moment). How can I speed this up?

def primes_sieve(limit): limitn = limit+1 primes = range(2, limitn) for i in primes: factors = range(i, limitn, i) for f in factors[1:]: if f in primes: primes.remove(f) return primes print primes_sieve(2000) 

UPDATE: I ended up profiling this code and found that quite a lot of time was spent removing an item from the list. It’s clear that you need to iterate over the entire list (the worst case) to find the item, then delete it, and then adjust the list (maybe some kind of copy continues?). Anyway, I chose a list for the dictionary. My new implementation is

 def primes_sieve1(limit): limitn = limit+1 primes = dict() for i in range(2, limitn): primes[i] = True for i in primes: factors = range(i,limitn, i) for f in factors[1:]: primes[f] = False return [i for i in primes if primes[i]==True] print primes_sieve1(2000000) 
+63
python math primes sieve-of-eratosthenes
Oct. 15 '10 at 5:16
source share
15 answers

You do not quite implement the correct algorithm:

In your first example, primes_sieve does not support the list of primary flags for deletion / reset (as in the algorithm), but instead constantly resizes the list of integers, which is very expensive: removing an element from the list requires shifting all subsequent items down by one.

In the second example, primes_sieve1 maintains a dictionary of primary flags, which is a step in the right direction, but it iterates over the dictionary in an undefined order and redundantly selects factors factors (and not just prime factors, as in the algorithm). You can fix this by sorting the keys and skipping hard numbers (which already makes it an order of magnitude faster), but it’s much more efficient to just use the list directly.

The correct algorithm (with a list instead of a dictionary) looks something like this:

 def primes_sieve2(limit): a = [True] * limit # Initialize the primality list a[0] = a[1] = False for (i, isprime) in enumerate(a): if isprime: yield i for n in range(i*i, limit, i): # Mark factors non-prime a[n] = False 

(Note that this also includes algorithmic optimization for starting a complex marking in a simple square ( i*i ) instead of its double.)

+101
Oct. 15 '10 at 11:54
source share
 def eratosthenes(n): multiples = [] for i in range(2, n+1): if i not in multiples: print (i) for j in range(i*i, n+1, i): multiples.append(j) eratosthenes(100) 
+10
Jan 04 '14 at 18:01
source share

Removing from the beginning of an array (list) requires moving all the elements after it. This means that removing each item from the list in this way, starting from the front, is an O (n ^ 2) operation.

You can do it much more efficiently with sets:

 def primes_sieve(limit): limitn = limit+1 not_prime = set() primes = [] for i in range(2, limitn): if i in not_prime: continue for f in range(i*2, limitn, i): not_prime.add(f) primes.append(i) return primes print primes_sieve(1000000) 

... or, alternatively, avoid reordering the list:

 def primes_sieve(limit): limitn = limit+1 not_prime = [False] * limitn primes = [] for i in range(2, limitn): if not_prime[i]: continue for f in xrange(i*2, limitn, i): not_prime[f] = True primes.append(i) return primes 
+6
Oct. 15 '10 at 5:27
source share

Much faster:

 import time def get_primes(n): m = n+1 #numbers = [True for i in range(m)] numbers = [True] * m #EDIT: faster for i in range(2, int(n**0.5 + 1)): if numbers[i]: for j in range(i*i, m, i): numbers[j] = False primes = [] for i in range(2, m): if numbers[i]: primes.append(i) return primes start = time.time() primes = get_primes(10000) print(time.time() - start) print(get_primes(100)) 
+2
Aug 14 '15 at 14:07
source share

I understand that this really does not answer the question of how to quickly generate prime numbers, but maybe some of them will find this alternative interesting: since python provides a lazy evaluation through generators, the eratosthenes sieve can be implemented exactly as indicated:

 def intsfrom(n): while True: yield n n += 1 def sieve(ilist): p = next(ilist) yield p for q in sieve(n for n in ilist if n%p != 0): yield q try: for p in sieve(intsfrom(2)): print p, print '' except RuntimeError as e: print e 

The try block exists, because the algorithm works until it hits the stack and without trying to block the reverse trace by clicking the actual output that you want to see on the screen.

+1
Jul 20 '14 at 10:38
source share

By combining contributions from many enthusiasts (including Glenn Maynard and MrHIDEn from the comments above), I came up with the following code snippet in python 2:

 def simpleSieve(sieveSize): #creating Sieve. sieve = [True] * (sieveSize+1) # 0 and 1 are not considered prime. sieve[0] = False sieve[1] = False for i in xrange(2,int(math.sqrt(sieveSize))+1): if sieve[i] == False: continue for pointer in xrange(i**2, sieveSize+1, i): sieve[pointer] = False # Sieve is left with prime numbers == True primes = [] for i in xrange(sieveSize+1): if sieve[i] == True: primes.append(i) return primes sieveSize = input() primes = simpleSieve(sieveSize) 

The time taken to calculate on my machine for different inputs with a capacity of 10 is:

  • 3: 0.3 ms
  • 4: 2.4 ms
  • 5: 23 ms
  • 6: 0.26 s
  • 7: 3.1 s
  • 8:33 s
+1
Sep 04 '16 at 2:03
source share

Simple hacking speed: when you define the variable "primes", set step 2 to automatically skip all even numbers, and set the starting point to 1.

Then you can further optimize instead of i for primes, use for i in primes [: round (len (primes) ** 0.5)]. This will significantly increase productivity. In addition, you can exclude numbers ending in 5 to further increase speed.

0
Feb 01 '15 at 21:07
source share

Using the filter method to sift a list of numbers, you can do the following.

 from math import sqrt def eratosthenes(limit): lst = range(1, limit) for i in range(2, int(sqrt(limit)) + 1): lst = filter(lambda x: x == i or x % i, lst) # sieve return lst print eratosthenes(2000000) 
0
Mar 10 '16 at 12:47
source share

My implementation:

 import math n = 100 marked = {} for i in range(2, int(math.sqrt(n))): if not marked.get(i): for x in range(i * i, n, i): marked[x] = True for i in range(2, n): if not marked.get(i): print i 
0
May 04 '16 at 4:51
source share

Here's a slightly more memory efficient version (and: the correct sieve, not trial units). In principle, instead of storing an array of all numbers and deleting those that are not primary, it stores an array of counters - one for each stroke found - and skips them in front of the proposed prime number. Thus, it uses storage proportional to the number of primes, and not to the highest value.

 import itertools def primes(): class counter: def __init__ (this, n): this.n, this.current, this.isVirgin = n, n*n, True # isVirgin means it never been incremented def advancePast (this, n): # return true if the counter advanced if this.current > n: if this.isVirgin: raise StopIteration # if this is virgin, then so will be all the subsequent counters. Don't need to iterate further. return False this.current += this.n # pre: this.current == n; post: this.current > n. this.isVirgin = False # when it gone, it gone return True yield 1 multiples = [] for n in itertools.count(2): isPrime = True for p in (m.advancePast(n) for m in multiples): if p: isPrime = False if isPrime: yield n multiples.append (counter (n)) 

You will notice that primes() is a generator, so you can save the results in a list or use them directly. Here are the first n prime numbers:

 import itertools for k in itertools.islice (primes(), n): print (k) 

And, for completeness, here is a timer for measuring performance:

 import time def timer (): t, k = time.process_time(), 10 for p in primes(): if p>k: print (time.process_time()-t, " to ", p, "\n") k *= 10 if k>100000: return 

Just in case you are interested, I also wrote primes() as a simple iterator (using __iter__ and __next__ ), and it worked at almost the same speed. They surprised me too!

0
Jan 05 '17 at 19:55
source share

I prefer NumPy because of speed.

 import numpy as np # Find all prime numbers using Sieve of Eratosthenes def get_primes1(n): m = int(np.sqrt(n)) is_prime = np.ones(n, dtype=bool) is_prime[:2] = False # 0 and 1 are not primes for i in range(2, m): if is_prime[i] == False: continue is_prime[i*i::i] = False return np.nonzero(is_prime)[0] # Find all prime numbers using brute-force. def isprime(n): ''' Check if integer n is a prime ''' n = abs(int(n)) # n is a positive integer if n < 2: # 0 and 1 are not primes return False if n == 2: # 2 is the only even prime number return True if not n & 1: # all other even numbers are not primes return False # Range starts with 3 and only needs to go up the square root # of n for all odd numbers for x in range(3, int(n**0.5)+1, 2): if n % x == 0: return False return True # To apply a function to a numpy array, one have to vectorize the function def get_primes2(n): vectorized_isprime = np.vectorize(isprime) a = np.arange(n) return a[vectorized_isprime(a)] 

Check the output:

 n = 100 print(get_primes1(n)) print(get_primes2(n)) [ 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97] [ 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97] 

Compare the speed of the eratosthenes sieves and brute force on a Jupyter laptop. The sieve of Eratosthenes is 539 times faster than brute force for millions of elements.

 %timeit get_primes1(1000000) %timeit get_primes2(1000000) 4.79 ms Β± 90.3 Β΅s per loop (mean Β± std. dev. of 7 runs, 100 loops each) 2.58 s Β± 31.2 ms per loop (mean Β± std. dev. of 7 runs, 1 loop each) 
0
Sep 14 '17 at 10:18
source share

I thought that you could just use an empty list as the final condition for the loop and came up with this:

 limit = 100 ints = list(range(2, limit)) # Will end up empty while len(ints) > 0: prime = ints[0] print prime ints.remove(prime) i = 2 multiple = prime * i while multiple <= limit: if multiple in ints: ints.remove(multiple) i += 1 multiple = prime * i 
0
Mar 02 '18 at 7:24
source share
 import math def sieve(n): primes = [True]*n primes[0] = False primes[1] = False for i in range(2,int(math.sqrt(n))+1): j = i*i while j < n: primes[j] = False j = j+i return [x for x in range(n) if primes[x] == True] 
0
Sep 26 '18 at 4:26
source share

My fastest implementation:

 isprime = [True]*N isprime[0] = isprime[1] = False for i in range(4, N, 2): isprime[i] = False for i in range(3, N, 2): if isprime[i]: for j in range(i*i, N, 2*i): isprime[j] = False 
0
May 04 '19 at 21:44
source share

I think this is the shortest code for finding primes using the Eratosthenes method

 def prime(r): n = range(2,r) while len(n)>0: yield n[0] n = [x for x in n if x not in range(n[0],r,n[0])] print(list(prime(r))) 
0
May 6 '19 at 1:22
source share



All Articles