Quick Simple Factorization Module

Question

Quick Simple Factorization Module

I'm looking for an implementation or cleanup algorithm to get a simple factorization of N in any python, pseudo-code, or other well readable code. There are several requirements / facts:

N is between 1 and ~ 20 digits
No pre-computed lookup table, memoization is ok.
No need to be mathematically proven (for example, if necessary, can rely on the Goldbach hypothesis)
It is not necessary to be precise, it is allowed to be probabilistic / deterministic, if necessary

I need an algorithm for quick simple factorization, not only for myself, but also for use in many other algorithms, such as calculating Euler phi (n).

I tried other algorithms from Wikipedia and such, but I could not understand them (ECM), or I could not create a working implementation from the algorithm (Pollard-Brent).

I'm really interested in the Pollard-Brent algorithm, so any information / implementations on it would be really nice.

Thank!

EDIT

After a bit of communication, I created a fairly quick prime / factorization module. It combines an optimized trial division algorithm, the Pollard-Brent algorithm, the miller-rabin primitive test and the fastest live broadcast I found on the Internet. gcd is a regular implementation of the Euclidean GCD (binary Euclid GCD is much slower than normal).

Bounty

Oh joy, you can get generosity! But how can I win it?

Look for an optimization or error in my module.
Provide alternative / best algorithms / implementations.

The answer that is most comprehensive / constructive receives a reward.

And finally, the module itself:

import random def primesbelow(N): # http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188 #""" Input N>=6, Returns a list of primes, 2 <= p < N """ correction = N % 6 > 1 N = {0:N, 1:N-1, 2:N+4, 3:N+3, 4:N+2, 5:N+1}[N%6] sieve = [True] * (N // 3) sieve[0] = False for i in range(int(N ** .5) // 3 + 1): if sieve[i]: k = (3 * i + 1) | 1 sieve[k*k // 3::2*k] = [False] * ((N//6 - (k*k)//6 - 1)//k + 1) sieve[(k*k + 4*k - 2*k*(i%2)) // 3::2*k] = [False] * ((N // 6 - (k*k + 4*k - 2*k*(i%2))//6 - 1) // k + 1) return [2, 3] + [(3 * i + 1) | 1 for i in range(1, N//3 - correction) if sieve[i]] smallprimeset = set(primesbelow(100000)) _smallprimeset = 100000 def isprime(n, precision=7): # http://en.wikipedia.org/wiki/Miller-Rabin_primality_test#Algorithm_and_running_time if n < 1: raise ValueError("Out of bounds, first argument must be > 0") elif n <= 3: return n >= 2 elif n % 2 == 0: return False elif n < _smallprimeset: return n in smallprimeset d = n - 1 s = 0 while d % 2 == 0: d //= 2 s += 1 for repeat in range(precision): a = random.randrange(2, n - 2) x = pow(a, d, n) if x == 1 or x == n - 1: continue for r in range(s - 1): x = pow(x, 2, n) if x == 1: return False if x == n - 1: break else: return False return True # https://comeoncodeon.wordpress.com/2010/09/18/pollard-rho-brent-integer-factorization/ def pollard_brent(n): if n % 2 == 0: return 2 if n % 3 == 0: return 3 y, c, m = random.randint(1, n-1), random.randint(1, n-1), random.randint(1, n-1) g, r, q = 1, 1, 1 while g == 1: x = y for i in range(r): y = (pow(y, 2, n) + c) % n k = 0 while k < r and g==1: ys = y for i in range(min(m, rk)): y = (pow(y, 2, n) + c) % n q = q * abs(xy) % n g = gcd(q, n) k += m r *= 2 if g == n: while True: ys = (pow(ys, 2, n) + c) % n g = gcd(abs(x - ys), n) if g > 1: break return g smallprimes = primesbelow(1000) # might seem low, but 1000*1000 = 1000000, so this will fully factor every composite < 1000000 def primefactors(n, sort=False): factors = [] for checker in smallprimes: while n % checker == 0: factors.append(checker) n //= checker if checker > n: break if n < 2: return factors while n > 1: if isprime(n): factors.append(n) break factor = pollard_brent(n) # trial division did not fully factor, switch to pollard-brent factors.extend(primefactors(factor)) # recurse to factor the not necessarily prime factor returned by pollard-brent n //= factor if sort: factors.sort() return factors def factorization(n): factors = {} for p1 in primefactors(n): try: factors[p1] += 1 except KeyError: factors[p1] = 1 return factors totients = {} def totient(n): if n == 0: return 1 try: return totients[n] except KeyError: pass tot = 1 for p, exp in factorization(n).items(): tot *= (p - 1) * p ** (exp - 1) totients[n] = tot return tot def gcd(a, b): if a == b: return a while b > 0: a, b = b, a % b return a def lcm(a, b): return abs((a // gcd(a, b)) * b)

+55

python algorithm prime-factoring

orlp Jan 10 2018-11-11T00:

source share

8 answers

If you don't want to reinvent the wheel, use the sympy library

 pip install sympy

Use sympy.ntheory.factorint function

 >>> from sympy.ntheory import factorint >>> factorint(10**20+1) {73: 1, 5964848081: 1, 1676321: 1, 137: 1}

You can specify some very large numbers:

 >>> factorint(10**100+1) {401: 1, 5964848081: 1, 1676321: 1, 1601: 1, 1201: 1, 137: 1, 73: 1, 129694419029057750551385771184564274499075700947656757821537291527196801: 1}

+26

Colonel Panic Aug 13 '15 at 11:03

source share

There is no need to calculate smallprimes using primesbelow , use smallprimeset for this.

smallprimes = (2,) + tuple(n for n in xrange(3,1000,2) if n in smallprimeset)

Divide primefactors into two functions for processing smallprimes and others for pollard_brent , this can save a couple of iterations, since all powers of smallprimes will be divided by n.

 def primefactors(n, sort=False): factors = [] limit = int(n ** .5) + 1 for checker in smallprimes: print smallprimes[-1] if checker > limit: break while n % checker == 0: factors.append(checker) n //= checker if n < 2: return factors else : factors.extend(bigfactors(n,sort)) return factors def bigfactors(n, sort = False): factors = [] while n > 1: if isprime(n): factors.append(n) break factor = pollard_brent(n) factors.extend(bigfactors(factor,sort)) # recurse to factor the not necessarily prime factor returned by pollard-brent n //= factor if sort: factors.sort() return factors

By examining the proven results of Pomerance, Selfridge, and Wagstaff and Yeshke, you can reduce the repetitions in isprime , which uses the Miller-Rabin primitive test. From the Wiki .

if n <1,373,653, it is enough to check a = 2 and 3;
if n <9,080,191, it is enough to check a = 31 and 73;
if n <4,759,123,141, it is enough to check a = 2, 7 and 61;
if n <2,152,302,898,747, it is enough to check a = 2, 3, 5, 7 and 11;
if n <3,474,749,660,383, it is enough to check a = 2, 3, 5, 7, 11 and 13;
if n <341,550,071,728,321, it is enough to check a = 2, 3, 5, 7, 11, 13 and 17.

Change 1 . Fixed if-else callback to add bigfactors to factors in primefactors .

+10

Rozuur Jan 19 '11 at 19:33

source share

Even on the current one, there are a few spots that need to be noticed.

Do not run checker*checker every loop, use s=ceil(sqrt(num)) and checher < s
checher should plus 2 every time, ignore all even numbers except 2
Use divmod instead of % and //

+4

Kabie Jan 10 2018-11-11T00:

source share

You probably need to do some basic discovery, which you can see here, Fast algorithm for finding prime numbers?

You should read the entire blog, although there are several algorithms that it lists for validation.

+2

milkypostman Jan 10 2018-11-11T00:

source share

There is a python library with a set of primitiveness tests (including the wrong ones for what not to do). It is called pyprimes . It is clear that the purpose of the offspring is worth mentioning. I do not think that it includes the algorithms you mentioned.

+2

Ehtesh Choudhury Oct 09 '13 at 2:24

source share

I just ran into an error in this code when factorizing the number 2**1427 * 31 .

  File "buckets.py", line 48, in prettyprime factors = primefactors.primefactors(n, sort=True) File "/private/tmp/primefactors.py", line 83, in primefactors limit = int(n ** .5) + 1 OverflowError: long int too large to convert to float

This piece of code:

 limit = int(n ** .5) + 1 for checker in smallprimes: if checker > limit: break while n % checker == 0: factors.append(checker) n //= checker limit = int(n ** .5) + 1 if checker > limit: break

should be rewritten in

 for checker in smallprimes: while n % checker == 0: factors.append(checker) n //= checker if checker > n: break

which is likely to run faster on real inputs. The square root is slow - basically the equivalent of many multiplications -, smallprimes has only a few dozen members, and so we remove the calculation of n ** .5 from the hard inner loop, which is certainly useful when factorizing numbers like 2**1427 . There is simply no reason to calculate sqrt(2**1427) , sqrt(2**1426) , sqrt(2**1425) , etc. Etc., When all we care about is “does the [square] controller exceed n ”.

The rewritten code does not generate exceptions for large numbers, and is about two times faster than timeit (on entering samples 2 and 2**718 * 31 ).

Also note that isprime(2) returns an incorrect result, but this is normal if we do not rely on it. IMHO you should rewrite the input of this function as

 if n <= 3: return n >= 2 ...

0

Quuxplusone Oct 27 '14 at 8:29

source share

You can factorize to the limit, and then use brent to get higher odds.

 from fractions import gcd from random import randint def brent(N): if N%2==0: return 2 y,c,m = randint(1, N-1),randint(1, N-1),randint(1, N-1) g,r,q = 1,1,1 while g==1: x = y for i in range(r): y = ((y*y)%N+c)%N k = 0 while (k<r and g==1): ys = y for i in range(min(m,rk)): y = ((y*y)%N+c)%N q = q*(abs(xy))%N g = gcd(q,N) k = k + m r = r*2 if g==N: while True: ys = ((ys*ys)%N+c)%N g = gcd(abs(x-ys),N) if g>1: break return g def factorize(n1): if n1==0: return [] if n1==1: return [1] n=n1 b=[] p=0 mx=1000000 while n % 2 ==0 : b.append(2);n//=2 while n % 3 ==0 : b.append(3);n//=3 i=5 inc=2 while i <=mx: while n % i ==0 : b.append(i); n//=i i+=inc inc=6-inc while n>mx: p1=n while p1!=p: p=p1 p1=brent(p) b.append(p1);n//=p1 if n!=1:b.append(n) return sorted(b) from functools import reduce #n= 2**1427 * 31 # n= 67898771 * 492574361 * 10000223 *305175781* 722222227*880949 *908909 li=factorize(n) print (li) print (n - reduce(lambda x,y :x*y ,li))

0

Antoni Gual Via Apr 20 '15 at 9:30

source share

Amber · Accepted Answer · 2011-01-10 04:27

Pollard-Brent implemented in Python:

https://comeoncodeon.wordpress.com/2010/09/18/pollard-rho-brent-integer-factorization/

Quick Simple Factorization Module

Bounty

More articles: