Finding the lowest sequence of collates that gives more than 65 primes

I have a task where I need to find the lowest Collatz sequence containing more than 65 primes in Python.

For example, the Collatz sequence for 19 is:

19, 58, 29, 88, 44, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1

This sequence contains 7 primes.

I also need to use memoization, so it doesn’t need to run “year” to find it. I found code for memoizing Collatz sequences, but I can't figure out how to make it work when I need only prime numbers.

Here is the memoization Collatz code I found:

lookup = {} def countTerms(n): if n not in lookup: if n == 1: lookup[n] = 1 elif not n % 2: lookup[n] = countTerms(n / 2)[0] + 1 else: lookup[n] = countTerms(n*3 + 1)[0] + 1 return lookup[n], n 

And here is my tester for easy:

 def is_prime(a): for i in xrange(2,a): if a%i==0: #print a, " is not a prime number" return False if a==1: return False else: return True 
+5
source share
2 answers

Existing code is indented incorrectly. I believe that this task is a home task, so I will not publish a complete working solution, but I will give you some useful snippets.

Firstly, here is a slightly more effective strength tester. Instead of testing if all numbers less than a are factors of a , he simply checks the square root of a .

 def is_prime(a): for i in xrange(2, int(1 + a ** 0.5)): if a % i == 0: return False return True 

Note that this function returns True for a = 1 . This is normal since you do not need to test 1: you can preload it in a lookup dict:

 lookup = {1:0} 

Your countTerms function countTerms to be slightly modified so that it only adds it to the lookup count when the current n is simple. In Python, False has a numeric value of 0 and True has a numeric value of 1. This is very convenient here:

 def count_prime_terms(n): ''' Count the number of primes terms in a Collatz sequence ''' if n not in lookup: if n % 2: next_n = n * 3 + 1 else: next_n = n // 2 lookup[n] = count_prime_terms(next_n) + is_prime(n) return lookup[n] 

I changed the name of the function to be more Pythonic.

FWIW, the first Collatz sequence containing 65 or more primes, actually contains 67 primes. Its number of seeds exceeds 1.8 million, and the largest number that requires testing of primitives when checking all sequences before this seed is 151629574372. Upon completion, lookup dict contains 3920492 records.


In response to James Mills's comments regarding recursion, I wrote a non-recursive version, and to make it easy to see that the iterative and recursive versions give the same results, I submit the full working program. I said above that I am not going to do this, but I believe that now everything is in order, since spørreren already wrote his program using the information that I presented in my initial answer.

I completely agree that it is useful to avoid recursion, except in situations where it is suitable for a problem area (for example, traversing a tree). Python prevents recursion - it cannot optimize tail call recursion and imposes a restriction on the recursion depth (although this limit can be changed if necessary).

This simple Collatz sequence counting algorithm is naturally recursively recursive, but it's not so difficult to do this - we just need a list to temporarily hold the sequence while the sequence of all its members is determined. True, this list takes up RAM, but it is (possibly) much more efficient in size than the frame frame requirements necessary for the recursive version.

The recursive version reaches 343 recursion depth when solving a problem in the OP. This is good in the default limit, but it is still not very good, and if you want to search for sequences containing a much larger number of primes, you will fall into this limit.

Iterative and recursive versions work at about the same speed (at least they do this on my machine). To solve the problem indicated in the OP, they take a little less than 2 minutes. This is significantly faster than my original solution, mainly due to the optimization of testing primitives.

The main step in generating the Collatz sequence is to determine whether the number is odd or even. It is clear that if we already know that the number is equal, then there is no need to check whether it is prime. :) And we can also exclude tests for even factors in the is_prime function. We can handle the fact that 2 is simple by simply loading the result for 2 into cache << 26>.

In the corresponding note, when searching for the first sequence containing a given number of primes, we do not need to check any of the sequences starting with an even number. Even numbers (except 2) do not increase the prime number of the sequence, and since the first odd number in such a sequence will be less than our current number, its results will already be in the lookup cache if we start our search with 3. And if we do not start the search with 3, we just need to make sure that the initial seed is low enough so that we do not accidentally miss the first sequence containing the desired number of primes. Adopting this strategy not only reduces time, but also reduces the number of entries in the search cache.

 #!/usr/bin/env python ''' Find the 1st Collatz sequence containing a given number of prime terms From http://stackoverflow.com/q/29920691/4014959 Written by PM 2Ring 2015.04.29 [Seed == 1805311, prime count == 67] ''' import sys def is_prime(a): ''' Test if odd `a` >= 3 is prime ''' for i in xrange(3, int(1 + a ** 0.5), 2): if not a % i: return 0 return 1 #Track the highest number generated so far; use a list # so we don't have to declare it as global... hi = [2] #Cache for sequence prime counts. The key is the sequence seed, # the value is the number of primes in that sequence. lookup = {1:0, 2:1} def count_prime_terms_iterative(n): ''' Count the number of primes terms in a Collatz sequence Iterative version ''' seq = [] while n not in lookup: if n > hi[0]: hi[0] = n if n % 2: seq.append((n, is_prime(n))) n = n * 3 + 1 else: seq.append((n, 0)) n = n // 2 count = lookup[n] for n, isprime in reversed(seq): count += isprime lookup[n] = count return count def count_prime_terms_recursive(n): ''' Count the number of primes terms in a Collatz sequence Recursive version ''' if n not in lookup: if n > hi[0]: hi[0] = n if n % 2: next_n = n * 3 + 1 isprime = is_prime(n) else: next_n = n // 2 isprime = 0 lookup[n] = count_prime_terms(next_n) + isprime return lookup[n] def find_seed(numprimes, start): ''' Find the seed of the 1st Collatz sequence containing `numprimes` primes, starting from odd seed `start` ''' i = start mcount = 0 print 'seed, prime count, highest term, dict size' while mcount < numprimes: count = count_prime_terms(i) if count > mcount: mcount = count print i, count, hi[0], len(lookup) i += 2 #count_prime_terms = count_prime_terms_recursive count_prime_terms = count_prime_terms_iterative def main(): if len(sys.argv) > 1: numprimes = int(sys.argv[1]) else: print 'Usage: %s numprimes [start]' % sys.argv[0] exit() start = int(sys.argv[2]) if len(sys.argv) > 2 else 3 #Round `start` up to the next odd number if start % 2 == 0: start += 1 find_seed(numprimes, start) if __name__ == '__main__': main() 

When starting from

$ ./CollatzPrimes.py 65

conclusion

 seed, prime count, highest term, dict size 3 3 16 8 7 6 52 18 19 7 160 35 27 25 9232 136 97 26 9232 230 171 28 9232 354 231 29 9232 459 487 30 39364 933 763 32 250504 1626 1071 36 250504 2197 4011 37 1276936 8009 6171 43 8153620 12297 10971 44 27114424 21969 17647 48 27114424 35232 47059 50 121012864 94058 99151 51 1570824736 198927 117511 52 2482111348 235686 202471 53 17202377752 405273 260847 55 17202377752 522704 481959 59 24648077896 966011 963919 61 56991483520 1929199 1564063 62 151629574372 3136009 1805311 67 151629574372 3619607 
+3
source

Let's start with a function that determines if a number is prime; we will use the Miller-Rabin algorithm, which is faster than trial division, for the size of the numbers with which we will deal:

 from random import randint def isPrime(n, k=5): # miller-rabin if n < 2: return False for p in [2,3,5,7,11,13,17,19,23,29]: if n % p == 0: return n == p s, d = 0, n-1 while d % 2 == 0: s, d = s+1, d/2 for i in range(k): x = pow(randint(2, n-1), d, n) if x == 1 or x == n-1: continue for r in range(1, s): x = (x * x) % n if x == 1: return False if x == n-1: break else: return False return True 

Since we want to find the first number that satisfies the criteria, our strategy will be to start with 2 and work on us, preserving the results as we progress. We fill the cache (which is a bad pun, sorry) with the first numbers in Collatz sequences from 0 to 2:

 primeCount = [0,0,1] 

The pCount(n) function counts the primes in the Collatz sequence for n. As soon as the current value of the sequence k falls below n, we look at the result in the cache. Until then, we check every odd number in the Collatz sequence for primitiveness and, if necessary, increase the number p. When we have a primary number for n, we add it to the cache and return it.

 def pCount(n): k, p = n, 0 while k > 0: if k < n: t = p + primeCount[k] primeCount.append(t) return t elif k % 2 == 0: k = k / 2 elif isPrime(k): p = p + 1 k = 3*k + 1 else: k = 3*k + 1 

Now it's just a matter of calculating simple counts for each n, starting at 3, stopping when the primary count is over 65:

 n = 3 t = pCount(n) while t < 65: n = n + 1 t = pCount(n) 

It does not take much time, less than a minute on my computer. Here is the result:

 print n 

The result is 67 primes. If you want to see them, here is a simple function that prints a Collatz sequence for a given n:

 def collatz(n): cs = [] while n != 1: cs.append(n) n = 3*n+1 if n&1 else n//2 cs.append(1) return cs 

And here is a list of primes:

 filter(isPrime,collatz(n)) 

What a fun problem recreational math!

EDIT:. Since people asked about the Miller-Rabin primitive test, let me show you this simple strength tester based on a 2,3,5-wheel; it performs trial divisions by 2, 3, and 5 and numbers that are not multiples of 2, 3, or 5, including some composites, so it’s a little less efficient than trial divisions by primes, but there’s no need to pre-calculate and store primes, therefore it is much easier to use.

 def isPrime(n): # 2,3,5-wheel if n < 2: return False wheel = [1,2,2,4,2,4,2,4,6,2,6] w, next = 2, 0 while w * w <= n: if n % w == 0: return False w = w + wheel[next] next = next + 1 if next > 10: next = 3 return True 

The statement filter(isPrime,range(1000000)) identifies 78,498 primes less than a million in a few seconds. You might want to compare the timings for this strength tester versus the Miller-Rabin tester and determine where the intersection point is in terms of execution efficiency; I did not do any formal timings, but it seems to have solved the 65-collatz-prime problem at about the same time as the Miller-Rabin tester. Or you can combine a trial unit with a 2,3,5-wheel to a certain limit, say 1,000 or 10,000 or 25,000, and then run Miller-Rabin on the survivors; which quickly eliminates most composites, so it can work very quickly.

+1
source

All Articles