Separation Function Optimization

Question

Separation Function Optimization

Here is the code in python:

# function for pentagonal numbers def pent (n): return int((0.5*n)*((3*n)-1)) # function for generalized pentagonal numbers def gen_pent (n): return pent(int(((-1)**(n+1))*(round((n+1)/2)))) # array for storing partitions - first ten already stored partitions = [1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42] # function to generate partitions def partition (k): if (k < len(partitions)): return partitions[k] total, sign, i = 0, 1, 1 while (k - gen_pent(i)) >= 0: sign = (-1)**(int((i-1)/2)) total += sign*(partition(k - gen_pent(i))) i += 1 partitions.insert(k,total) return total

It uses this method to calculate partitions:

 p(k) = p(k − 1) + p(k − 2) − p(k − 5) − p(k − 7) + p(k − 12) + p(k − 15) ...

(source: Wikipedia )

However, the code takes quite a while when it comes to large numbers (over p (10 ^ 3)). I want to ask you if I can optimize my code or hint at another, but faster algorithm or approach. Any suggestions for optimization are welcome.

And no, before you ask, this is not for homework or Project Euler, it is for the value of experience.

+7

optimization python partitioning

Khaled Jul 2 '10 at 8:49

source share

1 answer

Gary kerr · Accepted Answer · 2010-07-02T12:00:56+0000

Here are some comments. Please note that I am not an expert on this subject, I also like to communicate with mathematics (and Project Euler).

I redefined the functions of the pentagon as follows:

 def pent_new(n): return (n*(3*n - 1))/2 def gen_pent_new(n): if n%2: i = (n + 1)/2 else: i = -n/2 return pent_new(i)

I wrote them in such a way as not to introduce floating point calculations - errors will be introduced for large n using float (compare the results for n = 100000001 ).

You can use the timeit module to check for small pieces of code. When I checked all the pentagonal functions (yours and mine), the results for mine were faster. Below is an example that tests your gen_pent function.

 # Add this code to your script t = Timer("for i in xrange(1, 100): gen_pent(i)", "from __main__ import gen_pent") print t.timeit()

Here is a small modification to your partition function. Again, testing with timeit shows that it is faster than your partition function. However, this may be due to an improvement in the functions of the pentagon.

 def partition_new(n): try: return partitions_new[n] except IndexError: total, sign, i = 0, 1, 1 k = gen_pent_new(i) while n - k >= 0: total += sign*partition_new(n - k) i += 1 if i%2: sign *= -1 k = gen_pent_new(i) partitions_new.insert(n, total) return total

In your aggregate function, you calculate the total pentagonal number two times for each cycle. One time to check while while, and another to update total . I save the result in a variable, so I do the calculation only once per cycle.

Using the cProfile module (for python> = 2.5, otherwise the profile module) you can see where your code spends most of its time. Here is an example based on your statistics.

 import cProfile import pstats cProfile.run('partition(70)', 'partition.test') p = pstats.Stats('partition.test') p.sort_stats('name') p.print_stats()

The following result appeared in the console window:

 C:\Documents and Settings\ags97128\Desktop>test.py Fri Jul 02 12:42:15 2010 partition.test 4652 function calls (4101 primitive calls) in 0.015 CPU seconds Ordered by: function name ncalls tottime percall cumtime percall filename:lineno(function) 552 0.001 0.000 0.001 0.000 {len} 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 60 0.000 0.000 0.000 0.000 {method 'insert' of 'list' objects} 1 0.000 0.000 0.015 0.015 <string>:1(<module>) 1162 0.002 0.000 0.002 0.000 {round} 1162 0.006 0.000 0.009 0.000 C:\Documents and Settings\ags97128\Desktop\test.py:11(gen_pent) 552/1 0.005 0.000 0.015 0.015 C:\Documents and Settings\ags97128\Desktop\test.py:26(partition) 1162 0.002 0.000 0.002 0.000 C:\Documents and Settings\ags97128\Desktop\test.py:5(pent)

Now profile my section function:

 Fri Jul 02 12:50:10 2010 partition.test 1836 function calls (1285 primitive calls) in 0.006 CPU seconds Ordered by: function name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 60 0.000 0.000 0.000 0.000 {method 'insert' of 'list' objects} 1 0.000 0.000 0.006 0.006 <string>:1(<module>) 611 0.002 0.000 0.003 0.000 C:\Documents and Settings\ags97128\Desktop\test.py:14(gen_pent_new) 552/1 0.003 0.000 0.006 0.006 C:\Documents and Settings\ags97128\Desktop\test.py:40(partition_new) 611 0.001 0.000 0.001 0.000 C:\Documents and Settings\ags97128\Desktop\test.py:8(pent_new)

You can see that in my results there are no calls to the built-in len and round functions. And I almost halved the number of calls to pentagonal functions ( gen_pent_new and pent_new )

There are probably other tricks to improving python code speed. I would search here for "python optimization" to find out what you can find.

Finally, one of the links at the bottom of the wikipedia page that you mentioned is an academic article on fast algorithms for calculating partition numbers. A quick glance shows that it contains pseudo-code for the algorithms. These algorithms are likely to be faster than anything you or I could produce.

Separation Function Optimization

More articles: