According to the multiplicative formula from Wikipedia, C(n, k) = ∏_{i=1..k} (n − k + i) / i, the fastest way is to split the range i = 1..k into as many segments as there are threads, give each thread one segment of the range, and have each thread update the final result under a lock. The "academic" way is to break the range down into tasks, where each task computes one factor (n − k + i) / i; then, no matter how many threads you have, they all loop, asking for the next task. The first is faster, the second is more academic.

EDIT: further explanation. In both cases we have a certain number of threads. Typically the number of threads equals the number of processor cores, because there is no benefit in adding more. The difference between the two methods is what these threads do.
In the first case, each thread is given N, K, I1 and I2, where [I1, I2] is a segment of the range 1..K. Each thread then has all the data it needs, so it computes its part of the result and, when it finishes, updates the final result.
In the second case, each thread is given N, K and access to a synchronized counter that counts from 1 to K. Each thread takes one value from this shared counter, computes one factor of the result, updates the final result, and loops until the counter tells it there are no more elements. If some processor cores happen to be faster than others, this second method maximizes the use of all cores. Its disadvantage is the extra synchronization, which effectively keeps, say, 20% of the threads blocked at any given time.
Dialecticus