OR-multiplication of large integers

The multiplication of two n-bit numbers A and B can be understood as a sum of shifts:

(A << i1) + (A << i2) + ... 

where i1, i2, ... are the positions of the bits that are set to 1 in B.

Now replace the PLUS with an OR to get the operation that I actually need:

  (A << i1) | (A << i2) | ... 

This operation is very similar to regular multiplication, for which many fast algorithms exist (e.g. Schönhage-Strassen). Is there a similarly fast algorithm for the operation presented here?

The size of the numbers is 6000 bits.

edit For some reason, I don't have a link / button to post comments (any idea why?), so I will edit my question instead. I am really looking for an algorithm faster than O(n^2) for the operation defined above. And yes, I know that this is not ordinary multiplication.
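For reference, the naive version I am trying to beat looks like this (a sketch using .NET's System.Numerics.BigInteger; OrMultiply is just my name for it):

using System.Numerics;

static BigInteger OrMultiply(BigInteger a, BigInteger b)
{
    // OR together (a << i) for every bit position i that is set in b.
    // Each OR touches O(n) bits, so the whole loop is O(n^2).
    BigInteger result = BigInteger.Zero;
    for (int i = 0; !b.IsZero; i++, b >>= 1)
    {
        if (!b.IsEven)        // bit i of the original b is set
            result |= a << i;
    }
    return result;
}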

+4
5 answers

Is there a similar algorithm? I think probably not.

Is there a way to do better than O(n^2)? Maybe. If you think of the number A as analogous to the polynomial A(x) = Σ a_n x^n, where the a_n are the binary digits of A, then your bitwise OR operation (call it A ⊕ B) can be expressed as follows, where "⇔" means "is analogous to":

A ⇔ A(x) = Σ a_n x^n

B ⇔ B(x) = Σ b_n x^n

C = A ⊕ B ⇔ C(x) = f(A(x) B(x)) = f(V(x)), where f(V(x)) = f(Σ v_n x^n) = Σ u(v_n) x^n, with u(v_n) = 0 if v_n = 0 and u(v_n) = 1 otherwise.
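To make f concrete, a tiny example (mine): take A = 11 (binary) ⇔ A(x) = x + 1 and B = 11 (binary) ⇔ B(x) = x + 1. Then A(x) B(x) = x^2 + 2x + 1, and f collapses the middle coefficient: f(x^2 + 2x + 1) = x^2 + x + 1 ⇔ 111 (binary), which matches (11 << 0) | (11 << 1) = 111.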

Basically, you take the equivalent of the two polynomials, multiply them together, and then identify all the nonzero terms. In bit-string terms, this means treating each bit string as an array of sample values (zeros and ones), convolving the two arrays, and collapsing the result to the positions that are non-zero. There are fast convolution algorithms that are O(n log n), using FFTs for instance, and the "collapse" step here is O(n)... but I wonder whether the O(n log n) estimate for fast convolution treats some step (such as multiplying large integers) as O(1), in which case you would not actually end up with a faster algorithm. Either that, or the constants of proportionality are so large that you would need thousands of bits before you saw any speed advantage. ORing is so simple.
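To make the convolve-and-collapse idea concrete, here is a sketch (my code, not part of any library; the convolution is written out naively, which is itself O(n^2); the point is that an FFT-based convolution could replace that step and make it O(n log n)):

static byte[] OrMultiplyByConvolution(byte[] aBits, byte[] bBits)
{
    // aBits/bBits hold the binary digits a_n, b_n, least significant first.
    int n = aBits.Length + bBits.Length - 1;
    var v = new int[n];

    // Convolution step: v = a conv b. Naive here; an FFT would do it in O(n log n).
    for (int i = 0; i < aBits.Length; i++)
        for (int j = 0; j < bBits.Length; j++)
            v[i + j] += aBits[i] * bBits[j];

    // Collapse step f: u(v_n) = 0 if v_n = 0, else 1. This part is O(n).
    var c = new byte[n];
    for (int k = 0; k < n; k++)
        c[k] = (byte)(v[k] != 0 ? 1 : 0);
    return c;
}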

edit: something called "binary convolution" appears (see this book), which sounds terribly relevant here, but I cannot find any good references on the theory behind it or on whether fast algorithms exist.

edit 2: maybe the term is "logical convolution" or "bitwise convolution"... here a page from CPAN (bleah!) talks a little about it, along with the Walsh and Hadamard transforms, which are roughly bitwise analogs of the Fourier transform... hmm, no, those seem to be the counterpart for XOR, not OR.

+1

I suppose you are asking for a name for the additive technique you described when you wrote "Is there a similar algorithm for the operation presented here?"...

Have you considered the "Peasant Multiplication" method?

Please read the Wikipedia description if you do not get the third column in this example.

  B  X  A
 27  X  15 : 1
 13     30 : 1
  6     60 : 0
  3    120 : 1
  1    240 : 1

 B is 27 == binary form 11011b
 27 x 15 = 15 + 30 + 120 + 240
         = 15<<0 + 15<<1 + 15<<3 + 15<<4
         = 405

Sound familiar?


Here is your algorithm.

  • Choose the smaller number as A
  • Initialize C as the result accumulator
  • As long as B is not zero:
    • if the lsb of B is 1, add A to C
    • left shift A once
    • right shift B once
  • C holds your multiplication result (unless you overflowed sizeof C)
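In C# this loop would look something like the sketch below (mine, using BigInteger so 6000-bit values fit; change the += to |= and you have exactly the shift-OR operation from the question):

using System.Numerics;

static BigInteger PeasantMultiply(BigInteger a, BigInteger b)
{
    BigInteger c = BigInteger.Zero;   // result accumulator
    while (!b.IsZero)
    {
        if (!b.IsEven)    // lsb of B is 1
            c += a;       // use c |= a; for the shift-OR variant
        a <<= 1;          // left shift A once
        b >>= 1;          // right shift B once
    }
    return c;
}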

Update If you are after a fast algorithm for the shift-and-OR operation on 6000-bit numbers,
there may actually be one. I will think about it a little more.

It will look like "blurring" one number over the other. Interesting.
Here is a pretty crude example:

 110000011 X 1010101 would look like

       110000011
     110000011
   110000011
 110000011
 ---------------
 111111111111111

The number of 1s in the two numbers will determine the amount of blurring toward a number with all of its bits set.
I wonder what you want to do with it...


Update2 This is the nature of the shift+OR operation on two 6000-bit numbers.

  • The result will be 12,000 bits, of course
  • the operation can be performed as two bit streams; it does not need to be carried out in full
  • the "middle" part of the 12,000-bit stream will almost certainly be all 1s (if both numbers are non-zero)
  • the problem is determining the depth to which we need to process this operation to get both ends of the 12,000-bit stream
  • the pattern at the two ends of the stream will depend on the longest runs of consecutive 1s present in the two numbers

I have not arrived at a clean algorithm yet. Posting the update for anyone who wants to double-check or take it further from here. Also, a description of why you need such an operation might motivate further interest :-)

0

You can do this in O(#1-bits in A * #1-bits in B).

a_bitnums = {x for x in range(A.bit_length()) if (1 << x) & A}
b_bitnums = {x for x in range(B.bit_length()) if (1 << x) & B}
c_set = 0
for a_bit in a_bitnums:
    for b_bit in b_bitnums:
        c_set |= 1 << (a_bit + b_bit)

This can be useful if A and B are sparse (have few 1 bits).

0

The best I could come up with uses short-circuiting loop logic. Combined with the option of switching to the non-zero-bit approach described by themis, you can answer your question by checking fewer than 2% of the N^2 cells.

Below is code that reports timings for numbers that are 80% to 99% zeros. When the numbers are around 88% zeros, switching to themis's approach becomes "better" (that switch is not coded in the example below, however).

This is not a very theoretical solution, but it is practical.

OK, here is some “theory” of the problem space:

In principle, each bit of X (the output) is the OR of the bits along one diagonal of a grid built with the bits of A across the top (MSB to LSB, left to right) and the bits of B down the side (MSB to LSB, top to bottom). Since a bit of X is 1 as soon as any cell on its diagonal is 1, you can exit early and skip the remaining cells.

The code below shows that even for numbers that are ~87% zeros, you only need to check ~2% of the cells. For denser numbers (more 1s), that percentage drops even further.

In other words, I would not bother with complex algorithms and would just do an efficient logical check. I think the trick is to view the bits of the output as the diagonals of the grid, rather than as the bits of A shift-ORed with the bits of B. The trickiest part is tracking which bits of A and B you can see on each diagonal and indexing the bits correctly.

Hope this makes sense. Let me know if I need to explain it a bit more (or if you find any problems with this approach).

NOTE: If we knew your problem space a little better, we could optimize the algorithm accordingly. If your numbers are mostly non-zero, then this approach is better than themis's, since his would require more computation and more storage (sizeof(int) * NNZ).

NOTE 2: This assumes the data is stored as raw bits; I use the .NET BitArray to store and access it. I do not think that causes serious headaches when translating to other languages. The basic idea still applies.

using System;
using System.Collections;

namespace BigIntegerOr
{
    class Program
    {
        private static Random r = new Random();

        private static BitArray WeightedToZeroes(int size, double pctZero, out int nnz)
        {
            nnz = 0;
            BitArray ba = new BitArray(size);
            for (int i = 0; i < size; i++)
            {
                ba[i] = (r.NextDouble() < pctZero) ? false : true;
                if (ba[i]) nnz++;
            }
            return ba;
        }

        static void Main(string[] args)
        {
            // make sure there are enough bytes to hold the 6000 bits
            int size = (6000 + 7) / 8;
            int bits = size * 8;

            Console.WriteLine("PCT ZERO\tSECONDS\t\tPCT CELLS\tTOTAL CELLS\tNNZ APPROACH");
            for (double pctZero = 0.8; pctZero < 1.0; pctZero += 0.01)
            {
                // fill the "BigInts"
                int nnzA, nnzB;
                BitArray a = WeightedToZeroes(bits, pctZero, out nnzA);
                BitArray b = WeightedToZeroes(bits, pctZero, out nnzB);

                // this is the answer "BigInt" that is at most twice the size minus 1
                int xSize = bits * 2 - 1;
                BitArray x = new BitArray(xSize);

                int LSB, MSB;
                LSB = MSB = bits - 1;

                // stats
                long cells = 0;
                DateTime start = DateTime.Now;

                for (int i = 0; i < xSize; i++)
                {
                    // compare using the diagonals
                    for (int bit = LSB; bit < MSB; bit++)
                    {
                        cells++;
                        x[i] |= (b[MSB - bit] && a[bit]);
                        if (x[i]) break;
                    }

                    // update the window over the bits
                    if (LSB == 0)
                    {
                        MSB--;
                    }
                    else
                    {
                        LSB--;
                    }

                    //Console.Write(".");
                }

                // stats
                TimeSpan elapsed = DateTime.Now.Subtract(start);
                double pctCells = (cells * 100.0) / (bits * bits);
                Console.WriteLine(pctZero.ToString("p") + "\t\t" +
                    elapsed.TotalSeconds.ToString("00.000") + "\t\t" +
                    pctCells.ToString("00.00") + "\t\t" +
                    cells.ToString("00000000") + "\t" +
                    (nnzA * nnzB).ToString("00000000"));
            }

            Console.ReadLine();
        }
    }
}
0

Just use any FFT-based polynomial multiplication algorithm, then clamp all resulting coefficients that are greater than or equal to 1 down to 1.

Example:

 10011 * 10001

 [1 x^4 + 0 x^3 + 0 x^2 + 1 x^1 + 1 x^0] * [1 x^4 + 0 x^3 + 0 x^2 + 0 x^1 + 1 x^0]
 == [1 x^8 + 0 x^7 + 0 x^6 + 1 x^5 + 2 x^4 + 0 x^3 + 0 x^2 + 1 x^1 + 1 x^0]
 -> [1 x^8 + 0 x^7 + 0 x^6 + 1 x^5 + 1 x^4 + 0 x^3 + 0 x^2 + 1 x^1 + 1 x^0]
 -> 100110011

For an example algorithm, check:

http://www.cs.pitt.edu/~kirk/cs1501/animations/FFT.html

BTW, it has quasi-linear complexity, i.e. O(n log n).

See also:

http://everything2.com/title/Multiplication%2520using%2520the%2520Fast%2520Fourier%2520Transform

0
