Multiplication of two 128-bit ints

I am trying to propagate two 128 bit integers in C.

Here is my algorithm:

Divide the two 128-bit sequences into S1 and S2.

Then we divide S1 into S11 (upper / upper half) and S12 (rear / lower half) and divide S2 into S21 (front / upper half) and S22 (rear / lower half).

Multiply S12 by S22 ... = S1222.

Multiply S11 by S21 ... = S1121 and then bit shift it by multiplying it by 2 ^ 128

Combine S1222 and S1121 as the front and back halves of the new array. Let me call it "Array1". The length of the new array is two times longer than S1.

Then we must multiply S12 by S21 and S11 by S22. I multiplied these two to get S1221 and S1122, respectively (and the bit shifted them accordingly). Now I have to add them to Array1. This is the part where I ask for help. I am not sure how to add them one by one in Array1. Keep in mind that there may be a carryover of 1, since you are gradually moving from 3/4 from Array1 to 1/4 of Array1, as this is the range to which you add S1221 and S1122.

My question is: How to add dstM1 and dstM2 to an array d that is already full?

+6
source share
3 answers

To summarize your question: how can you add two unsigned arrays of integers that propagate hyphenation.

uint16_t foo[4]; // 0000 aaaa FFFF cccc uint16_t bar[4]; // dddd eeee FFFF 0000 

Well, FFFF + FFFF + 1 is just (1) FFFF. Thus, hyphenation can always be added to each word without additional hyphenation (as if the sum could be 20,000).

Creating a temporary amount: sum = foo[3] + bar[3] + carry; with the transfer, initially 0, or this amount creates a new transfer, or not.

  • Carry is made from (A + B) if (A+B) < A
  • When summing (A + B + c), the transfer is performed if ((A + c) < A) || (((A + c) + B) < B) ((A + c) < A) || (((A + c) + B) < B)

Another possibility is to compute "multi-bit portability" by summing several members in columns, which is often found in bignom multiplications:

  AAAA BBBB CCCC DDDD EEEE FFFF .... GGGG HHHH IIII .... .... -------------------------- col4 col3 col2 col1 col0 

Now each column creates a 32-bit or 64-bit result and a carry, which does not necessarily correspond to one bit.

 uint32_t sum_low = carry_in_from_previous_column; uint32_t sum_high = 0; for (i = 0; i < N; i++) { sum_low += matrix[i][column] & 0xffff; sum_high += matrix[i][column] >> 16; } sum_high += sum_low >> 16; // add the multibit half carry result = (sum_low & 0xffff) | (sum_high << 16); carry_out = sum_high >> 16; 
+1
source

If you use gcc or clang, you can directly use __int128 and unsigned __int128 .

+4
source

You are stuck in an infinite loop because i += 1/32 same as i += 0 .

Also: note: memcpy(&d[3l/2-i], dstM1, 1/8); is memcpy(&d[1-i], dstM1, 0);

+3
source

All Articles