Fast multiplication and subtraction modulo simple

I need to optimize some code, where I multiply the vector ints (32 bits) by scalar modulo p (where p is a prime number (2 ^ 32) -5), and then subtract this vector from another vector modulo p,

The code is as follows:

public static void multiplyAndSubtract(long fragmentCoefficient, long[] equationToSubtractFrom, long[] equationToSubtract) { for (int i = 0; i < equationToSubtractFrom.length; i++) { equationToSubtractFrom[i] = modP(equationToSubtractFrom[i] - multiplyModP(fragmentCoefficient, equationToSubtract[i])); } } 

I use longs because Java does not support unsigned integers, but both vectors are mod p, so you can expect each number to be 0 <= x <(2 ^ 32) -5

Any ideas to optimize this? The mod p operation is what takes most of the execution time, so one way to optimize it may somehow not do modP after multiplication and only do it after subtraction. Any ideas on how to do this?

+8
java performance optimization primes modulo
source share
3 answers

You can speed up the calculation and avoid any division at all, using the fact that 2 ^ 32 = 5 (mod p).

After multiplying and subtracting, we divide the result into low (x% 2 ^ 32) and hi (x / 2 ^ 32) parts. Then multiply hi by 5 and add to the bottom. Then repeat this procedure again. If the result is greater than p, subtract p. For a negative result, add p.

Edit: since combined multiplication and subtraction can overflow, the result of the multiplication must also be accepted modulo p. But just one step of the above procedure is enough: just split, multiply by 5 and add.

+4
source share
 e - (f * e mod p) mod p = (ee f) mod p 

See Wolfram Alpha .

+6
source share

I know two ways to do this without using a division or module:

Method 1: Invariant Multiplication ( see this article )

The main idea here is to pre-calculate and approximate the inverse function p , which allows you to perform integer division using only a pair of integer multiplications. Then you can multiply and get your module. This is a simpler implementation approach.

Method 2: (the one that I usually use) is to use floating point. Convert the numbers to a floating point number and multiply by the pre-calculated p value. Then round and convert back to an integer. This approach is more difficult to achieve, but in my experience it is faster if everything is done correctly.

Both approaches here do not include any divisions, except for the preliminary calculation of the inverse of either integer or floating point.

Whether any of these methods are faster than using % directly will depend on how well you can implement them. Therefore, I can not guarantee that one of them will be faster.

+1
source share

All Articles