I need a fast 96-bit by 64-bit division algorithm for a fixed-point math library

I am currently writing a 32.32 fixed-point math library. I managed to get addition, subtraction and multiplication working correctly, but I am completely stuck on division.

A small reminder for those who don’t remember: a 32.32 fixed-point number has 32 bits for the integer part and 32 bits for the fractional part.

The best algorithm I've come across requires 96-bit integer division, which compilers usually have no built-in support for.

Anyway, here goes:

Let G = 2^32. Notation: x is a 64-bit fixed-point number, x1 is its low 32-bit half and x2 is its high 32-bit half, so x = x1 + x2*G.

    G*(a/b) = ((a1 + a2*G) / (b1 + b2*G))*G      // Decompose this
    G*(a/b) = (a1*G) / (b1 + b2*G) + (a2*G*G) / (b1 + b2*G)

As you can see, (a2*G*G) is guaranteed to overflow a regular 64-bit integer whenever a2 is nonzero. If uint128_t were actually supported by my compiler, I would just do the following:

    (((uint128_t)x << 32) / y)

Well, that is not the case, and I need a solution. Thank you for your help.
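For reference, here is a minimal sketch of that one-liner on compilers that do offer a 128-bit type. It assumes the `unsigned __int128` extension that GCC and Clang provide on 64-bit targets, and the name `fix_udiv_wide` is made up for illustration. It can serve as an oracle when testing a portable implementation.

```c
#include <stdint.h>

/* Unsigned 32.32 divide using a 128-bit intermediate.
   Assumes a compiler with the unsigned __int128 extension.
   The caller must ensure y != 0 and that the quotient fits in
   64 bits, i.e. (x >> 32) < y. */
uint64_t fix_udiv_wide(uint64_t x, uint64_t y)
{
    unsigned __int128 wide = (unsigned __int128)x << 32; /* 96-bit dividend */
    return (uint64_t)(wide / y);
}
```

For example, dividing 1.0 (raw `1 << 32`) by 2.0 (raw `2 << 32`) yields the raw value `0x80000000`, which is 0.5.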

+7
performance c algorithm division
4 answers

You can decompose a larger division into several smaller divisions that each operate on fewer bits. As another poster has already mentioned, the algorithm can be found in Knuth's TAOCP.

However, no need to buy a book!

The Hacker's Delight website has code that implements the algorithm in C. It is written for 64-bit unsigned divisions using only 32-bit arithmetic, so you cannot copy and paste the code directly. To get from 64 to 128 bits, you need to widen all types, masks and constants by a factor of two, e.g. a short becomes an int, 0xffff becomes 0xffffffffll, etc.

After this simple change, you can do 128-bit divisions.

The code is here: http://www.hackersdelight.org/HDcode/divlu.c (it may look badly formatted in a web browser because of the line endings. Save the code and open it with Notepad or so).

Since your largest values only require 96 bits, one of the 64-bit divisions will always return zero, so you can even simplify the code a bit.
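As a rough sketch of what the widened routine might look like: the code below follows the general structure of Knuth's long division with two 32-bit digits, in the spirit of divlu.c, but it is not a verbatim copy of that file. The names `nlz64`, `udiv128by64` and `fix_udiv_knuth` are assumptions made for illustration, and returning `UINT64_MAX` on overflow is just one possible convention.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of leading zero bits; v must be nonzero. */
static int nlz64(uint64_t v)
{
    int n = 0;
    while ((v & ((uint64_t)1 << 63)) == 0) { v <<= 1; n++; }
    return n;
}

/* Divides the 128-bit value u1:u0 by v using two 32-bit quotient digits.
   Returns UINT64_MAX when the quotient would not fit (u1 >= v, which
   also covers v == 0). The remainder is stored in *rem if rem != NULL. */
uint64_t udiv128by64(uint64_t u1, uint64_t u0, uint64_t v, uint64_t *rem)
{
    const uint64_t b = (uint64_t)1 << 32;      /* digit base */
    uint64_t un64, un10, un21, un1, un0, vn1, vn0, q1, q0, rhat;
    int s;

    if (u1 >= v) return UINT64_MAX;

    s = nlz64(v);                               /* normalize the divisor */
    if (s > 0) {
        v <<= s;
        un64 = (u1 << s) | (u0 >> (64 - s));
        un10 = u0 << s;
    } else {
        un64 = u1;
        un10 = u0;
    }
    vn1 = v >> 32;      vn0 = v & 0xFFFFFFFFu;
    un1 = un10 >> 32;   un0 = un10 & 0xFFFFFFFFu;

    q1 = un64 / vn1;                            /* estimate high digit */
    rhat = un64 - q1 * vn1;
    while (q1 >= b || q1 * vn0 > b * rhat + un1) {
        q1--;
        rhat += vn1;
        if (rhat >= b) break;
    }
    un21 = un64 * b + un1 - q1 * v;             /* wraps modulo 2^64 by design */

    q0 = un21 / vn1;                            /* estimate low digit */
    rhat = un21 - q0 * vn1;
    while (q0 >= b || q0 * vn0 > b * rhat + un0) {
        q0--;
        rhat += vn1;
        if (rhat >= b) break;
    }
    if (rem != NULL) *rem = (un21 * b + un0 - q0 * v) >> s;
    return q1 * b + q0;
}

/* Unsigned 32.32 divide: the dividend x << 32 is passed as two halves. */
uint64_t fix_udiv_knuth(uint64_t x, uint64_t y)
{
    return udiv128by64(x >> 32, x << 32, y, NULL);
}
```

Because the dividend never exceeds 96 bits, `x >> 32` is always below 2^32, which is where the simplification mentioned above would come from.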

Oh, and before I forget: the code only works with unsigned values. To convert from signed to unsigned division, you can do something like this (pseudo-code style):

    fixpoint Divide (fixpoint a, fixpoint b)
    {
        // check if the integers are of different sign:
        fixpoint sign_difference = a ^ b;

        // do unsigned division:
        fixpoint x = unsigned_divide (abs(a), abs(b));

        // if the signs have been different: negate the result.
        if (sign_difference < 0)
        {
            x = -x;
        }

        return x;
    }

The site itself deserves attention: http://www.hackersdelight.org/

Hope this helps.

Btw, that's a nice task you are working on. Do you mind telling us what you need a fixed-point library for?


Btw, the usual shift-and-subtract algorithm for division will work as well.

If you are targeting x86, you can implement it using MMX or SSE. The algorithm relies only on primitive operations, so it can execute pretty quickly.
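A minimal sketch of the shift-and-subtract approach in plain C, combined with the sign handling from the pseudo-code above. The function names and the overflow flag are assumptions made for illustration; the signed wrapper saturates on overflow, which is only one of several reasonable policies.

```c
#include <stdint.h>

/* Unsigned 32.32 divide: floor((x * 2^32) / y) by restoring division.
   Sets *overflow to 1 when y == 0 or the quotient needs more than 64 bits. */
uint64_t fix_udiv_shift(uint64_t x, uint64_t y, int *overflow)
{
    uint64_t rem = x >> 32;   /* high 64 bits of the dividend x << 32 */
    uint64_t lo  = x << 32;   /* low 64 bits of the dividend */
    uint64_t quo = 0;
    int i;

    if (y == 0 || rem >= y) { *overflow = 1; return UINT64_MAX; }
    *overflow = 0;

    for (i = 0; i < 64; i++) {
        uint64_t carry = rem >> 63;        /* bit about to be shifted out */
        rem = (rem << 1) | (lo >> 63);     /* bring down the next dividend bit */
        lo <<= 1;
        quo <<= 1;
        if (carry || rem >= y) {
            rem -= y;                      /* wraps to the right value if carry was set */
            quo |= 1;
        }
    }
    return quo;
}

/* Signed wrapper: divide the magnitudes, then restore the sign.
   The result is truncated toward zero and saturates on overflow. */
int64_t fix_sdiv_shift(int64_t a, int64_t b, int *overflow)
{
    /* Negate in unsigned arithmetic so that INT64_MIN is handled too. */
    uint64_t ua = a < 0 ? (uint64_t)0 - (uint64_t)a : (uint64_t)a;
    uint64_t ub = b < 0 ? (uint64_t)0 - (uint64_t)b : (uint64_t)b;
    int negative = (a < 0) != (b < 0);
    uint64_t q = fix_udiv_shift(ua, ub, overflow);
    uint64_t limit = (uint64_t)INT64_MAX + (negative ? 1u : 0u);

    if (*overflow || q > limit) {
        *overflow = 1;
        return negative ? INT64_MIN : INT64_MAX;
    }
    if (!negative) return (int64_t)q;
    return q == limit ? INT64_MIN : -(int64_t)q;
}
```

The loop always runs 64 iterations, so its cost is predictable; the digit-based long division is likely to be faster on most hardware because it replaces the loop with a handful of native divides.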

+7

Better, self-adjusting answer:
Forgive the C#-isms in the answer, but the following should work in all cases. There is probably a way to find the right shift amounts faster, but that would take more thought than I can give it right now. This should be reasonably efficient, though:

    int upshift = 32;
    ulong mask = 0xFFFFFFFF00000000;
    ulong mod = x % y;
    while ((mod & mask) != 0)
    {
        // Current upshift of the remainder would overflow... so adjust
        y >>= 1;
        mask <<= 1;
        upshift--;
        mod = x % y;
    }
    ulong div = ((x / y) << upshift) + (mod << upshift) / y;

A simple but unsafe answer:
This calculation can overflow when shifting x % y up if that remainder has any bits set in its high 32 bits, which gives an incorrect answer.

 ((x / y) << 32) + ((x % y) << 32) / y 

The first part uses integer division and gives you the high bits of the answer (shifted back up).

The second part computes the low bits from the remainder of that first division (the part that could not be divided any further), which is shifted up and then divided.
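To make the failure condition concrete, here is a small sketch of the simple formula with an explicit safety check in front of it. The name `fix_udiv_simple` and the return-code convention are made up for illustration.

```c
#include <stdint.h>

/* Two-part 32.32 divide using only 64-bit operations.
   Returns 1 and stores the exact quotient in *out when both shifts are
   safe; returns 0 (leaving *out untouched) when y == 0 or a shift
   would lose bits. */
int fix_udiv_simple(uint64_t x, uint64_t y, uint64_t *out)
{
    uint64_t whole, rem;

    if (y == 0) return 0;
    whole = x / y;   /* high bits of the answer */
    rem   = x % y;   /* part that could not be divided yet */

    /* Either shift by 32 overflows if the value has high bits set. */
    if ((whole >> 32) != 0 || (rem >> 32) != 0) return 0;

    *out = (whole << 32) + (rem << 32) / y;
    return 1;
}
```

Since the remainder is always smaller than y, the check can only pass for every x when y is below 2^32, that is, when the divisor is less than 1.0. Something as ordinary as 1.0 / 2.0 leaves a remainder of exactly 2^32 and is rejected.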

+1

Quick-n-dirty.

Divide A / B in double-precision floating point. This gives you C ~= A / B. It is only approximate because of floating-point precision and the 53-bit mantissa.

Round C to a number representable in your fixed-point system.

Now compute (again in fixed point) D = A - C * B. This should have a significantly smaller magnitude than A.

Repeat, now computing D / B in floating point. Again, round the answer to a representable fixed-point value. Add the results of each division together as you go. You can stop when your remainder is so small that your floating-point divide returns 0 after rounding.

You are still not done. Now you are very close to the answer, but the individual divisions were not exact. To finish, you will have to do a binary search. Starting from the (very good) estimate, see whether incrementing it reduces the error. You basically want to bracket the correct answer and keep halving the range with new tests.

Yes, you could use Newton iteration here, but binary search will most likely be simpler, since you only need simple multiplies and adds using your existing 32.32 tools.

This is not the most efficient method, but it is the easiest to code.
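A rough sketch of this idea in C follows. It is a simplified variant rather than a literal transcription of the steps above: each floating-point estimate is deliberately scaled down slightly so that it can never overshoot, which keeps the remainder non-negative and lets a short linear fix-up replace the binary search. The helper `mul_64x64`, the name `fix_udiv_estimate` and the scale factor are assumptions made for illustration.

```c
#include <stdint.h>

/* Full 64x64 -> 128-bit product, built from 32-bit halves. */
static void mul_64x64(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    uint64_t a0 = (uint32_t)a, a1 = a >> 32;
    uint64_t b0 = (uint32_t)b, b1 = b >> 32;
    uint64_t p00 = a0 * b0, p01 = a0 * b1, p10 = a1 * b0, p11 = a1 * b1;
    uint64_t mid = (p00 >> 32) + (uint32_t)p01 + (uint32_t)p10;
    *lo = (mid << 32) | (uint32_t)p00;
    *hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
}

/* Unsigned 32.32 divide driven by double-precision estimates.
   Returns UINT64_MAX when y == 0 or the quotient does not fit. */
uint64_t fix_udiv_estimate(uint64_t x, uint64_t y)
{
    uint64_t rhi = x >> 32, rlo = x << 32;  /* remainder, starts as x << 32 */
    uint64_t q = 0;

    if (y == 0 || rhi >= y) return UINT64_MAX;

    for (;;) {
        /* Approximate remainder / y, nudged down so it never overshoots. */
        double r = (double)rhi * 0x1p64 + (double)rlo;
        double est = (r / (double)y) * (1.0 - 0x1p-50);
        uint64_t step = (uint64_t)est;
        uint64_t phi, plo;

        if (step == 0) break;
        mul_64x64(step, y, &phi, &plo);     /* exact step * y */
        rhi = rhi - phi - (rlo < plo);      /* 128-bit subtract with borrow */
        rlo -= plo;
        q += step;
    }
    /* The estimate is now within a unit or two; finish exactly. */
    while (rhi != 0 || rlo >= y) {
        rhi -= (rlo < y);
        rlo -= y;
        q++;
    }
    return q;
}
```

The first estimate typically gets all but the lowest bits of the quotient right, so the outer loop usually runs only two or three times.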

+1

I like Niels' answer, which is probably the best. It is just long division, as we all learned in elementary school, except that the digits are base 2^32 instead of base 10.

However, you can also consider the Newton approximation method for division:

    x := x * (2 - D * x)

which converges to 1/D; the quotient is then N * x, where N is the numerator and D is the denominator.

It uses only the multiplies and adds that you already have, and it converges very quickly to about 1 ULP of accuracy. On the other hand, you will not be able to get an exact 0.5-ULP answer in all cases.

In any case, the tricky bit is detecting and handling overflows.
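To illustrate how fast the iteration converges, here is a small sketch that uses the textbook Newton-Raphson reciprocal form x := x * (2 - D * x) and then multiplies by the numerator. It works in `double` to sidestep the overflow handling that a real 32.32 implementation would need; the name `newton_divide` and its parameters are assumptions made for illustration.

```c
/* Approximates n / d by refining a reciprocal estimate x0 of 1 / d.
   The relative error roughly squares on every iteration, provided
   the initial guess satisfies 0 < d * x0 < 2. */
double newton_divide(double n, double d, double x0, int iterations)
{
    double x = x0;
    int i;

    for (i = 0; i < iterations; i++) {
        x = x * (2.0 - d * x);   /* only multiplies and a subtract */
    }
    return n * x;
}
```

Starting from the guess 0.3 for 1/3, the error term 1 - d*x goes 0.1, 0.01, 0.0001, and so on, so a handful of iterations reaches the limit of the representation.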

0