C++ exp LUT (lookup table)

In the C++ simulation I am writing, I traced the bottleneck in my program with Valgrind to cmath::exp. It currently consumes over 40% of my simulation time. I can constrain the input to a relatively small domain, but I would like to control the accuracy. I was thinking of switching from exp to a LUT (lookup table), but I am not quite sure how to do it the "right way" (tm). My concerns are:

  • Large lookup tables do not stay in cache, which slows down access.
  • What is the best way to convert the floating-point input to an integer index into the lookup table?
  • Does the answer to (2) depend on the slope of the input function?
  • Am I reinventing the wheel? Has this been done before?

What is the best way to implement a LUT for exp (or pull one in from a library)?

+7
3 answers

A very similar question was asked earlier. Here is my answer:

This approach was proposed by the author of that question, and I was able to implement it effectively (a small lookup table and minimal extra work after the lookup). It is in C#, but translating it to C++ should be simple, and I can help if you run into difficulties.

0
  • The optimal size of the lookup table is determined by the trade-off between speed, accuracy, and implementation complexity. You will have to profile; we cannot tell you the answer, because we do not know it.

  • Use lrint from <math.h> to convert a double to a long int. I am not sure whether it is also in <cmath>. (A small indexing sketch follows this list.)

  • I am not sure the slope has anything to do with converting floating-point numbers to integers. Could you tell us more about your concern?

  • Yes, you are reinventing the wheel. What you are doing has been done over and over by everyone who has ever implemented a math library. There is a lot of literature on the subject.
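
As a rough sketch of the indexing mentioned in the second point above, assuming purely for illustration a domain of [0, 8) and a 4096-entry uniform table (the names exp_lut, kLo, kHi, and kTableSize are invented for this sketch):

    #include <cmath>
    #include <vector>

    // Hypothetical uniform table of exp over an assumed domain [kLo, kHi).
    constexpr double kLo = 0.0, kHi = 8.0;   // assumed domain
    constexpr long   kTableSize = 4096;      // assumed size; profile to tune

    std::vector<double> build_exp_table()
    {
        std::vector<double> t(kTableSize);
        for (long i = 0; i < kTableSize; ++i)
            t[i] = std::exp(kLo + (kHi - kLo) * i / kTableSize);
        return t;
    }

    double exp_lut(double x, const std::vector<double>& t)
    {
        // std::lrint rounds to the nearest index; no interpolation is done here
        long i = std::lrint((x - kLo) / (kHi - kLo) * kTableSize);
        if (i < 0) i = 0;                    // clamp inputs outside the assumed domain
        if (i >= kTableSize) i = kTableSize - 1;
        return t[i];
    }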

A direct lookup table is far from optimal. You will want to use some kind of polynomial approximation, possibly piecewise, with coefficients selected from a lookup table. For a function as smooth and predictable as exp, a polynomial will give you much higher accuracy for the same amount of computational effort. The polynomials you need will depend on the trade-off between complexity and accuracy, and on whether you want to minimize the expected error, minimize the maximum error, or use some other loss function.
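
A minimal sketch of that idea, with arbitrary choices (32 segments, a cubic Taylor term, a reduced argument in [0, 1); none of these names or numbers come from the answer):

    #include <array>
    #include <cmath>

    // Sketch: piecewise approximation of exp on [0, 1) from a small table of
    // anchor values plus a cubic polynomial for the residual.
    constexpr int    kSegments = 32;              // arbitrary segment count
    constexpr double kStep     = 1.0 / kSegments;

    std::array<double, kSegments> make_anchors()
    {
        std::array<double, kSegments> a{};
        for (int i = 0; i < kSegments; ++i)
            a[i] = std::exp(i * kStep);           // exp at the left edge of each segment
        return a;
    }

    // exp(x) = exp(anchor) * exp(r), where r = x - anchor lies in [0, kStep)
    double exp01(double x, const std::array<double, kSegments>& anchors)
    {
        int    i = static_cast<int>(x * kSegments);   // segment index
        double r = x - i * kStep;                     // small residual
        // 3rd-order Taylor expansion of exp(r) around 0
        double poly = 1.0 + r * (1.0 + r * (0.5 + r / 6.0));
        return anchors[i] * poly;
    }

Minimax or least-squares coefficients per segment, stored in the same table, would trade the Taylor expansion's simplicity for a lower maximum or expected error, which is exactly the loss-function choice mentioned above.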

Limiting the domain of exp does not buy you much, since it is so easy to extend a limited-domain routine to the entire domain:

    // only works for x in [0, 1]
    double exp2_limited(double x);

    // works for all x, but does not handle overflow properly
    double exp2(double x)
    {
        double ipart = floor(x);                  // integer part of x
        // x - ipart (rather than fmod(x, 1.0)) keeps the argument in [0, 1)
        // even for negative x
        return scalbn(exp2_limited(x - ipart), (long) ipart);
    }
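
The natural exponential then reduces to the same routine; a hedged sketch, where my_exp2 stands in for the wrapper above (renamed here only to avoid clashing with the standard exp2):

    #include <cmath>

    // Hypothetical follow-up: with the range-reduced 2^x routine available as,
    // say, my_exp2, the natural exponential follows from e^x = 2^(x * log2(e)).
    double my_exp2(double x);                       // assumed: the wrapper above, renamed

    constexpr double kLog2E = 1.4426950408889634;   // log2(e)

    double my_exp(double x)
    {
        return my_exp2(x * kLog2E);
    }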

Summary:

  • You must know the required accuracy before you can create such a function.

  • You also need to know the loss function (that is, you have to choose one).

  • You need to profile before you know how fast it is.

  • Use polynomials.

+1

I have had this problem, and I took a few stack samples to diagnose it. What that does is show where the calls are coming from and what the argument values are. I found that when exp was called from certain places, the argument value was highly repeatable.

This suggested a memoization approach that made a huge difference.

All it needs is a simple wrapper function:

    double exp_cached(double arg, double* old_arg, double* old_result)
    {
        if (arg == *old_arg)
            return *old_result;
        *old_arg = arg;
        *old_result = exp(arg);
        return *old_result;
    }

and wherever exp(foo) is called, do:

    static double old_arg = -999999999, old_result;
    ...
    ... exp_cached(foo, &old_arg, &old_result) ...

That way, exp is not called when the argument at a given call site has the same value as it had on the previous call.
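
If threading the two pointers through every call site is awkward, one possible C++ restatement of the same single-entry cache is a small function object (a sketch, not the answerer's code; the name CachedExp is made up):

    #include <cmath>

    // Sketch: the same single-entry cache, wrapped in a function object so the
    // cached state lives with the call site that owns it.
    struct CachedExp {
        double old_arg    = -999999999.0;   // sentinel, unlikely to be a real input
        double old_result = 0.0;

        double operator()(double arg) {
            if (arg != old_arg) {
                old_arg    = arg;
                old_result = std::exp(arg);
            }
            return old_result;
        }
    };

    // Usage: give each hot call site its own cache.
    // static CachedExp cached_exp;
    // double y = cached_exp(foo);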

+1
