Our own implementation of double-precision cos() returns NaN in the shader, but works fine on the CPU. What is going wrong?

Question


As the title says, I want to implement my own double-precision cos() function in a compute shader with GLSL, because the built-in version only exists for float.

This is my code:

    double faculty[41]; // values are calculated at the beginning of main()

    double myCOS(double x)
    {
        double sum, tempExp, sign;
        sum = 1.0;
        tempExp = 1.0;
        sign = -1.0;

        for (int i = 1; i <= 30; i++)
        {
            tempExp *= x;
            if (i % 2 == 0) {
                sum = sum + (sign * (tempExp / faculty[i]));
                sign *= -1.0;
            }
        }
        return sum;
    }

With this code, sum ends up as NaN in the shader, but the same algorithm works fine on the CPU. I also tried to debug the code and found the following:

faculty[i] is positive and non-zero for all entries
tempExp is positive at every step
none of the other variables is NaN at any step
sum first becomes NaN at the step with i = 4

So my question is: what exactly can go wrong if every variable is a number and nothing is divided by zero, especially when the algorithm works on the CPU?


c++ shader opengl glsl

DanceIgel Mar 05 '15 at 11:50


2 answers

fjardon · Answer 1 · 2015-11-13T13:14:41+0000

Let me guess:

First, you have established that the problem is in the loop, and the loop only uses the following operations: +, *, /.

The rules for generating NaN from these operations are:

Divisions 0/0 and ±∞/±∞
Multiplications 0×±∞ and ±∞×0
Additions ∞ + (−∞) , (−∞) + ∞ and equivalent subtractions
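These rules are easy to check on the CPU. The following is only a small C++ sketch, assuming IEEE 754 doubles (the usual case on desktop CPUs); the helper name checkNaNRules is made up for this illustration and is not from the question.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: asserts that each NaN-generating case from the list
// really yields NaN, and that an ordinary overflow-style value does not.
void checkNaNRules() {
    const double inf = std::numeric_limits<double>::infinity();
    // volatile keeps the compiler from folding the expressions at compile time.
    volatile double zero = 0.0;
    volatile double posInf = inf;
    volatile double negInf = -inf;

    assert(std::isnan(zero / zero));       // division 0/0
    assert(std::isnan(posInf / negInf));   // division ±∞/±∞
    assert(std::isnan(zero * posInf));     // multiplication 0×±∞
    assert(std::isnan(posInf + negInf));   // addition ∞ + (−∞)

    // By contrast, ±1.0 times infinity is still an infinity, not NaN.
    assert(std::isinf(-1.0 * posInf));
}
```

Note that the asserts disappear if the code is compiled with NDEBUG defined.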

You have ruled out 0/0 and ±∞/±∞ by stating that faculty[] is correctly initialized.

The sign variable is always 1.0 or -1.0, so it cannot produce a NaN through the * operation.

That leaves the + operation, which can produce a NaN if tempExp ever becomes ±∞.

So it is likely that tempExp grows too large inside your function and becomes ±∞, which makes sum equal to ±∞ as well. At the next iteration that updates sum, you trigger a NaN-generating operation: ∞ + (−∞). This happens because you multiply one side of the addition by sign, and sign switches between positive and negative at each of those iterations.
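To see this mechanism in isolation, here is a CPU-side C++ sketch of the loop from the question, assuming IEEE 754 doubles. The function name firstNaNStep and the test input 1e200 are my own choices for illustration; this does not show what the GPU actually receives as x, only that an overflowing tempExp produces exactly this kind of NaN.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: runs the loop from the question and returns the first
// index i at which sum becomes NaN, or -1 if sum never becomes NaN.
int firstNaNStep(double x) {
    // Factorial table, filled here instead of in main() to stay self-contained.
    double faculty[31];
    faculty[0] = 1.0;
    for (int i = 1; i <= 30; ++i) {
        faculty[i] = faculty[i - 1] * i;
    }

    double sum = 1.0;
    double tempExp = 1.0;
    double sign = -1.0;
    for (int i = 1; i <= 30; ++i) {
        tempExp *= x;  // overflows to +inf once x^i exceeds the double range
        if (i % 2 == 0) {
            sum = sum + (sign * (tempExp / faculty[i]));
            sign *= -1.0;
            if (std::isnan(sum)) {
                return i;
            }
        }
    }
    return -1;
}
```

With x = 1e200, tempExp overflows at i = 2, sum becomes −∞ there, and the addition at i = 4 is −∞ + ∞, so the function returns 4. For a small input such as 1.0 it returns -1.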

You are approximating cos(x) with a series around 0.0. Therefore you should use the properties of the cos() function to reduce your input value to something close to 0.0, ideally into the range [0, pi/4]. For example, remove the multiples of 2*pi, and get the values of cos() in [pi/4, pi/2] by computing sin(x) around 0.0, and so on.
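A rough C++ sketch of that idea, under a few assumptions of mine: it reduces only to [0, pi/2] using the symmetries of cos() rather than going all the way to [0, pi/4] with a separate sin() series, it uses std::fmod for the 2*pi step, and the names cosTaylor and myCosReduced are invented for this example. It also builds each term from the previous one, so no factorial table and no large powers of x are needed.

```cpp
#include <cassert>
#include <cmath>

// Taylor series of cos around 0, intended for arguments in [0, pi/2].
double cosTaylor(double x) {
    const double x2 = x * x;
    double term = 1.0;  // current term, starts as x^0 / 0!
    double sum = 1.0;
    for (int k = 1; k <= 15; ++k) {
        // Turn x^(2k-2)/(2k-2)! into x^(2k)/(2k)! and flip the sign.
        term *= -x2 / ((2.0 * k - 1.0) * (2.0 * k));
        sum += term;
    }
    return sum;
}

// cos(x) with the argument reduced to [0, pi/2] before the series is used.
double myCosReduced(double x) {
    const double pi = 3.14159265358979323846;
    // cos is even and 2*pi periodic: map x into [0, 2*pi).
    double r = std::fmod(std::fabs(x), 2.0 * pi);
    // cos(2*pi - r) == cos(r): map into [0, pi].
    if (r > pi) {
        r = 2.0 * pi - r;
    }
    // cos(r) == -cos(pi - r): map into [0, pi/2].
    if (r > pi / 2.0) {
        return -cosTaylor(pi - r);
    }
    return cosTaylor(r);
}
```

Because the series argument never exceeds pi/2, the terms shrink quickly and nothing overflows, even for a huge x. The result for a huge x is finite but not necessarily accurate, since the reduction itself loses precision.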

Msalters · Answer 2 · 2015-11-13T13:30:01+0000

What can go dramatically wrong is a loss of accuracy. cos(x) is usually implemented by range reduction, followed by a dedicated implementation for the range [0, pi/2]. Range reduction uses cos(x+2*pi) = cos(x). But this range reduction is not perfect. For a start, pi cannot be represented exactly in finite-precision arithmetic.

Now, what happens if you try something absurd, like cos(1<<30)? It is possible that the range reduction algorithm introduces an error in x that is greater than 2*pi, in which case the result is meaningless. Returning NaN in such cases is reasonable.
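To put a number on that, here is a back-of-the-envelope C++ sketch. It assumes a naive reduction that subtracts whole multiples of a single-precision 2*pi constant, and it treats the double-precision constant as the reference value; the function name reductionError is made up for this example, and real implementations may reduce the argument quite differently.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical estimate: absolute error (in radians) accumulated when x is
// reduced by subtracting whole periods of a float 2*pi instead of a double one.
double reductionError(double x) {
    const double twoPiDouble = 2.0 * 3.14159265358979323846;
    const float twoPiFloat = 2.0f * 3.14159265f;
    // Number of whole periods that are removed from x.
    const double periods = std::floor(std::fabs(x) / twoPiDouble);
    // Each removed period contributes the representation error of the constant.
    const double perPeriod =
        std::fabs(static_cast<double>(twoPiFloat) - twoPiDouble);
    return periods * perPeriod;
}
```

For x = 1<<30 (1073741824) this estimate comes out at several full periods, so the reduced argument says nothing about the true angle. For an x below 2*pi no period is removed and the estimate is 0.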

