My own double-precision cos() implementation returns NaN in the shader but works fine on the CPU. What is going wrong?

As the title says, I want to implement my own double-precision cos() function in a GLSL compute shader, because the built-in version only exists for float.

This is my code:

    double faculty[41]; // values are calculated at the beginning of main()

    double myCOS(double x)
    {
        double sum, tempExp, sign;
        sum = 1.0;
        tempExp = 1.0;
        sign = -1.0;
        for (int i = 1; i <= 30; i++)
        {
            tempExp *= x;
            if (i % 2 == 0)
            {
                sum = sum + (sign * (tempExp / faculty[i]));
                sign *= -1.0;
            }
        }
        return sum;
    }

With this code, sum ends up as NaN in the shader, but the algorithm works fine on the CPU. I also debugged the code and found the following:

  • faculty[i] is positive and non-zero for all entries
  • tempExp is positive at every step
  • none of the other variables is NaN at any step
  • sum first becomes NaN at the step with i = 4

So my question is: what exactly can go wrong if every variable is a number and nothing is divided by zero, especially when the algorithm works on the CPU?

c++ shader opengl glsl
2 answers

Let me guess:

First, you have determined that the problem is in the loop, and the loop only uses the following operations: +, * and /.

The rules for generating NaN from these operations are:

  • Divisions 0/0 and ±∞/±∞
  • Multiplications 0×±∞ and ±∞×0
  • Additions ∞ + (−∞), (−∞) + ∞ and equivalent subtractions
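These rules can be checked on the CPU with a small C++ sketch. The helper names divide, multiply and add are made up for this illustration; the operands are passed as parameters so that the compiler does not reject a literal division by zero. It assumes IEEE 754 doubles and no fast-math compiler flags.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helpers: each one performs a single arithmetic operation.
double divide(double a, double b)   { return a / b; }
double multiply(double a, double b) { return a * b; }
double add(double a, double b)      { return a + b; }

const double inf = std::numeric_limits<double>::infinity();

// On an IEEE 754 platform all of these are true:
//   std::isnan(divide(0.0, 0.0))
//   std::isnan(divide(inf, -inf))
//   std::isnan(multiply(0.0, inf))
//   std::isnan(add(inf, -inf))
// An ordinary overflow, by contrast, gives infinity and not NaN:
//   std::isinf(multiply(1e200, 1e200))
```

Note that a check for "is this variable NaN?" does not catch infinity, so a variable can already be ±∞ while still passing such a check.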

You have ruled out 0/0 and ±∞/±∞ by stating that faculty[] is correctly initialized.

The sign variable is always 1.0 or -1.0, so it cannot produce NaN through the * operation.

That leaves the + operation, which can produce NaN if tempExp ever becomes ±∞.

So it is likely that x is too large when it enters your function and tempExp overflows to ±∞, which makes sum equal to ±∞ as well. The next time sum is updated, you trigger a NaN-producing operation: ∞ + (−∞). This happens because you multiply one operand of the addition by sign, which switches between positive and negative from one update to the next.
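The overflow scenario can be reproduced on the CPU with a C++ port of the loop from the question. This is only a sketch: the name seriesCos is made up, and the factorial is accumulated inside the loop instead of being read from the faculty[] table, which is an assumption about how that table is filled.

```cpp
#include <cmath>

// CPU port of the question's loop (hypothetical name).
double seriesCos(double x) {
    double sum = 1.0, tempExp = 1.0, factorial = 1.0, sign = -1.0;
    for (int i = 1; i <= 30; i++) {
        tempExp *= x;      // x^i, overflows to +inf once x^i exceeds the double range
        factorial *= i;    // i!, stands in for faculty[i]
        if (i % 2 == 0) {
            sum += sign * (tempExp / factorial);
            sign *= -1.0;
        }
    }
    return sum;
}
```

For an input such as 1e200, x*x already overflows at i = 2, so sum becomes −∞ there and turns into NaN at i = 4 when +∞ is added. That would be consistent with the step reported in the question, provided the shader really receives such a large x. For a small input such as 0.5 the same function agrees closely with std::cos.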

You are approximating cos(x) with an expansion around 0.0. Therefore, you should use the properties of the cos() function to reduce your input value to somewhere near 0.0, ideally into the range [0, pi/4]. For example, subtract multiples of 2*pi, and obtain the cos() values for [pi/4, pi/2] by computing sin(x) around 0.0, and so on.
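A minimal C++ sketch of that kind of range reduction might look as follows. The names PI, cosSeries, sinSeries and reducedCos are invented for this example, the number of series terms is an arbitrary choice, and a GLSL port would need its own replacements for std::fmod and std::fabs.

```cpp
#include <cmath>

const double PI = 3.14159265358979323846;

// Taylor series of cos around 0, intended for |x| <= pi/4.
double cosSeries(double x) {
    double term = 1.0, sum = 1.0;
    for (int i = 2; i <= 20; i += 2) {
        term *= -x * x / ((i - 1) * i);  // turns x^(i-2)/(i-2)! into -x^i/i!
        sum += term;
    }
    return sum;
}

// Taylor series of sin around 0, intended for |x| <= pi/4.
double sinSeries(double x) {
    double term = x, sum = x;
    for (int i = 3; i <= 21; i += 2) {
        term *= -x * x / ((i - 1) * i);
        sum += term;
    }
    return sum;
}

double reducedCos(double x) {
    x = std::fmod(std::fabs(x), 2.0 * PI);  // cos is even and 2*pi periodic
    if (x > PI) x = 2.0 * PI - x;           // cos(2*pi - x) = cos(x), x now in [0, pi]
    double sign = 1.0;
    if (x > PI / 2.0) {                     // cos(pi - x) = -cos(x), x now in [0, pi/2]
        x = PI - x;
        sign = -1.0;
    }
    if (x <= PI / 4.0) return sign * cosSeries(x);
    return sign * sinSeries(PI / 2.0 - x);  // cos(x) = sin(pi/2 - x)
}
```

Because the series only ever sees arguments of at most pi/4, the powers of x can no longer overflow, so even reducedCos(1e200) returns a finite number. Whether that number is meaningful for such a large input is a separate question of precision.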


What can go dramatically wrong is a loss of precision. cos(x) is usually implemented as a range reduction followed by a dedicated implementation for the range [0, pi/2]. Range reduction uses cos(x+2*pi) = cos(x). But this range reduction is not perfect. For a start, pi cannot be represented exactly in finite-precision arithmetic.

Now, what happens if you try something absurd, like cos(1<<30)? It is possible that the range reduction algorithm introduces an error in x that is greater than 2*pi, in which case the result is meaningless. Returning NaN in such cases is reasonable.
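One way to get a feeling for the magnitudes involved is to look at the gap between neighbouring floating-point values, since no range reduction can recover an angle more finely than the input itself is stored. The C++ sketch below uses a made-up helper named spacing; it illustrates the representation limit only and is not a model of any particular driver's reduction algorithm.

```cpp
#include <cmath>
#include <limits>

// Distance from x to the next representable value above it (hypothetical helper).
float spacing(float x) {
    return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}

double spacing(double x) {
    return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}
```

With IEEE 754 types, spacing(1073741824.0f) (that is, 1<<30 as a float) is 128, far more than 2*pi, so a float near that value does not pin down an angle at all. The same value as a double has a spacing of 2^-22, which is still fine, but around 1e17 the spacing of a double reaches 16 and the same problem appears.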
