Trying to understand the math behind the perspective matrix in WebGL

Question

All matrix libraries for WebGL have some kind of perspective function that you call to get the perspective matrix for the scene.
For example, the perspective method in the mat4.js file, which is part of gl-matrix, is coded as follows:

 mat4.perspective = function (out, fovy, aspect, near, far) {
     var f = 1.0 / Math.tan(fovy / 2),
         nf = 1 / (near - far);
     out[0] = f / aspect;
     out[1] = 0;
     out[2] = 0;
     out[3] = 0;
     out[4] = 0;
     out[5] = f;
     out[6] = 0;
     out[7] = 0;
     out[8] = 0;
     out[9] = 0;
     out[10] = (far + near) * nf;
     out[11] = -1;
     out[12] = 0;
     out[13] = 0;
     out[14] = (2 * far * near) * nf;
     out[15] = 0;
     return out;
 };

I'm really trying to understand what the math in this method is actually doing, but I'm getting tripped up on several points.

To begin with, if we have a canvas like the following, with a 4:3 aspect ratio, then the aspect parameter of the method would in fact be 4 / 3, correct?

[Image: canvas with a 4:3 aspect ratio]

I've also noticed that 45° seems to be a common field of view. If so, then the fovy parameter would be π / 4 radians, correct?

With all that said, what is the f variable in the method short for, and what is its purpose?
I tried to picture the actual scenario, and I imagined something like the following:

[Image: side view of the perspective frustum in a 3D scene]

Thinking of it this way, I can understand why you divide fovy by 2 and also why you take the tangent of that ratio, but why is the inverse of that stored in f? Again, I'm having a lot of trouble understanding what f really represents.

Next, I get the concept of near and far being the clipping points along the z axis, so that's fine, but if I use the numbers in the picture above (i.e., π / 4, 4 / 3, 10 and 100) and plug them into the perspective method, then I end up with the following matrix:

[Image: the resulting perspective matrix]

Where f is:

[Image: the value of f]
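
For reference, the values can be reproduced numerically. The sketch below copies the formula from the gl-matrix function quoted earlier into a standalone function (the name `perspective` and the plain-array return value are assumptions for illustration, not the library's API) and plugs in π / 4, 4 / 3, 10 and 100:

```javascript
// Standalone copy of the formula quoted above (not the gl-matrix API itself).
function perspective(fovy, aspect, near, far) {
  const f = 1.0 / Math.tan(fovy / 2);
  const nf = 1 / (near - far);
  // Same element order as the out[0]..out[15] assignments above.
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (far + near) * nf, -1,
    0, 0, 2 * far * near * nf, 0,
  ];
}

const m = perspective(Math.PI / 4, 4 / 3, 10, 100);
console.log(m[5]);  // f = 1 / tan(π / 8), about 2.414
console.log(m[0]);  // f / (4 / 3), about 1.811
console.log(m[10]); // 110 / -90, about -1.222
console.log(m[14]); // 2000 / -90, about -22.222
```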

So, I still have the following questions:

What is f?
What does the value assigned to out[10] (i.e., 110 / -90) mean?
What does the -1 assigned to out[11] do?
What does the value assigned to out[14] (i.e., 2000 / -90) mean?

Finally, I should point out that I have already read Gregg Tavares's explanation of the perspective matrix, but after all that, I'm still left with the same confusion.

+8

math matrix opengl-es webgl perspectivecamera

HartleySan Feb 02 '15 at 20:21


2 answers

f is a factor that scales the y axis so that, after the perspective divide, all points on the top plane of your view frustum have a y-coordinate of 1, and those on the bottom plane have a y-coordinate of -1. Try plugging in points along one of those planes (for fovy = π / 4, examples are (0, 0.414, -1) and (2, 1.243, -3)) and you should see why this happens: the pre-divide y ends up equal to the homogeneous w.
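
This can be checked numerically. A minimal sketch, assuming fovy = π / 4 and computing only the y and w components that the matrix produces (the helper name `ndcY` is made up for illustration):

```javascript
const fovy = Math.PI / 4;
const f = 1 / Math.tan(fovy / 2);

// Post-perspective-divide y for a camera-space point with the given y and z.
// The matrix yields clip y = f * y and clip w = -z.
function ndcY(y, z) {
  const clipY = f * y;
  const clipW = -z;
  return clipY / clipW;
}

// Points on the top plane of the frustum satisfy y = tan(fovy / 2) * -z.
for (const z of [-1, -3, -50]) {
  const yTop = Math.tan(fovy / 2) * -z;
  console.log(z, ndcY(yTop, z).toFixed(6), ndcY(-yTop, z).toFixed(6));
}
```

Each line should show a y of about 1 for the top plane and about -1 for the bottom plane, regardless of z.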

0

Sneftel Feb 02 '15 at 20:30


gman · Accepted Answer · 2015-02-03T14:23:32+0000

I'll see if I can explain this, or maybe after reading this you can come up with a better way to explain it.

The first thing to realize is that WebGL requires clip space coordinates. They go from -1 to +1 in x, y, and z. So, a perspective matrix is basically designed to take the space inside the frustum and convert it to clip space.

If you look at this diagram

we know that tangent = opposite (y) over adjacent (z), so if we know z, we can compute the y that would sit on the edge of the frustum for a given fovY.

 tan(fovY / 2) = y / -z

multiply both sides by -z

 y = tan(fovY / 2) * -z

if we define

 f = 1 / tan(fovY / 2)

we get

 y = -z / f

Note that we haven't converted from camera space to clip space yet. All we've done is compute y at the edge of the field of view for a given z in camera space. The edge of the field of view is also the edge of clip space. Since clip space is just +1 to -1, we can simply divide a camera-space y by -z / f to get clip space.

Does that make sense? Look at the diagram again. Let's assume the blue z is -5 and, for some given field of view, y comes out to +2.34. We need to convert +2.34 to +1 in clip space. The generic version of that is

clipY = cameraY * f / -z
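
As a quick numeric check of that formula, here is a sketch using the numbers from the example. The field of view is back-computed so that y = 2.34 lands exactly on the edge at z = -5; that choice, and the function name `clipYFromCamera`, are assumptions for illustration:

```javascript
// Field of view chosen so that y = 2.34 sits on the frustum edge at z = -5.
const cameraZ = -5;
const edgeY = 2.34;
const fovY = 2 * Math.atan(edgeY / -cameraZ);
const f = 1 / Math.tan(fovY / 2);

// clipY = cameraY * f / -z, as in the formula above.
function clipYFromCamera(cameraY, z) {
  return cameraY * f / -z;
}

console.log(clipYFromCamera(edgeY, cameraZ));     // about 1 (top edge)
console.log(clipYFromCamera(-edgeY, cameraZ));    // about -1 (bottom edge)
console.log(clipYFromCamera(edgeY / 2, cameraZ)); // about 0.5 (halfway up)
```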

Looking at makePerspective

 function makePerspective(fieldOfViewInRadians, aspect, near, far) {
     var f = Math.tan(Math.PI * 0.5 - 0.5 * fieldOfViewInRadians);
     var rangeInv = 1.0 / (near - far);
     return [
         f / aspect, 0, 0, 0,
         0, f, 0, 0,
         0, 0, (near + far) * rangeInv, -1,
         0, 0, near * far * rangeInv * 2, 0
     ];
 };

we can see that f in this case is

 tan(Math.PI * 0.5 - 0.5 * fovY)

which is actually the same as

 1 / tan(fovY / 2)

Why is it written this way? I'm guessing it's because with the first style, if tan came out to 0 you would divide by 0 and your program would crash, whereas written this way there is no division, so there is no chance of dividing by zero.
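
The identity behind this is tan(π / 2 - x) = 1 / tan(x). A small sketch to confirm that the two forms agree for typical fields of view (the function names `fReciprocal` and `fComplement` are made up for this example). As an aside, in JavaScript specifically a division by zero yields Infinity rather than throwing, so there the choice is more about style than crash safety:

```javascript
// f written as a reciprocal, as in gl-matrix.
function fReciprocal(fovY) {
  return 1 / Math.tan(fovY / 2);
}

// f written with the complementary angle, as in makePerspective.
function fComplement(fovY) {
  return Math.tan(Math.PI * 0.5 - 0.5 * fovY);
}

for (const degrees of [30, 45, 60, 90, 120]) {
  const fovY = degrees * Math.PI / 180;
  // The two columns should match up to floating-point rounding.
  console.log(degrees, fReciprocal(fovY), fComplement(fovY));
}
```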

Seeing that there is a -1 in matrix[11] means that when we're all done

 matrix[5] = tan(Math.PI * 0.5 - 0.5 * fovY)
 matrix[11] = -1
 clipY = cameraY * matrix[5] / (cameraZ * matrix[11])

For clipX we basically do the same calculation, except scaled for the aspect ratio.

 matrix[0] = tan(Math.PI * 0.5 - 0.5 * fovY) / aspect
 matrix[11] = -1
 clipX = cameraX * matrix[0] / (cameraZ * matrix[11])

Finally, we need to convert cameraZ in the range -zNear ↔ -zFar to clipZ in the range -1 ↔ +1.

The standard perspective matrix does this with a reciprocal function, so that z values close to the camera get more resolution than z values far from the camera. That formula is

 clipZ = something / cameraZ + constant

Use s for something and c for constant.

 clipZ = s / cameraZ + c;

and solve for s and c. In our case, we know

 s / -zNear + c = -1
 s / -zFar + c = 1

So, move c to the other side

 s / -zNear = -1 - c
 s / -zFar = 1 - c

Multiply by -zXXX

 s = (-1 - c) * -zNear
 s = ( 1 - c) * -zFar

These two things are now equal to each other, therefore

 (-1 - c) * -zNear = (1 - c) * -zFar

we expand the quantities

 (-zNear * -1) - (c * -zNear) = (1 * -zFar) - (c * -zFar)

simplify

 zNear + c * zNear = -zFar + c * zFar

move zNear to the right

 c * zNear = -zFar + c * zFar - zNear

move c * zFar left

 c * zNear - c * zFar = -zFar - zNear

simplify

 c * (zNear - zFar) = -(zFar + zNear)

divide by (zNear - zFar)

 c = -(zFar + zNear) / (zNear - zFar)

solve for s

 s = (1 - -((zFar + zNear) / (zNear - zFar))) * -zFar

simplify

 s = (1 + ((zFar + zNear) / (zNear - zFar))) * -zFar

rewrite the 1 as (zNear - zFar) / (zNear - zFar)

 s = ((zNear - zFar + zFar + zNear) / (zNear - zFar)) * -zFar

simplify

 s = ((2 * zNear) / (zNear - zFar)) * -zFar

simplify some more

 s = (-2 * zNear * zFar) / (zNear - zFar)

dang, I wish stackexchange supported math like their math site does :(

So, back to the beginning. Our formula was

 s / cameraZ + c

And now we know s and c .

 clipZ = (2 * zNear * zFar) / (zNear - zFar) / -cameraZ - (zFar + zNear) / (zNear - zFar)
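
A quick sketch to sanity-check this formula, using zNear = 10 and zFar = 100 from the question (the function name `clipZ` is just for illustration). It should map -zNear to -1 and -zFar to +1, and it also shows how unevenly the range is distributed:

```javascript
// Direct transcription of the clipZ formula above.
function clipZ(cameraZ, zNear, zFar) {
  return (
    (2 * zNear * zFar) / (zNear - zFar) / -cameraZ -
    (zFar + zNear) / (zNear - zFar)
  );
}

const zNear = 10;
const zFar = 100;
console.log(clipZ(-zNear, zNear, zFar)); // about -1
console.log(clipZ(-zFar, zNear, zFar));  // about +1
console.log(clipZ(-55, zNear, zFar));    // about 0.818
```

The last value is for a point halfway between near and far in camera space, yet it already sits above 0.8 in clip space, which is the extra resolution close to the camera mentioned earlier.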

pull -cameraZ outside

 clipZ = ((2 * zNear * zFar) / (zNear - zFar) + (zFar + zNear) / (zNear - zFar) * cameraZ) / -cameraZ

we can change / (zNear - zFar) to * 1 / (zNear - zFar), so

 rangeInv = 1 / (zNear - zFar)
 clipZ = ((2 * zNear * zFar) * rangeInv + (zFar + zNear) * rangeInv * cameraZ) / -cameraZ

Looking back at makePerspective, we see that it ends up doing

 clipZ = (matrix[10] * cameraZ + matrix[14]) / (cameraZ * matrix[11])

Looking at the formula above, that fits:

 rangeInv = 1 / (zNear - zFar)
 matrix[10] = (zFar + zNear) * rangeInv
 matrix[14] = 2 * zNear * zFar * rangeInv
 matrix[11] = -1
 clipZ = (matrix[10] * cameraZ + matrix[14]) / (cameraZ * matrix[11])
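
To tie it together, here is a sketch that runs a camera-space point through the full matrix and the perspective divide, then compares z against the closed-form expression. It repeats the makePerspective function shown earlier so the snippet runs on its own, and uses a hypothetical helper `transformPoint` that treats the array as column-major:

```javascript
function makePerspective(fieldOfViewInRadians, aspect, near, far) {
  const f = Math.tan(Math.PI * 0.5 - 0.5 * fieldOfViewInRadians);
  const rangeInv = 1.0 / (near - far);
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (near + far) * rangeInv, -1,
    0, 0, near * far * rangeInv * 2, 0,
  ];
}

// Hypothetical helper: multiply the column-major matrix m by (x, y, z, 1),
// then divide by w (the perspective divide).
function transformPoint(m, [x, y, z]) {
  const cx = m[0] * x + m[4] * y + m[8] * z + m[12];
  const cy = m[1] * x + m[5] * y + m[9] * z + m[13];
  const cz = m[2] * x + m[6] * y + m[10] * z + m[14];
  const cw = m[3] * x + m[7] * y + m[11] * z + m[15];
  return [cx / cw, cy / cw, cz / cw];
}

const matrix = makePerspective(Math.PI / 4, 4 / 3, 10, 100);
const cameraZ = -55;
const [, , ndcZ] = transformPoint(matrix, [1, 2, cameraZ]);
const formulaZ = (matrix[10] * cameraZ + matrix[14]) / (cameraZ * matrix[11]);
console.log(ndcZ, formulaZ); // the two values should agree
```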

Hope that made sense. Note: most of this is just my rewrite of this article.
