We'll see if I can explain this, or maybe after reading it you can come up with a better way to explain it.
The first thing to understand is that WebGL requires clip space coordinates. They go from -1 to +1 in x, y, and z. So a perspective matrix is basically designed to take the space inside the frustum and convert it to clip space.
If you look at this diagram

we know that tangent = opposite (y) over adjacent (z), so if we know z we can compute the y that sits on the edge of the frustum for a given fovY.
tan(fovY / 2) = y / -z
multiply both sides by -z
y = tan(fovY / 2) * -z
if we define
f = 1 / tan(fovY / 2)
we get
y = -z / f
Note that we haven't converted from camera space to clip space yet. All we've done is compute y at the edge of the field of view for a given z in camera space. The edge of the field of view is also the edge of clip space. Since clip space just goes from -1 to +1, we can divide a camera space y by -z / f to get clip space.
Does that make sense? Look at the diagram again. Let's assume the blue z was -5 and, for some given field of view, y came out to +2.34. We need to convert +2.34 to +1 in clip space. The generic version of that is
clipY = cameraY * f / -z
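To make that concrete, here is a small sketch of the same arithmetic in JavaScript. The 50 degree field of view and the names `edgeY` and `toClipY` are only illustrative assumptions; they don't come from any library.

```javascript
// Hypothetical values, chosen only for illustration.
const fovY = 50 * Math.PI / 180;   // vertical field of view in radians
const f = 1 / Math.tan(fovY / 2);
const cameraZ = -5;                // a point 5 units in front of the camera

// y on the top edge of the frustum at this depth: y = -z / f
const edgeY = -cameraZ / f;

// Camera space y to clip space y: clipY = cameraY * f / -z
function toClipY(cameraY, z) {
  return cameraY * f / -z;
}

console.log(edgeY);                     // about 2.33 for a 50 degree fovY
console.log(toClipY(edgeY, cameraZ));   // +1 (up to rounding): top edge
console.log(toClipY(-edgeY, cameraZ));  // -1 (up to rounding): bottom edge
console.log(toClipY(0, cameraZ));       // 0: center of the view
```

Whatever depth you pick, the y on the edge of the frustum always lands on +1 or -1, which is the whole point of dividing by -z.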
Looking at `makePerspective`
function makePerspective(fieldOfViewInRadians, aspect, near, far) {
  var f = Math.tan(Math.PI * 0.5 - 0.5 * fieldOfViewInRadians);
  var rangeInv = 1.0 / (near - far);
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (near + far) * rangeInv, -1,
    0, 0, near * far * rangeInv * 2, 0
  ];
};
we can see that f in this case is
tan(Math.PI * 0.5 - 0.5 * fovY)
which is actually the same as
1 / tan(fovY / 2)
Why is it written this way? I'm guessing it's because with the first style, if tan came out to 0 you'd divide by 0. In JavaScript that gives you Infinity rather than an exception, but in other languages your program could crash. Written this way there's no division at all, so there's no chance of dividing by zero.
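If you want to convince yourself that the two forms agree, here is a quick check. It relies on the identity tan(π/2 - x) = 1 / tan(x); the function names are made up for this sketch.

```javascript
// Two ways of computing f for the same field of view (in radians).
function fByDivision(fovY) {
  return 1 / Math.tan(fovY / 2);
}

function fByShiftedTan(fovY) {
  return Math.tan(Math.PI * 0.5 - 0.5 * fovY);
}

// Print both values side by side for a few fields of view given in degrees.
for (const degrees of [30, 60, 90, 120]) {
  const fovY = degrees * Math.PI / 180;
  console.log(degrees, fByDivision(fovY), fByShiftedTan(fovY));
}
```

The two columns should agree to within floating point rounding.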
Seeing that -1 is in matrix[11] means that when we are all done
matrix[5] = tan(Math.PI * 0.5 - 0.5 * fovY)
matrix[11] = -1
clipY = cameraY * matrix[5] / (cameraZ * matrix[11])
For clipX we do basically the same calculation, except scaled by the aspect ratio.
matrix[0] = tan(Math.PI * 0.5 - 0.5 * fovY) / aspect
matrix[11] = -1
clipX = cameraX * matrix[0] / (cameraZ * matrix[11])
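Here is a rough sketch of those two formulas applied to the top-right corner of the frustum. The 60 degree field of view, the aspect of 2, and the names `m0`, `m5`, `m11`, and `clipXY` are assumptions made just for this example.

```javascript
// Hypothetical camera settings for illustration.
const fovY = 60 * Math.PI / 180;
const aspect = 2;                  // width / height

// The matrix entries that matter for x and y.
const m0 = Math.tan(Math.PI * 0.5 - 0.5 * fovY) / aspect;
const m5 = Math.tan(Math.PI * 0.5 - 0.5 * fovY);
const m11 = -1;

// clipX and clipY as in the formulas above; the divisor is cameraZ * matrix[11].
function clipXY(cameraX, cameraY, cameraZ) {
  const w = cameraZ * m11;
  return [cameraX * m0 / w, cameraY * m5 / w];
}

// Top-right corner of the frustum at depth z = -10.
const z = -10;
const topY = -z * Math.tan(fovY / 2);
const rightX = topY * aspect;

console.log(clipXY(rightX, topY, z));   // both components are 1, up to rounding
console.log(clipXY(0, 0, z));           // [0, 0]: center of the view
```

Dividing matrix[0] by the aspect is what lets a frustum that is twice as wide as it is tall still end at +1 in clip space x.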
Finally, we need to convert cameraZ in the range -zNear to -zFar into clipZ in the range -1 to +1.
The standard perspective matrix does this with a reciprocal function, so that z values close to the camera get more resolution than z values far from the camera. That formula is
clipZ = something / cameraZ + constant
Let's use s for something and c for constant.
clipZ = s / cameraZ + c;
and solve for s and c. In our case, we know
s / -zNear + c = -1
s / -zFar + c = 1
So, move `c` to the other side
s / -zNear = -1 - c
s / -zFar = 1 - c
Multiply by -zXXX
s = (-1 - c) * -zNear
s = ( 1 - c) * -zFar
These two things are now equal to each other, therefore
(-1 - c) * -zNear = (1 - c) * -zFar
we expand the quantities
(-zNear * -1) - (c * -zNear) = (1 * -zFar) - (c * -zFar)
simplify
zNear + c * zNear = -zFar + c * zFar
move zNear to the right
c * zNear = -zFar + c * zFar - zNear
move c * zFar left
c * zNear - c * zFar = -zFar - zNear
simplify
c * (zNear - zFar) = -(zFar + zNear)
divide by (zNear - zFar)
c = -(zFar + zNear) / (zNear - zFar)
solve for s
s = (1 - -((zFar + zNear) / (zNear - zFar))) * -zFar
simplify
s = (1 + ((zFar + zNear) / (zNear - zFar))) * -zFar
change the 1 to (zNear - zFar) / (zNear - zFar)
s = ((zNear - zFar + zFar + zNear) / (zNear - zFar)) * -zFar
simplify
s = ((2 * zNear) / (zNear - zFar)) * -zFar
simplify some more
s = -(2 * zNear * zFar) / (zNear - zFar)
dang, I wish Stack Exchange supported math formatting the way their math site does :(
Back to the beginning. Our formula was
s / cameraZ + c
And now we know s and c.
clipZ = (2 * zNear * zFar) / (zNear - zFar) / -cameraZ - (zFar + zNear) / (zNear - zFar)
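As a sanity check on the algebra, here is a small sketch that plugs numbers into clipZ = s / cameraZ + c. It takes s and c so that the expression matches the clipZ formula above, which means the minus sign sits in s rather than on cameraZ. The function names and the near and far values are assumptions for illustration.

```javascript
// s and c for clipZ = s / cameraZ + c.
function depthCoefficients(zNear, zFar) {
  const c = -(zFar + zNear) / (zNear - zFar);
  const s = -(2 * zNear * zFar) / (zNear - zFar);
  return { s, c };
}

function clipZFromCameraZ(cameraZ, zNear, zFar) {
  const { s, c } = depthCoefficients(zNear, zFar);
  return s / cameraZ + c;
}

// Hypothetical near and far planes.
const zNear = 1;
const zFar = 100;

for (const cameraZ of [-1, -2, -10, -50, -100]) {
  console.log(cameraZ, clipZFromCameraZ(cameraZ, zNear, zFar));
}
// cameraZ = -1 gives -1 and cameraZ = -100 gives +1 (up to rounding).
// cameraZ = -2 already gives about 0.01, so half of the clip space z range
// is spent on the first unit of depth past the near plane.
```

That lopsided distribution is the "more resolution close to the camera" behavior of the reciprocal function.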
Let's move the -cameraZ outside
clipZ = ((2 * zNear * zFar) / (zNear - zFar) + (zFar + zNear) / (zNear - zFar) * cameraZ) / -cameraZ
we can change / (zNear - zFar) to * 1 / (zNear - zFar), so
rangeInv = 1 / (zNear - zFar)
clipZ = ((2 * zNear * zFar) * rangeInv + (zFar + zNear) * rangeInv * cameraZ) / -cameraZ
Looking back at `makePerspective`, we see that it ends up computing
clipZ = (matrix[10] * cameraZ + matrix[14]) / (cameraZ * matrix[11])
Looking at the formula above, that matches up as
rangeInv = 1 / (zNear - zFar)
matrix[10] = (zFar + zNear) * rangeInv
matrix[14] = 2 * zNear * zFar * rangeInv
matrix[11] = -1
clipZ = (matrix[10] * cameraZ + matrix[14]) / (cameraZ * matrix[11])
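To tie it all together, here is a sketch that runs a point through the whole matrix and then does the divide by w. WebGL does that divide itself after the vertex shader. The `transformPoint` helper is hypothetical and treats the array as column-major, which is how the indices above are used; the camera settings are made up.

```javascript
function makePerspective(fieldOfViewInRadians, aspect, near, far) {
  var f = Math.tan(Math.PI * 0.5 - 0.5 * fieldOfViewInRadians);
  var rangeInv = 1.0 / (near - far);
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (near + far) * rangeInv, -1,
    0, 0, near * far * rangeInv * 2, 0
  ];
}

// Multiply a column-major 4x4 matrix by the point (x, y, z, 1),
// then divide x, y, and z by the resulting w.
function transformPoint(m, [x, y, z]) {
  const w = m[3] * x + m[7] * y + m[11] * z + m[15];
  return [
    (m[0] * x + m[4] * y + m[8] * z + m[12]) / w,
    (m[1] * x + m[5] * y + m[9] * z + m[13]) / w,
    (m[2] * x + m[6] * y + m[10] * z + m[14]) / w
  ];
}

// Hypothetical camera settings.
const matrix = makePerspective(Math.PI / 3, 16 / 9, 0.5, 200);

console.log(transformPoint(matrix, [0, 0, -0.5])[2]);   // -1 (up to rounding): near plane
console.log(transformPoint(matrix, [0, 0, -200])[2]);   // +1 (up to rounding): far plane
```

Since matrix[11] is -1 and matrix[15] is 0, w comes out as -cameraZ, which is exactly the divisor used in the clipX, clipY, and clipZ formulas.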
Hope that made sense. Note: most of this is just my rewrite of this article.