I have a shader in which I want to move half of the vertices in the vertex shader. I am trying to decide the best way to do this performance-wise, because we are dealing with more than 100,000 vertices, so speed is critical. I looked at 3 different methods. (This is pseudocode, but enough to give you the idea. I can't give out the <complex formula>, but I can say that it involves sin() and a function call (it just returns a number, but it is still a function call), plus a bunch of basic floating-point arithmetic.)
if (y < 0.5) { x += <complex formula>; }
This has the advantage that the <complex formula> is evaluated for only half of the vertices, but the disadvantage that it definitely introduces a branch, which may end up being slower than the formula itself. This is the most readable option, but in this context we need speed more than readability.
x += step(y, 0.5) * <complex formula>;
Using the HLSL function step() (which returns 1 if the second parameter is greater than or equal to the first, and 0 otherwise) removes the branch, but now the <complex formula> is evaluated every time, and half the time its result is multiplied by 0 (wasted effort).
x += (y < 0.5) ? <complex formula> : 0;
I'm not sure about this one. Does ?: cause a branch? And if not, are both sides of the expression evaluated, or only the relevant one?
The final possibility is to move the <complex formula> to the CPU instead of the GPU, but I'm worried that computing sin() and the other operations there will be slower, which could result in a net loss. It also means another number has to be passed to the shader, which adds overhead of its own. Does anyone have an idea which approach would be best?
Addendum:
According to http://msdn.microsoft.com/en-us/library/windows/desktop/bb509665%28v=vs.85%29.aspx the step() function uses ?: internally, so it is probably no better than my third solution, and potentially worse, since the <complex formula> is definitely evaluated every time, whereas with a direct ?: it might only be evaluated half the time. (No one has answered this part of the question yet.) To avoid both of them, though:
x += (1.0 - y) * <complex formula>;
may be better than any of them, since no comparison is made anywhere (y is always either 0 or 1). It evaluates the <complex formula> needlessly half the time, but that may be worth it to avoid branching altogether.
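For concreteness, here is a rough sketch of the four shader-side variants written as HLSL helper functions. ComplexFormula is a hypothetical stand-in (the real formula isn't shown here), and the Offset* function names are made up purely for illustration. Which of these actually compiles to a branch depends on the compiler and hardware, so this only pins down what is being compared and is not a benchmark.

```hlsl
// Hypothetical placeholder for the real <complex formula>, which is not shown.
// Assumption: it takes the current x and returns a float offset.
float ComplexFormula(float x)
{
    return sin(x) * 0.25 + 0.1;
}

// Method 1: explicit if. The formula is only needed when y < 0.5.
float OffsetIf(float x, float y)
{
    if (y < 0.5) { x += ComplexFormula(x); }
    return x;
}

// Method 2: step() as a 0/1 mask. step(y, 0.5) is 1 when 0.5 >= y, else 0,
// so it differs from (y < 0.5) only when y is exactly 0.5.
float OffsetStep(float x, float y)
{
    x += step(y, 0.5) * ComplexFormula(x);
    return x;
}

// Method 3: ternary operator.
float OffsetTernary(float x, float y)
{
    x += (y < 0.5) ? ComplexFormula(x) : 0.0;
    return x;
}

// Addendum variant: pure arithmetic mask. Only valid because y is
// assumed to be exactly 0 or 1 for every vertex.
float OffsetMask(float x, float y)
{
    x += (1.0 - y) * ComplexFormula(x);
    return x;
}
```

With y restricted to 0 or 1, all four functions return the same value for a given x; the only open question is how each one is compiled and scheduled.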