All modern GPUs have a scalar architecture, but shading languages offer many vector and matrix types. I would like to know how scalarizing or vectorizing GLSL source code affects performance. For example, consider some points defined as “scalars”:
float p0x, p0y, p1x, p1y, p2x, p2y, p3x, p3y, p4x, p4y;
p0x = 0.0f; p0y = 0.0f;
p1x = 0.0f; p1y = 0.61f;
p2x = 0.9f; p2y = 0.4f;
p3x = 1.0f; p3y = 1.0f;
and their vector equivalents:
vec2 p0 = vec2(p0x, p0y);
vec2 p1 = vec2(p1x, p1y);
vec2 p2 = vec2(p2x, p2y);
vec2 p3 = vec2(p3x, p3y);
Given these points, which of the following mathematically equivalent code fragments will run faster?
Scalar code:
position.x = -p0x*pow(t-1.0,3.0)+p3x*(t*t*t)+p1x*t*pow(t-1.0,2.0)*3.0-p2x*(t*t)*(t-1.0)*3.0;
position.y = -p0y*pow(t-1.0,3.0)+p3y*(t*t*t)+p1y*t*pow(t-1.0,2.0)*3.0-p2y*(t*t)*(t-1.0)*3.0;
or its vector equivalent:
position.xy = -p0*pow(t-1.0,3.0)+p3*(t*t*t)+p1*t*pow(t-1.0,2.0)*3.0-p2*(t*t)*(t-1.0)*3.0;
Or will both run equally fast on modern GPUs?
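For reference, here is a worked form of what both fragments compute. Expanding the signs, each one appears to be the standard cubic Bézier curve through p0..p3 in Bernstein form; the substitution u = 1 - t is notation introduced here for illustration and is not part of the code above:

\[
\begin{aligned}
u &= 1 - t, \\
\mathrm{position}(t) &= -p_0\,(t-1)^3 + 3\,p_1\,t\,(t-1)^2 - 3\,p_2\,t^2\,(t-1) + p_3\,t^3 \\
&= u^3\,p_0 + 3\,u^2 t\,p_1 + 3\,u\,t^2\,p_2 + t^3\,p_3 .
\end{aligned}
\]

The second line follows from (t-1)^3 = -u^3, (t-1)^2 = u^2, and (t-1) = -u, so the scalar version simply applies this formula to the x and y components separately. One caveat worth keeping in mind: GLSL leaves pow(x, y) undefined for x < 0, and t-1.0 is negative for t in [0, 1), so writing the powers as explicit products of u may be the safer way to express the same curve.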
The code above is just an example. Real-world “vectorized” code of this kind can perform far more complex calculations, with many more input variables coming from global in variables, uniforms, and vertex attributes.
Sergey