Architecture for rendering sprites in OpenGL

Question


I have a lot of sprites to render, and I wanted to get feedback from people who have pushed on performance in this area.

So, I sort by shader and texture, and keep batches of sprites with the same rendering settings in VBOs to send to the shaders for rendering. All the normal stuff. My sprites are all square and have the same basic data: center position (P), orientation (O), scale (S), RGB color (Col) and global opacity (Alpha). I need to update the position and orientation in CPU code (although about 50% of the sprites do not change between any given pair of frames), and the scale, color and opacity almost never change for a sprite, though not actually never.

I can't assume geometry shaders (I will support them, but in that case this question is moot).

Should I:

1. When I update sprite positions, calculate the vertex positions on the CPU, making the vertex shader a simple transform step. (Advantage: significantly less data to update each frame, but the CPU has to do a lot of trig.)
2. Put the POS data in the VBO as additional data, duplicated for the 4 vertices, then have the vertex position simply be an offset (-1,-1; -1,1; 1,1; 1,-1) and do the trig in the shader. (Advantage: the GPU does more of the computation, but each vertex carries 5 additional data words.)
3. It is not obvious which is better, so both approaches need profiling to see what happens.
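For reference, the per-corner math that option 1 runs on the CPU and option 2 moves into the vertex shader can be sketched as below. This is only an illustration: the names `Vec2`, `kCornerOffsets`, `cornerPosition` and `expandSprite` are hypothetical, and it assumes a 2D center, an orientation in radians and a uniform scale.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

struct Vec2 { float x, y; };

// The four corner offsets listed in option 2 (hypothetical name).
constexpr std::array<Vec2, 4> kCornerOffsets = {{
    {-1.0f, -1.0f}, {-1.0f, 1.0f}, {1.0f, 1.0f}, {1.0f, -1.0f}}};

// Rotate one corner offset by `orientation` (radians, assumed), scale it,
// and translate it to the sprite center. This is the trig both options need.
inline Vec2 cornerPosition(Vec2 center, float orientation, float scale, int corner) {
    assert(corner >= 0 && corner < 4);
    const Vec2 offset = kCornerOffsets[static_cast<std::size_t>(corner)];
    const float c = std::cos(orientation);
    const float s = std::sin(orientation);
    return {center.x + scale * (c * offset.x - s * offset.y),
            center.y + scale * (s * offset.x + c * offset.y)};
}

// Option 1: expand a sprite into four finished vertex positions on the CPU.
inline std::array<Vec2, 4> expandSprite(Vec2 center, float orientation, float scale) {
    std::array<Vec2, 4> out{};
    for (int i = 0; i < 4; ++i) {
        out[static_cast<std::size_t>(i)] = cornerPosition(center, orientation, scale, i);
    }
    return out;
}
```

In option 2 the body of `cornerPosition` would live in the vertex shader instead, and the CPU would only write the center, orientation and scale into the buffer.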

Obviously, I can do 3, but I thought it would be useful to ask this question to see if I have enough gestalt about what should be faster. And in any case, the answer may help other serious sprite / particle developers later.

+4

3d opengl sprite

Ian Jul 9 '12 at 20:11


3 answers

So, I did (3) and profiled. And, as kroneml said, option 2 won convincingly.

The best performing structure was two VBOs:

vec2 float pos , float orientation , float scale (16 bytes / vertex)
vec2 float tex , vec4 ubyte color , uint flags (16 bytes / vertex)

Here the flags encode the sprite corner: 0x00000001 for right and 0x00000002 for bottom. This lets the code that updates a sprite's position walk through the first VBO and set the values four at a time without any trig or other logic. All the math happens in the vertex shader.

In my tests, combining the two VBOs into one performed better when the number of position updates was not much different from the number of texture / color updates. I assume this is because the vertices are then aligned to 32 bytes. But in my application (and, I assume, most people's), position is updated most frames while the other things almost never are, and the smaller buffer to push to the graphics card seemed to win.
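A minimal CPU-side sketch of that two-buffer layout follows. The struct and function names are hypothetical, and it assumes +y points up, so the "bottom" bit maps to a y offset of -1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// First VBO: data that changes most frames (16 bytes per vertex).
struct SpriteGeomVertex {
    float posX, posY;
    float orientation;
    float scale;
};

// Second VBO: data that rarely changes (16 bytes per vertex).
struct SpriteStyleVertex {
    float texU, texV;
    std::uint8_t color[4];  // RGBA as unsigned bytes
    std::uint32_t flags;    // corner bits, see below
};

static_assert(sizeof(SpriteGeomVertex) == 16, "expected 16 bytes per vertex");
static_assert(sizeof(SpriteStyleVertex) == 16, "expected 16 bytes per vertex");

constexpr std::uint32_t kFlagRight = 0x00000001u;
constexpr std::uint32_t kFlagBottom = 0x00000002u;

// What the vertex shader would derive from the flags: a corner offset
// with each component equal to -1 or +1.
inline void cornerFromFlags(std::uint32_t flags, float& dx, float& dy) {
    dx = (flags & kFlagRight) ? 1.0f : -1.0f;
    dy = (flags & kFlagBottom) ? -1.0f : 1.0f;
}

// Move one sprite: write the same center and orientation into its four
// consecutive vertices. No trig or per-corner logic on the CPU.
inline void setSpritePose(std::vector<SpriteGeomVertex>& verts, std::size_t sprite,
                          float x, float y, float orientation) {
    assert(sprite * 4 + 4 <= verts.size());
    for (std::size_t i = 0; i < 4; ++i) {
        SpriteGeomVertex& v = verts[sprite * 4 + i];
        v.posX = x;
        v.posY = y;
        v.orientation = orientation;
    }
}
```

After a pass of `setSpritePose` calls, only the first buffer needs to be re-uploaded; the style buffer stays untouched on the GPU.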

+3

Ian Jul 17 '12 at 11:55


I found a slight improvement in bandwidth. Assuming each of your sprites has 4 vertices (6 indices), you can just use gl_VertexID % 4 instead of flags.

Per-vertex attributes:

vec2 float position , float orientation , float scale - sprite geometry data (16B)
uint flags - optional flags for special sprites (4B)
float param - optional parameter for smooth sprite transformations (4B)

Uniforms:

vec2 vertexPosition[4] - the relative position of each corner; you can also use it to set where the sprite's center lies
vec2 textureCoord[4] - the texture coordinate for each corner; you can also use 4*n texcoords for n sprite states, selected via flags

This setup uses only 16B for each vertex for simple sprites.
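A sketch of why `gl_VertexID % 4` can stand in for the flags, assuming each sprite's four vertices are stored consecutively and drawn through an index buffer (with indexed drawing, `gl_VertexID` is the index fetched for the vertex). The helper names here are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <vector>

// Build 6 indices per sprite: two triangles (0,1,2) and (0,2,3) over the
// sprite's four consecutive vertices.
inline std::vector<std::uint32_t> buildQuadIndices(std::size_t spriteCount) {
    assert(spriteCount <= 0xFFFFFFFFu / 4u);  // vertex ids must fit in 32 bits
    std::vector<std::uint32_t> indices;
    indices.reserve(spriteCount * 6);
    for (std::size_t s = 0; s < spriteCount; ++s) {
        const std::uint32_t base = static_cast<std::uint32_t>(s * 4);
        for (std::uint32_t k : {0u, 1u, 2u, 0u, 2u, 3u}) {
            indices.push_back(base + k);
        }
    }
    return indices;
}

// CPU stand-in for `gl_VertexID % 4` in the vertex shader: the result
// selects an entry of the vertexPosition[4] / textureCoord[4] uniforms.
inline std::uint32_t cornerOfVertex(std::uint32_t vertexId) {
    return vertexId % 4u;
}
```

Because every index of sprite `s` lies in the range 4*s to 4*s+3, the remainder always identifies the corner, so no per-vertex corner data is needed.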

+1

kravemir Feb 04 '13 at 19:35


kroneml · Accepted Answer · 2012-07-18T11:49:08+0000

From my experience with lots of particles, I would use option (2). Maybe you can pack the offset / direction index into your data (for example, as the w-component of your position vector, if you are not using it yet)? 0 = (-1,-1); 1 = (-1,1); 2 = (1,1); 3 = (1,-1).

(As Ian suggested, I just copied my comment into an answer!)

@Ian: If I understand you correctly, you said that you have a global opacity / alpha, so you could use a uniform for that and use the w-component of your vec4 color for the flag. However, I doubt it will matter ...

By the way, the geometry shader solution mentioned above should be not only more elegant, but also a little faster.
