Efficiently create a textureless particle system

I am trying to create a particle system where, instead of a texture, each particle is a square rendered with a fragment shader such as the one below.

uniform vec3 color;
uniform float radius;
uniform float edge;
uniform vec2 position;
uniform float alpha;

void main() {
    float dist = distance(gl_FragCoord.xy, position);
    float intensity = smoothstep(dist - edge, dist + edge, radius);
    gl_FragColor = vec4(color, intensity * alpha);
}

Each particle is an instance of a C++ class that ties this shader and all of its variables together and draws it. I use openFrameworks, so the exact OpenGL calls are abstracted away from me.

I have read that particle systems are usually done with textures, but I would prefer to do it this way because it lets me add more functionality to each particle. The problem is that the frame rate drops sharply after only about 30 particles. Is there a more efficient way to do this? I thought about putting the variables for all the particles into arrays and sending those arrays to a single fragment shader that draws every particle at once, but that would fix the maximum number of particles, because the arrays would have to be declared in the shader with a fixed size in advance.

Are textureless particle systems simply too inefficient to be practical, or is there a way to design this that I am missing?

1 answer

The reason textures are usually used is that they let you move the particles on the GPU, which is very fast. You double-buffer a texture that stores the particle attributes (e.g., one position per texel) and ping-pong the data between the two copies: bind one to a framebuffer object as the render target, and run a fragment shader over a full-screen quad to perform the simulation step. Then you draw an array of quads and read the texture to get the positions.
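As a concrete illustration of the simulation pass, here is a minimal GLSL sketch. The texture layout (position in `.xy`, velocity in `.zw`) and all uniform names are assumptions for illustration, not part of the answer above:

```glsl
// Hedged sketch of a ping-pong update pass: one texel = one particle.
// Assumed layout: position in .xy, velocity in .zw.
uniform sampler2D stateTex; // the "read" texture of the ping-pong pair
uniform vec2 texSize;       // dimensions of stateTex in texels
uniform float dt;           // time step

void main() {
    vec4 state = texture2D(stateTex, gl_FragCoord.xy / texSize);
    vec2 vel = state.zw + vec2(0.0, -9.8) * dt; // e.g., apply gravity
    vec2 pos = state.xy + vel * dt;             // integrate position
    gl_FragColor = vec4(pos, vel);              // lands in the "write" texture via the FBO
}
```

Each frame the two textures swap roles, so the result of this pass becomes the input of the next.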

Instead of storing attributes in a texture, you can pack them directly into VBO data. This gets complicated because you have several vertices per particle, but there are still several ways to do it: instancing with glVertexAttribDivisor, drawing points, or using a geometry shader. Transform feedback or image_load_store can be used to update the VBOs on the GPU instead of textures.

If you move the particles on the CPU, you also need to copy the data to the GPU every frame. That is fairly slow, but nothing that would make 30 particles a problem. The slowdown is far more likely due to the number of draw calls: every time you draw something, a ton of GL state has to be set up for the operation, and setting uniform values per primitive is (almost) as expensive for the same reason. Particle systems perform well when you have arrays of data that a manager processes all at once; in that case they parallelize very well. Their per-particle computations are generally cheap, so it all comes down to minimizing memory traffic and maintaining good locality.
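To make the batching idea concrete, here is a hedged CPU-side sketch: instead of setting uniforms per particle, pack every particle's attributes into one interleaved float array, upload it once per frame (e.g., with a single glBufferData), and issue a single draw call. The `Particle` struct and field names are illustrative assumptions:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU-side particle state; field names are illustrative.
struct Particle {
    float x, y;       // position
    float r, g, b;    // color
    float radius;     // soft-circle radius used by the fragment shader
};

// Pack all particles into one interleaved float array.
// Uploading this buffer once per frame and issuing one draw call
// replaces N draw calls, each with its own uniform updates.
std::vector<float> packParticles(const std::vector<Particle>& ps) {
    std::vector<float> data;
    data.reserve(ps.size() * 6); // 6 floats per particle
    for (const Particle& p : ps) {
        data.insert(data.end(), {p.x, p.y, p.r, p.g, p.b, p.radius});
    }
    return data;
}
```

The vertex shader then reads these values as per-vertex (or per-instance) attributes instead of uniforms.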

If you want to keep updating the particles on the CPU, I would go with this:

  • Create a VBO filled with -1 to 1 quads (two triangles, 6 vertices each) and an element array buffer to draw them. This buffer stays static in GPU memory, and it is what lets you draw all the particles at once with a single draw call.
  • Create a texture (perhaps a 1D one) or a VBO (if you go with one of the methods above) that contains the positions and other particle attributes that change nearly every frame (updated with glTexImage1D / glBufferData / glMapBuffer).
  • Create another texture with the particle attributes that change rarely (for example, only when a particle spawns). You can send those updates with glTexSubImage1D / glBufferSubData / glMapBufferRange.
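The first step above can be sketched in C++. This is a hedged, GL-free illustration of the data you would upload once: indexed quads (4 corner vertices and 6 indices per particle) rather than 6 raw vertices, since an element array buffer is used; the function name is hypothetical:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Build the static geometry for `count` particles: one -1..1 quad each,
// stored as 4 corner vertices plus 6 indices (two triangles).
// This data is uploaded to a VBO / element array buffer once and reused.
void buildQuads(std::size_t count,
                std::vector<float>& corners,          // x, y per vertex
                std::vector<std::uint32_t>& indices)  // element array
{
    const float quad[8] = {-1.0f, -1.0f,  1.0f, -1.0f,
                            1.0f,  1.0f, -1.0f,  1.0f};
    for (std::size_t i = 0; i < count; ++i) {
        corners.insert(corners.end(), quad, quad + 8);
        std::uint32_t base = static_cast<std::uint32_t>(i * 4);
        std::uint32_t tri[6] = {base, base + 1, base + 2,
                                base, base + 2, base + 3};
        indices.insert(indices.end(), tri, tri + 6);
    }
}
```

With instancing you would only need a single quad and a per-instance divisor instead of repeating the corners per particle.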

When you draw the particles, read the position and other attributes from the texture (or from vertex attributes, if you used the VBO approach), and use the -1 to 1 quad corners in the main geometry VBO as offsets from that position.
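A hedged GLSL sketch of that vertex shader, for the VBO/attribute variant; all attribute names are hypothetical, and the built-in matrix assumes a compatibility-profile context:

```glsl
// Hypothetical vertex shader: per-corner quad offset plus per-particle
// attributes (with instancing, the particle attributes use divisor = 1).
attribute vec2 corner;          // -1..1 quad corner from the static VBO
attribute vec2 particlePos;     // per-particle position
attribute float particleRadius; // per-particle size

void main() {
    vec2 world = particlePos + corner * particleRadius;
    gl_Position = gl_ModelViewProjectionMatrix * vec4(world, 0.0, 1.0);
}
```

In the texture variant, `particlePos` would instead come from a texture fetch in the vertex shader, indexed by a per-particle ID.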

