Most efficient way to texture models - iOS, OpenGL ES2, optimization

I am trying to find the most efficient way to work with multiple textures in OpenGL ES2 on iOS. By "efficient" I mean the fastest rendering even on older iOS devices (iPhone 4 and up), but also balanced against convenience of workflow.

I have reviewed (and tried) several different methods, and have run into several problems and questions.

Method 1: My base and normal textures are RGB with NO ALPHA. I do not need transparency for these objects. My emission and specular information are each only a single channel. To cut down on texture2D() calls, I figured I could store the emission as the alpha channel of the base texture and the specular as the alpha of the normal texture. With each of those in its own file it would look like this:

[image: two 2-layer textures]
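Roughly what I had in mind for the fragment shader with this packing (the sampler and varying names here are placeholders, not my real shader) - all four maps come from just two texture2D() calls:

    precision mediump float;

    uniform sampler2D u_baseMap;    // rgb = base color, a = emission
    uniform sampler2D u_normalMap;  // rgb = tangent-space normal, a = specular
    varying vec2 v_texCoord;

    void main()
    {
        vec4 baseSample   = texture2D(u_baseMap,   v_texCoord);
        vec4 normalSample = texture2D(u_normalMap, v_texCoord);

        vec3 baseColor  = baseSample.rgb;
        float emission  = baseSample.a;
        vec3 normal     = normalize(normalSample.rgb * 2.0 - 1.0);
        float specular  = normalSample.a;

        // ...lighting math using baseColor, normal, emission and specular...
        gl_FragColor = vec4(baseColor, 1.0);
    }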

My problem so far has been finding a file format that will store a fully independent, non-premultiplied alpha channel. PNG just doesn't work for me: every way I have tried to save this as a PNG premultiplies the .alpha into the .rgb on save (via Photoshop), essentially destroying the .rgb. Any pixel with 0.0 alpha comes back with black rgb when I reload the file. I posted that question here but got no traction.

I know this method would give fast rendering if I could work out a way to save and load this independent 4th channel. But so far I could not, and had to move on.

Method 2: When that didn't work, I switched to a single 4-in-1 texture where each quadrant holds a different map. This does not reduce the number of texture2D() calls, but it does reduce the number of textures accessed within the shader.

[image: one 4-quadrant texture]

For the 4-quadrant texture, the texture coordinates have to be remapped in the shader. To keep the models flexible, I leave the texcoords as they are in the model structure and remap them in the vertex shader like this:

    v_fragmentTexCoord0 = a_vertexTexCoord0 * 0.5;
    v_fragmentTexCoord1 = v_fragmentTexCoord0 + vec2(0.0, 0.5);   // emission frag is up half
    v_fragmentTexCoord2 = v_fragmentTexCoord0 + vec2(0.5, 0.5);   // shininess frag is up and over
    v_fragmentTexCoord3 = v_fragmentTexCoord0 + vec2(0.5, 0.0);   // normal frag is over half

To avoid dependent texture reads (thanks to Brad Larson), I moved these offsets into the vertex shader and keep them out of the fragment shader.

But my question here is: does reducing the number of texture samplers used in the shader actually matter? Or would I be better off using 4 separate smaller textures here?
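For concreteness, the fragment-shader side of the quadrant approach ends up looking roughly like the sketch below (u_combinedMap and the trailing lighting code are placeholders, not my actual shader). Note that it uses only one sampler uniform, but still makes four texture2D() fetches:

    precision mediump float;

    uniform sampler2D u_combinedMap;   // all four maps packed into one texture
    varying vec2 v_fragmentTexCoord0;  // base quadrant
    varying vec2 v_fragmentTexCoord1;  // emission quadrant
    varying vec2 v_fragmentTexCoord2;  // shininess quadrant
    varying vec2 v_fragmentTexCoord3;  // normal quadrant

    void main()
    {
        vec3 base       = texture2D(u_combinedMap, v_fragmentTexCoord0).rgb;
        float emission  = texture2D(u_combinedMap, v_fragmentTexCoord1).r;
        float shininess = texture2D(u_combinedMap, v_fragmentTexCoord2).r;
        vec3 normal     = normalize(texture2D(u_combinedMap, v_fragmentTexCoord3).rgb * 2.0 - 1.0);

        // ...lighting math goes here...
        gl_FragColor = vec4(base, 1.0);
    }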

The only problem I encountered was bleeding between the different maps. A texcoord of 1.0 gets averaged with some blue normal-map pixels because of linear texture filtering, which adds a blue edge to the object near the seam. To avoid it I had to change the UV mapping so it never gets too close to a quadrant edge, and that is a pain to do across so many objects.
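One thing I may try instead of editing every UV map is a half-texel inset done in the vertex shader, so bilinear filtering never reaches across a seam. Just a sketch of a drop-in replacement for the vertex-shader texcoord block above - u_texelSize (1.0 / the texture's size in pixels) is an assumed uniform, and mipmapping would need a larger inset:

    uniform vec2 u_texelSize;   // assumed uniform: 1.0 / texture size in pixels

    // inside the vertex shader's main():
    vec2 halfTexel = 0.5 * u_texelSize;
    vec2 quadUV = clamp(a_vertexTexCoord0 * 0.5, halfTexel, vec2(0.5) - halfTexel);
    v_fragmentTexCoord0 = quadUV;                    // base
    v_fragmentTexCoord1 = quadUV + vec2(0.0, 0.5);   // emission
    v_fragmentTexCoord2 = quadUV + vec2(0.5, 0.5);   // shininess
    v_fragmentTexCoord3 = quadUV + vec2(0.5, 0.0);   // normal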

Method 3: Combine methods 1 and 2, so base.rgb + emission.a go in one texture and normal.rgb + specular.a in the other. But again, I still have the problem of getting an independent alpha channel saved to a file.

Maybe I could save them as two files but combine them at load time before sending the data to OpenGL. I will have to try that.

Method 4: Finally, in a 3D world, if I have 20 different wall panel textures, should they be separate files or should they all be packed into one texture atlas? I recently noticed that at some point Minecraft switched from an atlas to individual textures - even though they are only 16x16 each.

With a single model and remapped texture coordinates (which I already do in methods 2 and 3 above), you can easily send an offset to the shader to select a particular map within the atlas:

    v_fragmentTexCoord0 = u_texOffset + a_vertexTexCoord0 * u_texScale;
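If the atlas is a regular grid, u_texOffset and u_texScale fall straight out of a cell index. A sketch of the whole vertex-shader side, with assumed uniform/attribute names (u_atlasCell, u_atlasDim, u_modelViewProjection):

    attribute vec4 a_vertexPosition;
    attribute vec2 a_vertexTexCoord0;

    uniform mat4  u_modelViewProjection;
    uniform vec2  u_atlasCell;   // which cell to use, e.g. (3.0, 1.0) = column 3, row 1
    uniform float u_atlasDim;    // cells per side, e.g. 4.0 for a 4 x 4 atlas

    varying vec2 v_fragmentTexCoord0;

    void main()
    {
        float texScale = 1.0 / u_atlasDim;                                   // = u_texScale
        v_fragmentTexCoord0 = (u_atlasCell + a_vertexTexCoord0) * texScale;  // = u_texOffset + uv * u_texScale
        gl_Position = u_modelViewProjection * a_vertexPosition;
    }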

This gives more flexibility and reduces texture binds. It is basically how I do it now in my game. But is it faster to access a small portion of a larger texture and do that math in the vertex shader? Or is it faster to bind the smaller textures over and over, especially if you do not sort objects by texture?

I know this is a lot. But the main question here is: what is the most efficient method, taking both speed and convenience into account? Will method 4 be faster with multiple textures, or will many rebinds be faster? Or is there another approach I am missing? I see all these 3D games with tons of graphics and textures - how do they keep up their frame rates, especially on older devices like the iPhone 4?

**** UPDATE ****

Since I have had 2 answers in the last few days, I will say this: I basically found the answer, or AN answer. The question was which method is more efficient, meaning which method will give the best frame rates. I tried the various methods above, and on the iPhone 5 they are all equally fast - the iPhone 5/5S has an extremely fast GPU. Where it matters is on older devices like the iPhone 4/4S, or on larger devices like a retina iPad. My tests were not scientific, and I have no frame-rate numbers to report. But 4 texture2D() calls into 4 RGBA textures were actually just as fast as, or maybe even faster than, 4 texture2D() calls into one texture with offsets. And of course I do those offset calculations in the vertex shader, not in the fragment shader (never in the fragment shader).

So maybe someday I will do those tests and put together a grid of numbers to report, but I do not have time to do that right now and write a proper answer myself. And I cannot mark any other answer as THE answer when it does not answer the question, because that is not how SO works.

But thanks to the people who answered. And check out this other question of mine, which also resolved part of this: Load an RGBA image from two jpegs on iOS - OpenGL ES 2.0

2 answers

Have a post-processing step in your content pipeline, either when you package the game or as a post-build event, where you combine your rgb texture with the alpha texture and save the result as a .ktx file.

KTX is a fairly trivial format, and it would be easy to write a command-line tool that loads 2 PNGs and combines them into one KTX file as rgb + alpha.

Some of the advantages:

- Less CPU overhead when loading the file at game start, so the game starts faster.
- Some GPUs do not support 24-bit RGB formats, which forces the driver to convert the data internally to 32-bit RGBA. That adds more time to the loading phase and uses temporary memory.

Now that you have the data in a texture object, you want to minimize texture fetches, since each one costs GPU operations and memory accesses, depending on the filtering mode.

I would recommend having 2 textures with 2 layers each. The problem with packing everything into a single texture is potential artifacts when you sample with bilinear filtering, since it can pick up neighboring pixels near the edge where one map ends and the next one starts, and it gets worse if you decide to generate mipmaps.

As a further improvement, I would recommend not storing raw 32-bit rgba data in the .ktx but actually compressing it into a DXT or PVRTC format. This uses much less memory, which means faster load times and fewer memory transfers for the GPU, since memory bandwidth is limited. Of course, adding a compressor to the post-processing tool is a bit more complex. Note that compressed textures do lose some quality, depending on the algorithm and implementation.


Silly question, but are you sure you are texture-sampling bound? It just seems that with your "two 2-layer textures" you are potentially pulling in a lot of texture data, and you could be bandwidth limited instead.

What if you used 3 textures [BaseRGB, NormalRGB and combined Emission+Specular] with PVRTC compression? Depending on the detail, you might even get away with 2bpp (rather than 4bpp) for the BaseRGB and/or the Emission+Specular.

For the normals I would probably stick with 4bpp. Also, if you can afford the shader instructions, only store the R and G channels (put 0 in the blue channel) and recompute the blue channel with a bit of math. That should give better quality.
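The reconstruction is cheap because a tangent-space normal is unit length with a non-negative z, so z = sqrt(1 - x*x - y*y). A rough fragment-shader sketch (sampler and varying names are just for illustration):

    precision mediump float;

    uniform sampler2D u_normalMap;   // only x in R and y in G; B stored as 0
    varying vec2 v_texCoord;

    void main()
    {
        vec2 nxy = texture2D(u_normalMap, v_texCoord).rg * 2.0 - 1.0;
        float nz = sqrt(max(0.0, 1.0 - dot(nxy, nxy)));   // z is always >= 0 in tangent space
        vec3 normal = vec3(nxy, nz);

        // ...lighting math using normal...
        gl_FragColor = vec4(normal * 0.5 + 0.5, 1.0);     // visualize the reconstructed normal
    }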

