What is a simple example of replacing C code with assembly for better performance?

I heard that game developers sometimes replace parts of inner loops with assembly code to improve performance.

What is a simple example?

Where does the assembly go? Just inline with the C code?

Thanks!

Edit: Sample code is greatly appreciated.

+4
8 answers

I'm not a game developer, but I write almost nothing but assembly code for a living (I'm a library writer). This is usually for performance reasons, but I also do it to work around compiler bugs or to use hardware features such as condition flags, which are actually easier to express in assembly than in C.

I usually write complete functions in assembly, so I prefer to write .s files that are assembled into object files and linked into an executable or library. People who just need to move one loop into assembly often use inline assembly in their C source, which most compilers support through some extension of their own.

"Simple" examples are quite rare, because if the problem were simple, the compiler would do a good enough job and assembly would not be required.
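To show where inline assembly sits in a C source file, here is a minimal sketch. It assumes GCC or Clang extended asm syntax on x86; the function name `add_ints` is made up for illustration, and other compilers or targets fall back to plain C. Any compiler would emit this single add by itself, so treat it as a picture of the mechanics, not a speedup.

```c
#include <stdio.h>

/* Hypothetical helper: returns a + b.
 * On x86 with GCC/Clang the addition is done by one inline "addl";
 * everywhere else it is ordinary C. */
static int add_ints(int a, int b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int result = a;
    __asm__ ("addl %1, %0"      /* AT&T syntax: result += b        */
             : "+r"(result)     /* %0: read and written, any reg   */
             : "r"(b)           /* %1: input, any register         */
             : "cc");           /* the add changes condition flags */
    return result;
#else
    return a + b;               /* portable fallback */
#endif
}
```

Calling `add_ints(2, 3)` from ordinary C code works like any other function call; the constraint strings are what tell the compiler which registers the assembly reads and writes.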

+7

Here are the advantages of assembly:

  • Assembly code can take advantage of a processor's unique instructions, as well as various specialized hardware resources. C code, on the other hand, is generic and must support various hardware platforms. Thus, it is difficult for C to support platform-specific code.

  • The assembly programmer is usually familiar with the application and can make assumptions that are not available to the compiler.

  • The assembly programmer can use human creativity; the compiler, however, is just an automatic program.

On the other hand, here are the drawbacks of the assembly:

  • The assembly programmer must handle time-consuming machine-level problems, such as register allocation and instruction scheduling. With C code, these problems are solved by the compiler.

  • Assembly coding requires specialized knowledge of the DSP architecture and its instruction set, while C coding requires only knowledge of the C language, which is quite common.

  • With assembly, it is very difficult and time-consuming to port applications from one platform to another. Porting is relatively easy for C applications.

from here

+4

Here is a simple(ish) example: the swap code for Watt-32.

In this case, __asm is used to embed assembly code inline with the C/C++ code for performance. Since this is an alternative cross-platform networking stack, there are many places where it is critical to keep performance as high as possible.
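The Watt-32 source itself is not reproduced here. As a rough sketch of the kind of thing a network stack might do, the snippet below reverses the byte order of a 32-bit value with the x86 `bswap` instruction. It assumes the swap in question is a byte-order swap (my assumption, not confirmed by the answer), uses GCC/Clang extended asm, and the name `swap32` is invented for the example.

```c
#include <stdint.h>

/* Hypothetical helper: reverses the four bytes of x,
 * e.g. for converting between network and host byte order. */
static uint32_t swap32(uint32_t x)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ ("bswap %0"     /* reverse byte order in place */
             : "+r"(x));    /* %0: read and written        */
    return x;
#else
    /* Portable fallback: move each byte to its mirrored position. */
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) <<  8) |
           ((x & 0x00FF0000u) >>  8) |
           ((x & 0xFF000000u) >> 24);
#endif
}
```

Both branches give the same result, which makes it easy to check the assembly path against the C one.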

+2

I gave up on assembly coding a few years ago when I discovered that an optimizing C++ compiler would beat me on performance. The people who write the optimizer take into account all kinds of things, such as pipelines and partially parallel execution of subsequent, independent pieces of code (a good optimizer may rearrange your code a fair bit). The drawbacks of assembly code (hard to read, hard to debug, and not portable) far outweigh the advantages it had in the days when compilers didn't have good optimizers.

If I were you, I would not bother with assembly coding for ordinary programming tasks. Even if you can save a few processor clock cycles here or there, the effect on the overall performance of a complex application is negligible.

+2

The syntax will depend on your compiler; I use gcc, which supports several different ways to embed assembler code.

See this link for descriptions and examples: http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#s4

+1

Most modern PC, Xbox 360, or PS3 games contain very little inline assembly. Modern optimizing compilers do a pretty good job of instruction scheduling and register allocation, so the performance gain from writing inline assembly is rarely worth the effort. Inline assembly is not even supported for 64-bit code in Visual Studio.

Inline assembly matters for accessing hardware-specific instructions that the compiler will not use automatically. With modern compilers, intrinsics are the preferred way to access hardware-specific instructions. In games, intrinsics are often used in math-heavy code to access vector math instructions (SSE on PC, VMX on the Xbox 360 / PS3 PPU, or the SPU instruction set on the PS3 SPUs). Intrinsics are platform/compiler extensions that look like ordinary C/C++ functions but map directly to individual instructions on the underlying hardware. Documentation for the x86 and x64 intrinsics in Visual Studio can be found on MSDN.

In some games you can still find a few really performance-critical pieces of code written in assembly, but in general entire functions will be written in assembly, rather than as bits of inline assembly in C/C++ code. I have not seen any inline assembly in any of the PC / Xbox 360 / PS3 game code I have worked on in the last 5 years or so.
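For a feel of what intrinsics look like, here is a small sketch that adds two groups of four floats with SSE. The intrinsics `_mm_loadu_ps`, `_mm_add_ps`, and `_mm_storeu_ps` come from `<xmmintrin.h>`; the function name `add4` and the `HAVE_SSE` macro are made up for this example, and a scalar loop is used when SSE is not available.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>      /* SSE intrinsics */
#define HAVE_SSE 1          /* hypothetical feature macro for this sketch */
#else
#define HAVE_SSE 0
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for i in 0..3. */
static void add4(const float *a, const float *b, float *out)
{
#if HAVE_SSE
    __m128 va = _mm_loadu_ps(a);             /* load 4 floats (unaligned) */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));  /* one vector add, then store */
#else
    for (size_t i = 0; i < 4; ++i)           /* scalar fallback */
        out[i] = a[i] + b[i];
#endif
}
```

The intrinsic calls read like normal function calls, but the compiler still handles register allocation and scheduling around them, which is much of the appeal over hand-written inline assembly.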

+1

Michael Abrash wrote a book called the Graphics Programming Black Book. It is definitely worth a read. You can get the PDF files for free online here.

Michael Abrash's classic Graphics Programming Black Book is a compilation of Michael's previous writings on assembly language and graphics programming (including from his "Graphics Programming" column in Dr. Dobb's Journal). Much of the book focuses on profiling and testing code, as well as performance optimization. It also covers much of the 3D technology behind the Doom and Quake games, along with 3D graphics problems such as texture mapping, hidden surface removal, etc. Thanks to Michael for making this book available.

+1

Even if you use a compiler extension like __asm for inline assembler code, there is overhead. The compiler will first save all your registers (or only those you use in your inline code, depending on how the compiler decides to optimize), then insert your code, and then restore those registers. I feel that unless there is a significant benefit to embedding assembly code, it should be avoided, given the trade-offs among maintainability, portability, correctness, readability, and performance.

+1
