GCC / g++ output type

I know this is a very simple question, but when I compile my C/C++ code with gcc/g++, what exactly is the type of the intermediate output before the assembler comes into play to generate machine code? Is it something like x86 instructions?

+8
c++ c gcc
6 answers

The GCC processing chain is as follows:

  • your source code

  • preprocessed source code (macros and includes expanded, comments removed) ( -E , .i for C or .ii for C++ )

  • compiled to assembly ( -S , .s )

  • assembled to binary object code ( -c , .o )

  • linked into an executable file

For each step, I have listed the compiler flag that makes the process stop there, along with the corresponding file suffix.
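To make the chain concrete, here is a minimal sketch that pushes one file through each stage by hand. The file names (main.c, main.i, main.s, main.o, main) and the tiny program are just illustrative assumptions, and it presumes a gcc on your PATH.

```shell
# Create a tiny C source file (hypothetical example program).
cat > main.c <<'EOF'
#include <stdio.h>
#define GREETING "hello"   /* macro, expanded by the preprocessor */

int main(void) {
    puts(GREETING);
    return 0;
}
EOF

gcc -E main.c -o main.i    # stop after preprocessing: still C, but expanded
gcc -S main.i -o main.s    # stop after compilation: assembly text
gcc -c main.s -o main.o    # stop after assembling: binary object file
gcc main.o -o main         # link the object file into an executable
./main                     # prints: hello
```

Opening main.i and main.s in a text editor is a quick way to see what each stage actually hands to the next one.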

If you compile with -flto , the object files will contain GIMPLE bytecode, a low-level intermediate format whose purpose is to defer the actual final compilation to the link stage, which enables link-time optimization.

The "compilation" step is the part that does the real heavy lifting. The preprocessor is essentially a separate, independent tool (although its behavior is prescribed by the C and C++ standards), and the assembler and linker are separate, stand-alone tools that basically just implement, respectively, the hardware's binary instruction format and the operating system's loadable executable format.

+12

So, compiling an executable file in GCC consists of 4 parts:

1.) Preprocessing (gcc -E main.c > main.i; converts *.c to *.i). Expands includes and processes macros. Removes comments.

2.) Compilation (gcc -S main.i; converts *.i to *.s if successful). Compiles the C code to assembly (for an x86 target this is x86 assembly, for an x86_64 target it is x64 assembly, for an ARM target it is ARM assembly, etc.). Most warnings and errors are reported during this part.

3.) Assembly (as main.s -o main.o; converts *.s to *.o, again if successful). Assembles the generated assembly into machine code, although it still contains relative addresses of procedures, etc.

4.) Linking (gcc main.o). Replaces relative addresses with absolute addresses. Removes useless text. Linker errors and warnings appear at this point. In the end (if successful) we get an executable file.

So, to answer your question, the intermediate output that you have in mind is actually called assembly language; see the Wikipedia article on assembly language.

+4

Here's a graphical representation of the gcc compilation steps, courtesy of Red Hat:

[Image: gcc compilation steps]

Contrary to what the other answers say, there is no assembling step; rather, generating assembler code replaces generating object code. It doesn't make sense to convert the in-memory representation to a textual one if what you really want is a binary one.

+2

It should be assembly code. You can get it by passing the -S flag on the compile command line.

0

There is no intermediate output. The first output you get is machine code. (Although you can get intermediate C/C++ output by invoking only the preprocessor with -E .)

0

The GCC toolchain compiles a program from source to machine code. The compiler generates assembly code, which the assembler then translates into machine code. Here is a good primer for beginners.

0
