objcopy -O binary copies the contents of the source file. Here test.o is a "relocatable object file": that is, code, plus a symbol table and relocation information that allow the file to be linked with other files into an executable program. The test.bin file produced by objcopy contains only the code, with no symbol table and no relocation information. Such a "raw" file is useless for "normal" programming, but handy for code that has its own loader.
I assume that you are using Linux on an x86 32-bit system. Your test.o file is 515 bytes in size. If you try objdump -x test.o, you get the following description of the contents of the test.o object file:
    $ objdump -x test.o

    test.o:     file format elf32-i386
    test.o
    architecture: i386, flags 0x00000010:
    HAS_SYMS
    start address 0x00000000

    Sections:
    Idx Name          Size      VMA       LMA       File off  Algn
      0 .text         0000001e  00000000  00000000  00000034  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
      1 .data         00000000  00000000  00000000  00000054  2**2
                      CONTENTS, ALLOC, LOAD, DATA
      2 .bss          00000000  00000000  00000000  00000054  2**2
                      ALLOC
    SYMBOL TABLE:
    00000000 l    d  .text  00000000 .text
    00000000 l    d  .data  00000000 .data
    00000000 l    d  .bss   00000000 .bss
    0000000b l       .text  00000000 start
    00000005 l       .text  00000000 str
This gives you quite a bit of information. In particular, the file contains a section called .text, starting at offset 0x34 in the file (52 in decimal) and with a length of 0x1e bytes (30 in decimal). You can disassemble it to see the opcodes themselves:
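To make the offset and length concrete, here is a minimal sketch of pulling a byte range out of a file by hand. The helper name read_range is made up for this illustration, and it knows nothing about ELF: it simply trusts the offset and size that objdump reported.

```c
#include <stdio.h>
#include <stddef.h>

/* Read `len` bytes starting at byte `offset` of the file `path` into `out`.
 * Returns 0 on success, -1 if the file cannot be opened or is too short.
 * (Hypothetical helper, not part of binutils.) */
static int read_range(const char *path, long offset, size_t len,
                      unsigned char *out)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    int ok = fseek(f, offset, SEEK_SET) == 0
          && fread(out, 1, len, f) == len;
    fclose(f);
    return ok ? 0 : -1;
}
```

With the numbers above, read_range("test.o", 0x34, 0x1e, buf) should fill a 30-byte buffer with the contents of the .text section.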
    $ objdump -d test.o

    test.o:     file format elf32-i386

    Disassembly of section .text:

    00000000 <str-0x5>:
       0:   e8 06 00 00 00          call   b <start>

    00000005 <str>:
       5:   74 65                   je     6c <start+0x61>
       7:   73 74                   jae    7d <start+0x72>
       9:   0a 00                   or     (%eax),%al

    0000000b <start>:
       b:   b8 04 00 00 00          mov    $0x4,%eax
      10:   bb 01 00 00 00          mov    $0x1,%ebx
      15:   59                      pop    %ecx
      16:   ba 05 00 00 00          mov    $0x5,%edx
      1b:   cd 80                   int    $0x80
      1d:   c3                      ret
This is more or less the assembly you started with. The je, jae and or opcodes in the middle are spurious: this is objdump trying to interpret the literal string ("test\n", which yields the bytes 0x74 0x65 0x73 0x74 0x0a 0x00) as opcodes. objdump -d also shows the actual bytes found in the .text section, that is, the bytes in the file starting at offset 0x34. The first bytes are 0xe8 0x06 0x00 ...
Now view the test.bin file. It has a length of 30 bytes. Let's look at these bytes in hexadecimal format:
    $ hd test.bin
    00000000  e8 06 00 00 00 74 65 73  74 0a 00 b8 04 00 00 00  |.....test.......|
    00000010  bb 01 00 00 00 59 ba 05  00 00 00 cd 80 c3        |.....Y........|
Here we recognize exactly the 30 bytes of the .text section in test.o. This is what objcopy -O binary did: it extracted the contents of the file, that is, the only non-empty section, that is, the raw opcodes themselves, discarding everything else, in particular the symbol table and the relocation information.
Relocation is about what must be changed in a given piece of code so that it works correctly when placed at a specific location in memory. For example, if the code uses a variable and wants the address of that variable, the relocation information will contain an entry telling whoever actually places the code in memory (usually the linker): "here, in the code, once you know where the variable will really be, write the variable's address." Interestingly, the code you show needs no relocation: the byte sequence can be written at an arbitrary memory location and executed as is.
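One reason this particular fragment needs no relocation is that its only branch, the call at offset 0, is encoded with a displacement relative to the next instruction rather than with an absolute address. The sketch below decodes such a displacement; call_target is a name invented for this illustration, and it handles only the 0xe8 (call rel32) encoding seen in the dump.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode an x86 `call rel32` (opcode 0xe8) found at code[at] and compute
 * the address it jumps to, assuming code[0] is loaded at address `base`.
 * Returns 0 on success, -1 if the bytes at `at` are not a complete
 * `call rel32`. (Hypothetical helper for illustration only.) */
static int call_target(const unsigned char *code, size_t len, size_t at,
                       uint32_t base, uint32_t *target)
{
    if (at + 5 > len || code[at] != 0xe8)
        return -1;
    /* The 32-bit displacement is stored little-endian after the opcode. */
    uint32_t disp = (uint32_t)code[at + 1]
                  | (uint32_t)code[at + 2] << 8
                  | (uint32_t)code[at + 3] << 16
                  | (uint32_t)code[at + 4] << 24;
    /* The displacement counts from the instruction following the call. */
    *target = base + (uint32_t)at + 5 + disp;
    return 0;
}
```

For the bytes e8 06 00 00 00 the result is always base + 0x0b, whatever base is, so the call still reaches the same instruction when the code is moved.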
Let's see what the code does.
- The call opcode jumps to the mov instruction at offset 0x0b. Also, since it is a call, it pushes the return address on the stack. The return address is where execution should resume once the call is over, that is, when a ret opcode is reached. This is the address of the byte right after the call opcode. Here, that address is the address of the first byte of the string literal "test\n".
- The first two movl load %eax and %ebx with the numeric values 4 and 1, respectively.
- The pop opcode removes the top element from the stack and stores it in %ecx. What is this top element? It is exactly the address pushed on the stack by the call opcode, i.e. the address of the first byte of the string.
- The third movl loads %edx with the numeric value 5.
- int $0x80 is the system call on 32-bit x86 Linux: it invokes the kernel. The kernel looks at the registers to know what to do. It first looks at %eax to get the "system call number"; on 32-bit x86, "4" is __NR_write, that is, the write() system call. This call expects three parameters, in %ebx, %ecx and %edx, in that order. These are the destination file descriptor (here 1: standard output), a pointer to the data to write (here the literal string) and the length of the data to write (here 5, which corresponds to the four letters and the newline character). Thus, it writes "test\n" to standard output.
- The final ret returns to the caller. ret pops a value from the stack and jumps to that address. This assumes that this piece of code was itself invoked with a call opcode.
So, to summarize, the code prints test followed by a newline.
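For comparison, here is roughly the same effect written in C. This is only a sketch: it goes through the libc write() wrapper instead of issuing int $0x80 directly, and the function name emit_test and its fd parameter are made up for the example (the assembly hard-codes descriptor 1).

```c
#include <unistd.h>

/* Write the five bytes "test\n" to file descriptor `fd`, mirroring the
 * register setup in the assembly: descriptor, pointer, length.
 * Returns the value returned by write(). */
static ssize_t emit_test(int fd)
{
    static const char msg[] = "test\n";
    return write(fd, msg, 5);   /* 5 = four letters plus the newline */
}
```

Calling emit_test(1) corresponds to what the raw code asks the kernel to do.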
Try executing it with a custom loader:
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/mman.h>

    int
    main(void)
    {
        void *p;
        int f;

        p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        f = open("test.bin", O_RDONLY);
        read(f, p, 30);
        close(f);
        mprotect(p, 30, PROT_READ | PROT_EXEC);
        ((void (*)(void))p)();
        return 0;
    }
(The above code does not check the returned values for errors, which is very bad, of course.)
Here I allocate one page of memory (4096 bytes) with mmap(), asking for a page that I can read and write. p points to that chunk. Then, with open(), read() and close(), I read the contents of the test.bin file (30 bytes) into that chunk.
The mprotect() call tells the kernel to change the permissions on my page: now I want to be able to execute these bytes, that is, to treat them as machine code. I give up the right to write to the page (depending on the kernel configuration, a page that is both writable and executable may be forbidden).
The mysterious ((void (*)(void))p)(); reads like this: I take p; I cast it to a pointer to a function that takes no arguments and returns nothing; I call that function. This is the C syntax for making a call into my chunk of data.
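The cast is easier to read when it is split in two steps with a typedef. The sketch below does that with an ordinary C function standing in for the loaded bytes; entry_fn, sample_entry, calls_seen and call_through are all names invented for the example. Converting between void * and a function pointer is not guaranteed by ISO C, but it works on POSIX systems such as Linux.

```c
/* A pointer to a function that takes no arguments and returns nothing. */
typedef void (*entry_fn)(void);

static int calls_seen = 0;

/* Stand-in for the loaded machine code: it just counts its invocations. */
static void sample_entry(void)
{
    calls_seen++;
}

/* Equivalent to ((void (*)(void))p)(); written as two separate steps. */
static void call_through(void *p)
{
    entry_fn fn = (entry_fn)p;  /* reinterpret the data pointer as code */
    fn();                       /* jump to it with a normal call */
}
```

For instance, call_through((void *)sample_entry) runs sample_entry once.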
When I run this program, I get:
    $ ./blah
    test
as expected: the code in test.bin outputs test to standard output.