Endianness inside CPU registers

I need help understanding endianness inside the x86 processor's registers. I wrote this little assembly program:

    section .data
    section .bss

    section .text
        global _start
    _start:
        nop
        mov eax, 0x78FF5ABC
        mov ebx, 'WXYZ'
        nop                 ; GDB breakpoint here.
        mov eax, 1          ; syscall number 1 (exit)
        mov ebx, 0          ; exit status 0
        int 0x80            ; call the kernel

I ran this program in GDB with a breakpoint on line 10 (marked with a comment in the source above). At this breakpoint, info registers shows eax=0x78ff5abc and ebx=0x5a595857.

Since the ASCII codes for W, X, Y, Z are 0x57, 0x58, 0x59, 0x5A, respectively, and Intel is little-endian, 0x5a595857 seems to be the correct byte order (least significant byte first). Why then is the eax register not shown as 0xbc5aff78 (the bytes of 0x78ff5abc with the least significant byte first) instead of 0x78ff5abc?

+5
x86 cpu-registers endianness
Dec 21 '10 at 23:01
3 answers

Endianness only makes sense for memory, where each byte has a numeric address. When the MSByte of a value is placed at a higher memory address than the LSByte, the layout is called little-endian, and this is the case for every x86 processor.

For integers, the distinction between LSByte and MSByte is clear:

       0x12345678
   MSB---^^    ^^---LSB

For string literals, however, it is not defined! It is not obvious which part of 'WXYZ' should be considered the LSB or the MSB:

1) The most obvious way:

 'WXYZ' -> 0x5758595A

would put the bytes in memory in the order ZYXW.

2) The less obvious way, in which the order in memory matches the order of the characters in the literal:

 'WXYZ' -> 0x5A595857

The assembler has to pick one of them, and apparently it picks the second.
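The two candidate conventions can be compared with a short Python sketch. It only models the arithmetic and is not NASM itself; the function names `pack_first_char_high`, `pack_first_char_low` and `memory_image` are made up for this illustration.

```python
def pack_first_char_high(text: str) -> int:
    """Option 1: the first character becomes the most significant byte."""
    return int.from_bytes(text.encode("ascii"), "big")


def pack_first_char_low(text: str) -> int:
    """Option 2: the first character becomes the least significant byte."""
    return int.from_bytes(text.encode("ascii"), "little")


def memory_image(value: int, size: int = 4) -> bytes:
    """Bytes a little-endian CPU such as x86 would store, lowest address first."""
    return value.to_bytes(size, "little")


for pack in (pack_first_char_high, pack_first_char_low):
    value = pack("WXYZ")
    print(f"{pack.__name__}: {value:#010x} -> memory {memory_image(value)!r}")
```

Only the second option leaves the characters in memory in the order they were typed, which matches the ebx value reported in the question.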

+5
Dec 22 '10 at 2:45

Endianness is not meaningful inside a register, because endianness describes whether the byte order is from low to high memory address or from high to low memory address. Registers are not addressed by bytes, so there is no low or high address in the register. What you see is how your debugger prints data.
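As a rough illustration, a register can be modeled as a plain integer. In the Python sketch below (a model only; `format_like_debugger` is a hypothetical helper, not a GDB API) the hex string is ordinary positional notation with the most significant digit on the left, just as decimal numbers are written, and no addresses are involved.

```python
def format_like_debugger(value: int) -> str:
    """Render a 32-bit value as hex, most significant digit first."""
    return f"{value & 0xFFFFFFFF:#010x}"


eax = 0x78FF5ABC                      # the value loaded by `mov eax, 0x78FF5ABC`
print(format_like_debugger(eax))      # 0x78ff5abc
print(format_like_debugger(1))        # 0x00000001
```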

+14
Dec 21 '10 at 23:19

The assembler handles the two constants differently. Conceptually, the value in the EAX register is held in big-endian form, that is, written with the most significant byte first. You can see this by writing:

 mov eax, 1 

If you check the register, you will see that its value is 0x00000001 .

When you tell the assembler that you want a constant value of 0x78ff5abc , this is exactly what is stored in the register. The high 8 bits of EAX will contain 0x78 , and the AL register will contain 0xbc .
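The relationship between EAX and its named sub-registers can be sketched with shifts and masks. This is a Python model under the assumption that the register is simply a 32-bit integer; `split_eax` and the key `high8` are hypothetical names chosen for the example.

```python
def split_eax(eax: int) -> dict:
    """Return the pieces of a 32-bit EAX value, selected by bit position."""
    return {
        "AL": eax & 0xFF,               # bits 0-7
        "AH": (eax >> 8) & 0xFF,        # bits 8-15
        "AX": eax & 0xFFFF,             # bits 0-15
        "high8": (eax >> 24) & 0xFF,    # bits 24-31 (no register name of their own)
    }


for name, value in split_eax(0x78FF5ABC).items():
    print(f"{name} = {value:#x}")
```

The pieces are picked out by bit position rather than by address, which is why byte order never enters the picture here.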

Now, if you want to save the value from EAX into memory, it will be laid out in memory in the reverse order. That is, if you were to write:

 mov [addr],eax 

and then examine the memory at [addr], you will see 0xbc, 0x5a, 0xff, 0x78.
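A store of this kind can be imitated in Python with `struct`, using the `"<I"` format for a little-endian unsigned 32-bit value. This is a sketch of the byte layout only; `store_dword`, `load_dword` and the 8-byte `memory` buffer are assumptions made for the example.

```python
import struct


def store_dword(memory: bytearray, addr: int, value: int) -> None:
    """Model `mov [addr], eax`: least significant byte goes to the lowest address."""
    memory[addr:addr + 4] = struct.pack("<I", value)


def load_dword(memory: bytearray, addr: int) -> int:
    """Model `mov eax, [addr]`: reassemble the value from four bytes."""
    return struct.unpack_from("<I", memory, addr)[0]


memory = bytearray(8)
store_dword(memory, 0, 0x78FF5ABC)    # the numeric constant from the question
store_dword(memory, 4, 0x5A595857)    # the value the register held for 'WXYZ'

print(" ".join(f"{b:02x}" for b in memory[0:4]))   # bc 5a ff 78
print(bytes(memory[4:8]))                          # b'WXYZ'
```

Loading the four bytes back with `load_dword` returns the original values, so the reversal is a property of the memory layout and not of the register contents.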

In the case of "WXYZ", the assembler assumes that you want to load a value so that if you were to write it to memory, it would be laid out as 0x57, 0x58, 0x59, 0x5a.

Take a look at the bytes of code that the assembler generates and you will see the difference. In the case of mov eax,0x78ff5abc you will see:

 <opcodes for mov eax>, 0xbc, 0x5a, 0xff, 0x78 

In the case of mov eax,'WXYZ' you will see:

 <opcodes for mov eax>, 0x57, 0x58, 0x59, 0x5a 
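The two encodings can be reproduced with a small Python sketch. It assumes 32-bit mode, where `mov eax, imm32` is encoded as the opcode byte 0xB8 followed by the immediate in little-endian order; that opcode comes from the Intel instruction reference and is not stated in the answer above. `encode_mov_eax` is a hypothetical helper, not an assembler API.

```python
MOV_EAX_IMM32 = 0xB8  # assumed opcode byte for `mov eax, imm32` in 32-bit mode


def encode_mov_eax(imm32: int) -> bytes:
    """Opcode byte followed by the 32-bit immediate, least significant byte first."""
    return bytes([MOV_EAX_IMM32]) + imm32.to_bytes(4, "little")


numeric = encode_mov_eax(0x78FF5ABC)
# The string constant is packed so that 'W' ends up as the lowest byte.
text = encode_mov_eax(int.from_bytes(b"WXYZ", "little"))

print(numeric.hex())   # b8bc5aff78
print(text.hex())      # b85758595a
```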
+9
Dec 22