Dereferencing a label in x86 assembly

Consider this x86 assembler code:

    section .data
    foo:
        mov ebx, [boo]
        mov [goo], ebx
    goo:
        mov eax, 2
        mov eax, 3
        ret
    boo:
        mov eax, 4
        mov eax, 5
        ret

What exactly is going on here? When I dereference [boo] and mov it to [goo], what exactly am I moving there? Just one instruction? The ret as well?


Follow-up questions:

  • Does dereferencing a label give me an address? Or the machine code for the first instruction at the label?
  • If it is machine code, how can it be more than one instruction? Aren't all instructions essentially 32 bits wide (even if not all the bits are used)?
  • Bottom line - will eax have a value of 3 or 5 at the end?
+6
assembly x86
3 answers

boo is the offset of the mov eax, 4 instruction inside the .data section. mov ebx, [boo] means "fetch four bytes from the offset given by boo into ebx". Similarly, mov [goo], ebx stores the contents of ebx at the offset given by goo.

However, code is often read-only, so it would not be surprising to see this code simply crash.

Here is how the instructions at boo are encoded:

    boo:
      b8 04 00 00 00    mov eax,0x4
      b8 05 00 00 00    mov eax,0x5
      c3                ret

So what you get in ebx is actually 4/5 of the mov eax, 4 instruction.
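To make the load and store concrete, here is a minimal Python sketch that models the goo and boo blocks from the question as a flat byte image and replays the two mov instructions on it. The offsets (goo at 0, boo right after it) and the helper name mov_eax are assumptions made for illustration; the only encodings relied on are B8 + imm32 for mov eax, imm32 and C3 for ret.

```python
import struct

# "mov eax, imm32" is opcode B8 followed by a 4-byte little-endian immediate;
# "ret" is the single byte C3.
def mov_eax(imm):
    return b"\xb8" + struct.pack("<I", imm)

RET = b"\xc3"

# Toy byte image: the goo block immediately followed by the boo block.
goo_block = mov_eax(2) + mov_eax(3) + RET
boo_block = mov_eax(4) + mov_eax(5) + RET
memory = bytearray(goo_block + boo_block)

goo = 0                 # assumed offset of goo in this toy image
boo = len(goo_block)    # boo starts right after the goo block

# mov ebx, [boo]: load 4 bytes starting at boo
(ebx,) = struct.unpack_from("<I", memory, boo)

# mov [goo], ebx: store those 4 bytes starting at goo
struct.pack_into("<I", memory, goo, ebx)

print(hex(ebx))
print(" ".join(f"{b:02x}" for b in memory[goo:goo + 5]))
```

In this model ebx holds the bytes b8 04 00 00 read as one little-endian dword, and the first instruction at goo becomes mov eax, 4 because its fifth byte was already 00.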

+9

The first mov reads from offset boo relative to the [e]DS segment register. The second mov writes to offset goo, again relative to DS. If CS and DS match, the segment distinction can be ignored. Assuming CS and DS are the same, you are likely to run into the various protection mechanisms that make code sections read-only.

RE followups:

  • A label is not like a reference - you are not dereferencing it as such. The assembler substitutes a number representing the location in the resulting code. You can load either the address or the thing at the address. The [ and ] indicate dereferencing - I fixed a confused element in my first answer to cover this. IOW, [goo] loads the thing at that address.
  • A CISC instruction set such as x86 has [very] variable-length instructions - some are not even a multiple of the word length. RISCs usually try to use fixed-length instructions to simplify instruction decoding.
  • 3 - you only modify the first 4 bytes of mov eax, 2 (which, because the immediate is stored little-endian, turns the 2 into a 4), but eax is then overwritten by the next instruction, which was not changed at all - 5 is never in the picture as a candidate. (I thought you believed the code was being reordered, judging by the way you first asked the question [1], although you obviously know a little more than that, as I should have guessed from your rep :P)
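The last two points can be checked with a small sketch: a toy interpreter that understands only the two variable-length encodings that appear in the question (5-byte mov eax, imm32 and 1-byte ret) and runs the goo block as it looks after the 4-byte store. The function name run is hypothetical, and the model assumes the store actually succeeded (CS = DS, no write protection).

```python
import struct

def run(code, start=0):
    """Execute bytes containing only 'mov eax, imm32' (B8 id) and 'ret' (C3)."""
    eax = None
    pc = start
    while True:
        opcode = code[pc]
        if opcode == 0xB8:      # mov eax, imm32: 1 opcode byte + 4 immediate bytes
            (eax,) = struct.unpack_from("<I", code, pc + 1)
            pc += 5
        elif opcode == 0xC3:    # ret: 1 byte, ends the run
            return eax
        else:
            raise ValueError(f"unsupported opcode {opcode:#04x} at offset {pc}")

# goo block after the store: first instruction patched to mov eax, 4,
# followed by the untouched mov eax, 3 and ret.
patched_goo = bytes.fromhex("b804000000" "b803000000" "c3")

print(run(patched_goo))
```

The interpreter steps 5 bytes, then 5 bytes, then stops at the 1-byte ret, so the value left in eax comes from the unmodified second instruction.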

Note that all of this assumes that CS = DS and that DEP is not enabled.

Also, if you used BX instead of EBX, only 2 bytes would be copied, which is closer to what you were expecting (using xX instead of ExX accesses the low 2 bytes of the register [and xL accesses the low byte]).
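As a rough illustration of how the register width changes what gets copied, the sketch below repeats the copy from boo to goo with 1, 2 and 4 bytes, standing in for BL, BX and EBX. The layout (goo at offset 0, boo at offset 11) and the helper copy_via_register are assumptions for this toy model, not part of the original code.

```python
# goo block followed by boo block, as encoded from the listing in the question.
ORIGINAL = bytes.fromhex(
    "b802000000" "b803000000" "c3"   # goo: mov eax, 2 / mov eax, 3 / ret
    "b804000000" "b805000000" "c3"   # boo: mov eax, 4 / mov eax, 5 / ret
)
GOO, BOO = 0, 11   # assumed offsets inside this toy image

def copy_via_register(width):
    """Copy `width` bytes from boo to goo; return the first 5 bytes at goo."""
    memory = bytearray(ORIGINAL)
    memory[GOO:GOO + width] = memory[BOO:BOO + width]
    return bytes(memory[GOO:GOO + 5])

for name, width in (("bl", 1), ("bx", 2), ("ebx", 4)):
    first_instruction = copy_via_register(width)
    print(name, " ".join(f"{b:02x}" for b in first_instruction))
```

In this model a 1-byte copy only rewrites the B8 opcode with itself, while the 2-byte and 4-byte copies both change the low byte of the immediate from 02 to 04.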

[1] Remember, an assembler is just a tool for writing opcodes - things like labels all boil down to numbers, with very little magic or impressive code transformation - there are no closures or anything of such a deep nature hiding in there. (This simplifies things a bit - code can be relocatable, and in many cases fixups are applied to the offsets by a combination of linker and loader.)
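The "labels boil down to numbers" point can be sketched as the bookkeeping an assembler does on its first pass: add up instruction lengths and record the running offset at each label. The lengths below assume 32-bit mode (6 bytes for a mov between a register and an absolute 32-bit address, 5 for mov eax, imm32, 1 for ret) and a section that starts at offset 0 with no relocation applied; the table itself is written by hand for illustration.

```python
# (label or None, instruction text, encoded length in bytes)
program = [
    ("foo", "mov ebx, [boo]", 6),   # 8B 1D + 4-byte address
    (None,  "mov [goo], ebx", 6),   # 89 1D + 4-byte address
    ("goo", "mov eax, 2",     5),   # B8 + 4-byte immediate
    (None,  "mov eax, 3",     5),
    (None,  "ret",            1),   # C3
    ("boo", "mov eax, 4",     5),
    (None,  "mov eax, 5",     5),
    (None,  "ret",            1),
]

labels = {}
offset = 0
for label, _text, length in program:
    if label is not None:
        labels[label] = offset   # a label is just the current offset
    offset += length

print(labels)
print("total size:", offset)
```

Once these numbers are known, [boo] in the source is simply "the bytes at that offset", which is all the two mov instructions in foo operate on.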

+3

Answers to the follow-ups:

  • It gives you the machine code starting at that address. How much you get depends on the size of your load; in this case it is 4 bytes.

  • It can be several instructions or just a fragment of one. On this architecture (Intel x86), machine code instructions are between 8 and 120 bits long.

  • 3.
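To see how the size of the load decides whether you pick up a fragment or several instructions, here is a small sketch over the boo block from the question, whose instructions are 5, 5 and 1 bytes long. The function describe is a hypothetical helper, and the wider loads (8 and 11 bytes) are included only to show the boundary effect, not because the original code performs them.

```python
# Encoded lengths of the instructions at boo: mov eax, 4 / mov eax, 5 / ret
INSTRUCTION_LENGTHS = [5, 5, 1]

def describe(width):
    """Return (complete instructions, leftover fragment bytes) for a load of `width` bytes."""
    complete = 0
    remaining = width
    for length in INSTRUCTION_LENGTHS:
        if remaining >= length:
            complete += 1
            remaining -= length
        else:
            break
    return complete, remaining

for width in (1, 2, 4, 8, 11):
    complete, fragment = describe(width)
    print(f"{width:2d}-byte load: {complete} complete instruction(s), "
          f"{fragment} byte(s) of the next one")
```

For the 4-byte load in the question this reports no complete instruction and a 4-byte fragment, matching the "4/5 of an instruction" observation.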

+2