Can someone explain this assembler code

Question

This is shellcode for exploiting a buffer overflow vulnerability. It calls setuid(0) and then spawns a shell using execve().

xor %ebx,%ebx
lea 0x17(%ebx),%eax
int $0x80
push %ebx
push $0x68732f6e
push $0x69622f2f
mov %esp,%ebx
push %eax
push %ebx
mov %esp,%ecx
cltd
mov $0xb,%al
int $0x80

This is how I interpreted it:

1. XOR ebx with itself to set it to 0.
2. Add 23 to 0 and load the effective address into eax, for setuid().
3. Interrupt.
4. push ebx
5. push address // why this particular address?
6. push address // same question
12. mov the execve system call number into al.
13. Interrupt.

Can someone explain all the steps clearly?

assembly buffer-overflow exploit

Vinod k Nov 08 '10 at 18:05

1 answer

Thomas Pornin · Accepted Answer · 2010-11-09T14:23:39+0000

int is the opcode that triggers a software interrupt. Software interrupts are numbered (from 0 to 255) and are handled by the kernel. On Linux systems, interrupt 128 (0x80) is the conventional entry point for system calls. The kernel expects the system call arguments in registers; in particular, the %eax register identifies which system call is being requested.
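To make the register convention concrete, here is a minimal Python sketch that renders the call an int $0x80 would request from a given register state. The helper name describe_int80 and the two small tables are made up for illustration; only the two system call numbers that appear in this thread (23 and 11 on 32-bit x86 Linux) are listed.

```python
# Only the two i386 Linux system call numbers that appear in this thread.
SYSCALL_NAMES = {11: "execve", 23: "setuid"}

# On i386 Linux the first three arguments travel in these registers, in order.
ARG_REGISTERS = ("ebx", "ecx", "edx")


def describe_int80(regs: dict, nargs: int) -> str:
    """Render the call that `int $0x80` would request for these register values.

    Hypothetical helper for illustration only; it does not perform any system call.
    """
    name = SYSCALL_NAMES.get(regs["eax"], f"syscall_{regs['eax']}")
    args = ", ".join(hex(regs[r]) for r in ARG_REGISTERS[:nargs])
    return f"{name}({args})"


# State right before the first int $0x80: ebx = 0 and eax = 0x17 + ebx = 23.
ebx = 0
print(describe_int80({"eax": 0x17 + ebx, "ebx": ebx}, nargs=1))
```

The number of arguments is passed explicitly because the kernel, not the caller, decides how many registers a given system call reads.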

1. Set %ebx to 0.
2. Compute %ebx + 23 and store the result in %eax. (The opcode is lea, for "load effective address", but no memory access is involved; this is just a sneaky way to do an addition.)
3. System call. %eax contains 23, which identifies the setuid system call. That call takes one argument (the target UID), which is read from %ebx, and %ebx conveniently contains 0 at this point (it was set by the first instruction). Note: upon return, the registers are unchanged except for %eax, which holds the return value of the system call, normally 0 (if the call succeeded).
4. Push %ebx (which is still 0) onto the stack.
5. Push $0x68732f6e onto the stack.
6. Push $0x69622f2f onto the stack. Since the stack grows "down" and x86 processors are little-endian, the effect of instructions 4 through 6 is that %esp (the stack pointer) now points to a sequence of twelve bytes with values 2f 2f 62 69 6e 2f 73 68 00 00 00 00 (in hexadecimal). This is the encoding of the string "//bin/sh" (with a terminating zero, and three extra zeros afterwards).
7. Move %esp to %ebx. Now %ebx contains a pointer to the string "//bin/sh" built above.
8. Push %eax onto the stack (%eax is 0 at this point: it is the return status from setuid).
9. Push %ebx onto the stack (the pointer to "//bin/sh"). Instructions 8 and 9 build an array of two pointers on the stack, the first being the pointer to "//bin/sh" and the second a NULL pointer. This array is what the execve system call will use as its second argument.
10. Move %esp to %ecx. Now %ecx points to the array built by instructions 8 and 9.
11. Sign-extend %eax into %edx:%eax. cltd is the AT&T syntax for what the Intel documentation calls cdq. Since %eax is zero at this point, this sets %edx to zero as well.
12. Set %al (the least significant byte of %eax) to 11. Since %eax was zero, the whole of %eax is now 11.
13. System call. The value of %eax (11) identifies the system call as execve. execve expects three arguments, in %ebx (a pointer to a string naming the file to execute), %ecx (a pointer to an array of pointers to strings, which are the program arguments, the first being a copy of the program name, for use by the invoked program itself) and %edx (a pointer to an array of pointers to strings, which are the environment variables; Linux tolerates NULL here, meaning an empty environment), respectively.
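The byte layout produced by instructions 4 through 6 can be checked with a short Python sketch. The push helper below is a made-up model of a 32-bit x86 push (not a real API): it prepends each little-endian word, because every push lands at a lower address than the previous one.

```python
import struct


def push(stack: bytes, value: int) -> bytes:
    """Model a 32-bit x86 push: the new little-endian word lands at the lowest address."""
    return struct.pack("<I", value) + stack


stack = b""
stack = push(stack, 0)           # instruction 4: push %ebx (which is zero)
stack = push(stack, 0x68732F6E)  # instruction 5
stack = push(stack, 0x69622F2F)  # instruction 6

# Memory as seen from %esp upwards, then the C string that starts there.
print(stack.hex(" "))
print(stack.split(b"\x00", 1)[0].decode("ascii"))
```

This also answers why those two particular constants are pushed: they are not addresses, they are the eight characters of the path, four per push. The doubled slash pads the path to exactly eight bytes so that it fills two 32-bit words; Linux resolves "//bin/sh" the same way as "/bin/sh".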

So, the code first calls setuid(0), then calls execve("//bin/sh", x, 0), where x points to an array of two pointers, the first being a pointer to "//bin/sh" and the second NULL.
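For comparison, roughly the same two calls can be written at a high level through Python's os module. This is only a sketch: the function name spawn_root_shell is hypothetical, Python requires an environment mapping where the shellcode passes NULL, and the NULL terminator of the argument array is implicit in the Python list.

```python
import os

SHELL_PATH = "//bin/sh"    # the same string the shellcode builds on the stack
SHELL_ARGV = [SHELL_PATH]  # argv[0] only; Python adds the terminating NULL itself
SHELL_ENV = {}             # empty environment, standing in for the NULL in %edx


def spawn_root_shell() -> None:
    """Hypothetical helper mirroring the two system calls made by the shellcode."""
    os.setuid(0)  # setuid(0); raises PermissionError without sufficient privileges
    os.execve(SHELL_PATH, SHELL_ARGV, SHELL_ENV)  # replaces the process on success
```

The function is deliberately not called here, since a successful execve never returns to the caller.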

This code is somewhat convoluted because it wants to avoid zeros: when assembled into binary opcodes, the instruction sequence uses only non-zero bytes. For instance, if the 12th instruction had been movl $0xb,%eax (setting the whole of %eax to 11), then the binary encoding of that instruction would have contained three bytes of value 0. The absence of zero bytes makes the sequence usable as the contents of a null-terminated C string. This, of course, is meant for attacking buggy programs through buffer overflows.
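The effect of a zero byte can be seen by comparing the two encodings of the 12th instruction. In this Python sketch the helper c_string_prefix is a made-up stand-in for what a C string routine such as strcpy would copy, namely everything up to the first zero byte.

```python
MOV_AL = bytes.fromhex("b0 0b")            # mov $0xb,%al   (the form used in the shellcode)
MOV_EAX = bytes.fromhex("b8 0b 00 00 00")  # movl $0xb,%eax (the straightforward form)


def c_string_prefix(data: bytes) -> bytes:
    """Return the bytes a C string copy would keep: everything before the first zero."""
    return data.split(b"\x00", 1)[0]


for name, code in (("mov $0xb,%al", MOV_AL), ("movl $0xb,%eax", MOV_EAX)):
    kept = c_string_prefix(code)
    print(f"{name}: {len(code)} bytes, {code.count(0)} zero bytes, {len(kept)} bytes survive")
```

With the 32-bit immediate, the copy would stop in the middle of the instruction and everything after it would be lost, which is why the code clears %eax first and then writes only the low byte.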
