Convert assembly to pseudo code

I am working on a homework project involving a "bomb" program that I have to reverse engineer to come up with 5 lines of input that disarm each of the bomb's five "phases". I am stuck on the third phase right now, trying to translate the assembly (x86, AT&T syntax, I believe) that gdb produces for this function. What I have worked out so far is that it takes a string of six numbers as user input and checks them against some criteria, but this is where I lose it. The function is as follows (with my attempted pseudocode translation next to it).

0x08048816 <phase_3+0>:    push %ebp
0x08048817 <phase_3+1>:    mov %esp,%ebp
0x08048819 <phase_3+3>:    push %edi
0x0804881a <phase_3+4>:    push %ebx
0x0804881b <phase_3+5>:    sub $0x30,%esp
0x0804881e <phase_3+8>:    lea -0x24(%ebp),%eax
0x08048821 <phase_3+11>:   mov %eax,0x4(%esp)
0x08048825 <phase_3+15>:   mov 0x8(%ebp),%eax
0x08048828 <phase_3+18>:   mov %eax,(%esp)
0x0804882b <phase_3+21>:   call 0x8048d2c <read_six_numbers>
0x08048830 <phase_3+26>:   mov -0x24(%ebp),%eax            eax = p1
0x08048833 <phase_3+29>:   cmp $0x1,%eax                   if eax != 1
0x08048836 <phase_3+32>:   je 0x804883d <phase_3+39>       explode bomb
0x08048838 <phase_3+34>:   call 0x8048fec <explode_bomb>   else
0x0804883d <phase_3+39>:   movl $0x1,-0xc(%ebp)            ebp[-12] = 1
0x08048844 <phase_3+46>:   jmp 0x804888a <phase_3+116>     while ebp[-12] < 5 {
0x08048846 <phase_3+48>:   mov -0xc(%ebp),%eax             eax = ebp[-12]
0x08048849 <phase_3+51>:   mov -0x24(%ebp,%eax,4),%eax     {magic}
0x0804884d <phase_3+55>:   mov %eax,%ebx                   ebx = eax
0x0804884f <phase_3+57>:   mov -0xc(%ebp),%eax             eax = ebp[-12]
0x08048852 <phase_3+60>:   sub $0x1,%eax                   eax -= 1
0x08048855 <phase_3+63>:   mov -0x24(%ebp,%eax,4),%eax     {magic}
0x08048859 <phase_3+67>:   mov %eax,%edx                   edx = eax
0x0804885b <phase_3+69>:   mov 0x804a6d8,%eax              eax = 0x804a6d8
0x08048860 <phase_3+74>:   mov $0xffffffff,%ecx            ecx = 255
0x08048865 <phase_3+79>:   mov %eax,-0x2c(%ebp)            ebp[-12] = eax
0x08048868 <phase_3+82>:   mov $0x0,%eax                   eax = 0
0x0804886d <phase_3+87>:   cld
0x0804886e <phase_3+88>:   mov -0x2c(%ebp),%edi            edi = ebp[-12]
0x08048871 <phase_3+91>:   repnz scas %es:(%edi),%al       {deep magic}
0x08048873 <phase_3+93>:   mov %ecx,%eax                   eax = ecx
0x08048875 <phase_3+95>:   not %eax                        eax = -eax
0x08048877 <phase_3+97>:   sub $0x1,%eax                   eax -= 1
0x0804887a <phase_3+100>:  imul %edx,%eax                  eax *= edx
0x0804887d <phase_3+103>:  cmp %eax,%ebx                   if (eax != ebx)
0x0804887f <phase_3+105>:  je 0x8048886 <phase_3+112>      explode_bomb
0x08048881 <phase_3+107>:  call 0x8048fec <explode_bomb>   else
0x08048886 <phase_3+112>:  addl $0x1,-0xc(%ebp)            ebp[-12] += 1
0x0804888a <phase_3+116>:  cmpl $0x5,-0xc(%ebp)
0x0804888e <phase_3+120>:  jle 0x8048846 <phase_3+48>      }
0x08048890 <phase_3+122>:  add $0x30,%esp
0x08048893 <phase_3+125>:  pop %ebx
0x08048894 <phase_3+126>:  pop %edi
0x08048895 <phase_3+127>:  pop %ebp
0x08048896 <phase_3+128>:  ret

I am a little (though not very) sure of most of this. The lines I am absolutely sure I have wrong are the three currently marked as "magic": phase_3+51, phase_3+63 and phase_3+91 (the two mov lines with the strange syntax, and the repnz). I have not seen that syntax before, and I cannot work out which search terms would find it.

Any general (and/or caustic) criticism of my attempt? Any obvious places where I'm going off the rails? Since this is homework, I don't need someone to give me the answer; I just want to know whether my interpretation is sound (and what the three lines that puzzle me mean).

Thanks so much for any help!

*** EDIT ***

The read_six_numbers function disassembles as follows:

 0x08048d2c <read_six_numbers+0>:    push %ebp
 0x08048d2d <read_six_numbers+1>:    mov %esp,%ebp
 0x08048d2f <read_six_numbers+3>:    push %esi
 0x08048d30 <read_six_numbers+4>:    push %ebx
 0x08048d31 <read_six_numbers+5>:    sub $0x30,%esp
 0x08048d34 <read_six_numbers+8>:    mov 0xc(%ebp),%eax
 0x08048d37 <read_six_numbers+11>:   add $0x14,%eax
 0x08048d3a <read_six_numbers+14>:   mov 0xc(%ebp),%edx
 0x08048d3d <read_six_numbers+17>:   add $0x10,%edx
 0x08048d40 <read_six_numbers+20>:   mov 0xc(%ebp),%ecx
 0x08048d43 <read_six_numbers+23>:   add $0xc,%ecx
 0x08048d46 <read_six_numbers+26>:   mov 0xc(%ebp),%ebx
 0x08048d49 <read_six_numbers+29>:   add $0x8,%ebx
 0x08048d4c <read_six_numbers+32>:   mov 0xc(%ebp),%esi
 0x08048d4f <read_six_numbers+35>:   add $0x4,%esi
 0x08048d52 <read_six_numbers+38>:   mov %eax,0x1c(%esp)
 0x08048d56 <read_six_numbers+42>:   mov %edx,0x18(%esp)
 0x08048d5a <read_six_numbers+46>:   mov %ecx,0x14(%esp)
 0x08048d5e <read_six_numbers+50>:   mov %ebx,0x10(%esp)
 0x08048d62 <read_six_numbers+54>:   mov %esi,0xc(%esp)
 0x08048d66 <read_six_numbers+58>:   mov 0xc(%ebp),%eax
 0x08048d69 <read_six_numbers+61>:   mov %eax,0x8(%esp)
 0x08048d6d <read_six_numbers+65>:   movl $0x804965d,0x4(%esp)
 0x08048d75 <read_six_numbers+73>:   mov 0x8(%ebp),%eax
 0x08048d78 <read_six_numbers+76>:   mov %eax,(%esp)
 0x08048d7b <read_six_numbers+79>:   call 0x80485a4 <sscanf@plt>
 0x08048d80 <read_six_numbers+84>:   mov %eax,-0xc(%ebp)
 0x08048d83 <read_six_numbers+87>:   cmpl $0x5,-0xc(%ebp)
 0x08048d87 <read_six_numbers+91>:   jg 0x8048d8e <read_six_numbers+98>
 0x08048d89 <read_six_numbers+93>:   call 0x8048fec <explode_bomb>
 0x08048d8e <read_six_numbers+98>:   add $0x30,%esp
 0x08048d91 <read_six_numbers+101>:  pop %ebx
 0x08048d92 <read_six_numbers+102>:  pop %esi
 0x08048d93 <read_six_numbers+103>:  pop %ebp
 0x08048d94 <read_six_numbers+104>:  ret

1 answer
 mov -0x24(%ebp,%eax,4),%eax 

The instruction above accesses an array element. In x86 this is called SIB addressing, for scale, index, base. There is also a displacement (offset) component. The array starts at the address held in the base register (EBP here) plus the displacement (-0x24 here; when a frame pointer is used, local variables, including arrays, are addressed as offsets from the frame pointer). The element index is in the index register (EAX here), and the size of each element is given by the scale (4 here). The effective address is therefore EBP - 0x24 + EAX*4.
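
As a rough illustration of that address calculation, here is a small C sketch. The names load_sib and load_indexed are hypothetical helpers invented for this example, the "frame" is just a byte buffer standing in for the stack, and the bound of six elements is an assumption taken from read_six_numbers; none of this comes from the bomb itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of:  mov -0x24(%ebp,%eax,4),%eax
 * ebp   : pointer into a simulated stack frame (plays the role of EBP)
 * index : plays the role of EAX before the load */
int32_t load_sib(const unsigned char *ebp, uint32_t index)
{
    assert(index < 6); /* assumption: the array holds six numbers */

    /* effective address = base + displacement + index * scale */
    const unsigned char *addr = ebp - 0x24 + (size_t)index * 4;

    int32_t value;
    memcpy(&value, addr, sizeof value); /* 4-byte load, alignment-safe */
    return value;
}

/* The same access written as ordinary C array indexing. */
int32_t load_indexed(const int32_t *numbers, uint32_t index)
{
    return numbers[index];
}
```

If the six numbers are stored starting at ebp - 0x24, both functions return the same element for the same index, which is why the instruction reads naturally as numbers[i] in pseudocode.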

 mov 0x804a6d8,%eax
 mov $0xffffffff,%ecx
 mov %eax,-0x2c(%ebp)
 mov $0x0,%eax
 cld
 mov -0x2c(%ebp),%edi
 repnz scas %es:(%edi),%al
 mov %ecx,%eax
 not %eax
 sub $0x1,%eax

This is just strlen() applied to the string whose address is loaded from 0x804a6d8 (note that the mov has no $, so it reads the pointer stored at that address rather than using the constant itself). ES:EDI points to the string to scan. AL holds the byte to scan for: 0, the ASCII NUL. cld sets the scan direction to ascending (std would make it descending). ECX is initialized to ~0 = -1, i.e. all bits set. repnz repeats the scas instruction, decrementing ECX each time, while ECX is not zero (which will not happen here, because ECX starts out large enough) and the scan has not yet matched (NZ: the comparison between the string byte and AL did not set the zero flag). Afterwards, ECX contains -1 - (steps in the scan). NOT turns that into (steps in the scan). SUB turns that into (steps in the scan) - 1 = (length of the string, not including the terminating NUL). This is also explained at http://www.int80h.org/strlen/ .
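
To make the register bookkeeping concrete, the sequence can be emulated step by step in C. This is only a sketch: strlen_scas is a made-up name, the variables merely mirror the registers, and it assumes the argument is a valid NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical emulation of the cld / repnz scas / not / sub sequence. */
uint32_t strlen_scas(const char *s)
{
    assert(s != NULL);

    const char *edi = s;          /* mov -0x2c(%ebp),%edi */
    uint32_t ecx = 0xffffffffu;   /* mov $0xffffffff,%ecx  (that is -1) */
    const char al = 0;            /* mov $0x0,%eax: scan for NUL */

    /* repnz scas: repeat while ECX != 0 and the last compare did not match */
    while (ecx != 0) {
        int zf = (*edi == al);    /* scas compares AL with the byte at EDI */
        edi++;                    /* cld: EDI moves upward */
        ecx--;                    /* each repetition decrements ECX */
        if (zf)
            break;                /* zero flag set: repnz stops */
    }

    uint32_t eax = ecx;           /* mov %ecx,%eax: -1 - steps */
    eax = ~eax;                   /* not %eax: steps (includes the NUL) */
    eax -= 1;                     /* sub $0x1,%eax: length without the NUL */
    return eax;
}
```

Note that not is a bitwise complement, not a negation, which is what makes the arithmetic come out to the step count.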
