Same assembly instruction, but different machine instruction

I have been playing with the x86 ISA. When I used NASM to assemble a few instructions into machine code, I found something interesting.

    mov [0x3412], al
    mov [0x3412], bl
    mov [0x3412], cl
    mov [0x3412], dl

    1 00000000 A21234      mov [0x3412], al
    2 00000003 881E1234    mov [0x3412], bl
    3 00000007 880E1234    mov [0x3412], cl
    4 0000000B 88161234    mov [0x3412], dl

As you can see, mov [0x3412], al is an exception to the pattern: NASM encodes it in three bytes instead of four. In addition, I found that ndisasm decodes two different byte sequences as mov [0x3412], al.

    root@localhost:~/asm$ ndisasm 123
    00000000  88061234    mov [0x3412],al
    00000004  A21234      mov [0x3412],al

Besides this special case, are there other x86 assembly instructions that have more than one machine encoding?

1 answer

What you are observing is an artifact of one of the design decisions Intel made for the 8088 processor. To remain compatible with the 8088, today's x86 processors carry forward some of those decisions, especially in the instruction set. In particular, Intel decided that the 8088 should use memory more efficiently at the expense of performance. They created a variable-length CISC instruction set with special encodings that limit the size of some instructions. This differs from many RISC architectures (for example, the older Motorola 88000), which use fixed-length instructions but can achieve better performance.

The trade-off between variable-length and fixed-length instruction sets was speed: the processor needs more time to decode the complex variable-length instructions that make the smaller encodings possible. This was true of the Intel 8088.
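To make the decoding cost a bit more concrete, here is a minimal Python sketch (not how any real decoder is built) that works out instruction lengths for a toy subset of 16-bit code. It only understands two opcodes, 88 (MOV r/m8, r8) and A2 (MOV moffs8, AL), and the function names are made up for illustration. The point is that the length of an 88 instruction is not known until the byte after the opcode has been inspected.

```python
def instruction_length(code, offset=0):
    """Length in bytes of the instruction at `offset` (16-bit mode, toy subset)."""
    opcode = code[offset]
    if opcode == 0xA2:
        # MOV moffs8, AL: opcode followed by a 16-bit address, always 3 bytes.
        return 3
    if opcode == 0x88:
        # MOV r/m8, r8: the ModRM byte decides how many displacement bytes follow.
        modrm = code[offset + 1]
        mod, rm = modrm >> 6, modrm & 0b111
        if mod == 0b00:
            disp = 2 if rm == 0b110 else 0  # rm=110 means a direct 16-bit address
        elif mod == 0b01:
            disp = 1                        # 8-bit displacement
        elif mod == 0b10:
            disp = 2                        # 16-bit displacement
        else:
            disp = 0                        # mod=11: register operand
        return 2 + disp
    raise ValueError(f"opcode {opcode:#04x} is outside this toy subset")


def split_instructions(code):
    """Cut a byte string into individual instructions, one after another."""
    out, offset = [], 0
    while offset < len(code):
        n = instruction_length(code, offset)
        out.append(code[offset:offset + n])
        offset += n
    return out


if __name__ == "__main__":
    stream = bytes.fromhex("88061234A21234")
    print([chunk.hex().upper() for chunk in split_instructions(stream)])
```

With fixed-length instructions the second function would be a simple slice at constant stride; here each boundary depends on decoding the previous instruction first.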

In earlier literature (circa 1980), considerations for using space efficiently were much more prominent. The information in my answer about the AX register comes from a book on my shelf called Programming in Assembler Language 8088: IBM PC, but some of it can also be found in online articles such as this one.

From an online article, the following is very relevant to AX (the accumulator) and the other general-purpose registers BX, CX, and DX.

AX is the "accumulator";

some operations, such as MUL and DIV, require one of the operands to be in the accumulator. Some other operations, such as ADD and SUB, can be applied to any of the registers (that is, to any of the eight general- and special-purpose registers), but are more efficient when working with the accumulator.

BX is the "base" register;

it is the only general-purpose register that can be used for indirect addressing. For example, the instruction MOV [BX], AX stores the contents of AX in the memory location whose address is given in BX.

CX is the "count" register.

Loop instructions (LOOP, LOOPE, and LOOPNE), shift and rotate instructions (RCL, RCR, ROL, ROR, SHL, SHR, and SAR), and string instructions (with the REP, REPE, and REPNE prefixes) all use the count register to determine how many times they repeat.

DX is the "data" register;

it is used together with AX for word-size MUL and DIV operations, and it can also hold the port number for the IN and OUT instructions, but it is mostly available as a convenient place to store data, as are all the other general-purpose registers.
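The remark above that ADD and SUB are more efficient with AL/AX shows up directly in the byte encodings. The following Python sketch builds both forms for the eight classic ALU operations with an 8-bit immediate: the general form (opcode 80 with a ModRM byte) and the short AL-only form. The function and table names are hypothetical, and TEST is left out because its opcodes follow a different pattern.

```python
# Opcode extension (/n) for the eight classic ALU operations.
ALU_OPS = {"add": 0, "or": 1, "adc": 2, "sbb": 3,
           "and": 4, "sub": 5, "xor": 6, "cmp": 7}

# 8-bit register numbers as used in the ModRM byte.
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3,
        "ah": 4, "ch": 5, "dh": 6, "bh": 7}


def alu_r8_imm8(op, reg, imm):
    """General form 80 /n ib: works for any 8-bit register, 3 bytes."""
    modrm = (0b11 << 6) | (ALU_OPS[op] << 3) | REG8[reg]
    return bytes([0x80, modrm, imm & 0xFF])


def alu_al_imm8(op, imm):
    """Short form for AL only: opcode n*8 + 4, then the immediate, 2 bytes."""
    return bytes([(ALU_OPS[op] << 3) | 0x04, imm & 0xFF])


if __name__ == "__main__":
    for op in ("add", "sub", "cmp"):
        general = alu_r8_imm8(op, "al", 5).hex().upper()
        short = alu_al_imm8(op, 5).hex().upper()
        print(f"{op} al, 5: general {general}, short {short}")
```

For example, add al, 5 can be encoded as 80 C0 05 or as 04 05, while add bl, 5 only has the three-byte form 80 C3 05.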

As you can see, Intel intended the general-purpose registers to serve a variety of uses, but each also had a specific role and often a special significance for the instructions associated with it. In your case, you are observing the fact that AX is treated as the accumulator. Intel took this into account and gave a number of instructions special opcodes so that the complete instruction is stored more compactly. You found this with the MOV instruction (with AX, AL), but it also applies to ADC, ADD, AND, CMP, OR, SBB, SUB, TEST, and XOR. Each of these has a shorter opcode encoding when used with AL or AX (in the immediate-operand forms), saving one byte. AX and AL can also be encoded with the longer, general opcodes. In your case:

    00000000  88061234    mov [0x3412],al
    00000004  A21234      mov [0x3412],al

This is the same instruction, but with two different encodings.
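As a rough illustration, the two encodings can be rebuilt by hand. This Python sketch assumes 16-bit addressing and covers only mov [addr], r8; the function names are invented for the example. The general form is opcode 88 followed by a ModRM byte that names the register, and the short form is opcode A2, which implies AL and needs no ModRM byte.

```python
# 8-bit register numbers as used in the ModRM "reg" field.
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3,
        "ah": 4, "ch": 5, "dh": 6, "bh": 7}


def mov_mem16_r8(addr, reg):
    """General form: MOV r/m8, r8 (opcode 88 /r) with a direct 16-bit address."""
    # mod=00 with r/m=110 selects a bare 16-bit address in 16-bit addressing.
    modrm = (0b00 << 6) | (REG8[reg] << 3) | 0b110
    return bytes([0x88, modrm]) + addr.to_bytes(2, "little")


def mov_moffs16_al(addr):
    """Short accumulator form: MOV moffs8, AL (opcode A2), no ModRM byte."""
    return bytes([0xA2]) + addr.to_bytes(2, "little")


def encodings(addr, reg):
    """Every encoding this sketch knows for `mov [addr], reg`."""
    result = [mov_mem16_r8(addr, reg)]
    if reg == "al":
        result.append(mov_moffs16_al(addr))
    return result


if __name__ == "__main__":
    for reg in ("al", "bl", "cl", "dl"):
        print(reg, [e.hex().upper() for e in encodings(0x3412, reg)])
```

Run as a script, it lists 88061234 and A21234 for al and a single four-byte encoding for each of bl, cl, and dl, matching the NASM and ndisasm output in the question. NASM simply picks the shorter of the two when the register is AL.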

A good HTML reference for the x86 instruction set is available online, and Intel provides a very detailed instruction reference for the IA-32 (i386, etc.) and 64-bit architectures.
