Why are there two bitwise OR instructions in AVX?

AVX has two instructions for bitwise OR: VORPD and VORPS. The docs say:

    VORPD (VEX.256 encoded version)
    DEST[63:0] <- SRC1[63:0] BITWISE OR SRC2[63:0]
    DEST[127:64] <- SRC1[127:64] BITWISE OR SRC2[127:64]
    DEST[191:128] <- SRC1[191:128] BITWISE OR SRC2[191:128]
    DEST[255:192] <- SRC1[255:192] BITWISE OR SRC2[255:192]

and

    VORPS (VEX.256 encoded version)
    DEST[31:0] <- SRC1[31:0] BITWISE OR SRC2[31:0]
    DEST[63:32] <- SRC1[63:32] BITWISE OR SRC2[63:32]
    DEST[95:64] <- SRC1[95:64] BITWISE OR SRC2[95:64]
    DEST[127:96] <- SRC1[127:96] BITWISE OR SRC2[127:96]
    DEST[159:128] <- SRC1[159:128] BITWISE OR SRC2[159:128]
    DEST[191:160] <- SRC1[191:160] BITWISE OR SRC2[191:160]
    DEST[223:192] <- SRC1[223:192] BITWISE OR SRC2[223:192]
    DEST[255:224] <- SRC1[255:224] BITWISE OR SRC2[255:224]

Is there any actual difference between these two processor operations? If not, why are there two instructions? And if not, is it safe to use them to perform an integer bitwise OR?

2 answers

There is no difference in the result of the operation. There are two instructions for the sake of logical consistency, because there are two data types: packed single (float32) and packed double (float64).
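To see why the results match, here is a minimal sketch that emulates the two pseudocode listings from the question in Python, treating a 256-bit register as a plain integer. The helper names (`lanes`, `join`, `or_by_lanes`, `vorps`, `vorpd`) are made up for illustration and only model the documented lane-by-lane behavior; they are not real intrinsics and say nothing about timing.

```python
import random

def lanes(value, width):
    """Split a 256-bit integer into lanes of `width` bits, lowest lane first."""
    mask = (1 << width) - 1
    return [(value >> shift) & mask for shift in range(0, 256, width)]

def join(parts, width):
    """Reassemble lanes (lowest first) into one 256-bit integer."""
    result = 0
    for index, part in enumerate(parts):
        result |= part << (index * width)
    return result

def or_by_lanes(src1, src2, width):
    """OR each lane independently, as the documented pseudocode does."""
    pairs = zip(lanes(src1, width), lanes(src2, width))
    return join([a | b for a, b in pairs], width)

def vorps(src1, src2):
    """Hypothetical model of VORPS: eight 32-bit lanes."""
    return or_by_lanes(src1, src2, 32)

def vorpd(src1, src2):
    """Hypothetical model of VORPD: four 64-bit lanes."""
    return or_by_lanes(src1, src2, 64)

# Bitwise OR never carries between bit positions, so the lane
# boundaries cannot influence the result.
rng = random.Random(0)
for _ in range(1000):
    a, b = rng.getrandbits(256), rng.getrandbits(256)
    assert vorps(a, b) == vorpd(a, b) == (a | b)
```

Because each output bit depends only on the two input bits at the same position, splitting the register into 32-bit or 64-bit lanes is purely a matter of notation.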

For integers, it doesn't matter which instruction you use; just stay consistent with the data type. If you are packing integers with a maximum width of 32 bits, use the packed-single form; if they are wider, the packed-double form is more appropriate. Think of it as a cast: you can cast a 32-bit int to a 64-bit int without loss, but the reverse is a route to disaster.
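The integer case can be illustrated the same way. The sketch below packs eight 32-bit integers into a 32-byte buffer standing in for a 256-bit register, ORs the buffers byte by byte, and then reads the result back both as eight int32 values and as four int64 values. The helper `or_bytes` and the sample data are assumptions made for this example only.

```python
import struct

def or_bytes(x, y):
    """OR two equal-length byte strings, standing in for a 256-bit OR."""
    return bytes(a | b for a, b in zip(x, y))

values = [1, 2, 4, 8, 16, 32, 64, 128]   # sample int32 payload
flags = [0x100] * 8                      # bit to set in every element

reg_a = struct.pack("<8i", *values)      # 32 bytes = 256 bits
reg_b = struct.pack("<8i", *flags)
result = or_bytes(reg_a, reg_b)

# Viewed as eight int32 elements, every element has the flag set.
as_int32 = struct.unpack("<8i", result)
assert as_int32 == tuple(v | f for v, f in zip(values, flags))

# Viewed as four int64 elements, the same bytes equal the OR of the
# inputs interpreted as int64, so the element width does not matter.
as_int64 = struct.unpack("<4q", result)
a64 = struct.unpack("<4q", reg_a)
b64 = struct.unpack("<4q", reg_b)
assert as_int64 == tuple(x | y for x, y in zip(a64, b64))
```

The choice between the two forms therefore only documents how you intend to interpret the data; the stored bits are the same.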


The existence of PS and PD variants of all (or almost all) SSE/AVX instructions has a historical explanation: when Intel originally designed the first set of SSE instructions, they expected that future chip architectures would have three domains: integer, single-precision floating point (32-bit), and double-precision floating point (64-bit).

Note: domains are segregated logic blocks in the CPU, and they matter because there is a slight delay when the contents of SSE/AVX registers are transferred between them. Therefore, if the result of an instruction in the integer domain is used as the input of an instruction in the floating-point domain, a delay of 1 or 2 cycles may occur.

For this reason, Intel mirrored the bitwise logical and shuffle instructions three times: one for integers, one for SP-FP and one for DP-FP. The operations performed by these mirrored instructions are identical, including between the integer and floating-point variants.

Most x86 architectures currently have two domains: integer and floating point. The FP domain handles both single and double precision (32/64-bit). Some architectures have only one domain for all SSE/AVX instructions. It is conceivable that a third domain for double precision may be added in some future architecture.
