Can movss be used to replace integer data?

With the restriction that I can only use SSE and SSE2 instructions, I need to replace the least significant (0) element of a 4-element __m128i vector with element 0 from another vector.

For floating-point vectors, the task is simple: you can use the intrinsic _mm_move_ss() to replace element 0 with element 0 from another vector. It generates a single movss instruction, so it is quite efficient.

Using two cast intrinsics, you can also convince the compiler to use a single SSE movss instruction to move integer data. The source code is as follows:

__m128i NewVector = _mm_castps_si128(_mm_move_ss(_mm_castsi128_ps(Take3FromThisVector), _mm_castsi128_ps(Take1FromThisVector))); 

It looks a little dirty, but with a suitable number of comments it may be acceptable, especially since it generates a minimum of instructions. In typical use, everything is optimized to stay in XMM registers.

My question is this:

Since this is a movss instruction, where "ss" means scalar single-precision floating point, is it OK to use it to move integer data that could potentially contain some "special" or "illegal" (floating-point) bit combinations in any of the vector positions?

The obvious alternative, which I also implemented and tested, is to AND the first vector with a mask, then OR in the second vector, which contains only the one value in the least significant element, with all the others zero. As you can imagine, this takes more instructions.
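For comparison, the mask-based alternative might be sketched roughly as follows. The helper names replace_low_mask and element are hypothetical, and this version uses ANDNOT/AND/OR so that it does not assume the second vector is already zero outside element 0, which makes it slightly more general than the two-step AND/OR described above.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: replace element 0 of keep3 with element 0 of take1
   using only integer SSE2 logic operations. */
static __m128i replace_low_mask(__m128i keep3, __m128i take1)
{
    /* All-ones in element 0, zeros in elements 1..3.
       _mm_set_epi32 takes its arguments from the highest element down. */
    const __m128i low_mask = _mm_set_epi32(0, 0, 0, -1);

    __m128i upper = _mm_andnot_si128(low_mask, keep3); /* (~mask) & keep3 */
    __m128i lower = _mm_and_si128(low_mask, take1);    /* mask & take1    */
    return _mm_or_si128(upper, lower);
}

/* Hypothetical helper for inspecting results: read element i (0..3). */
static int32_t element(__m128i v, int i)
{
    int32_t out[4];
    _mm_storeu_si128((__m128i *)out, v);
    return out[i];
}
```

This needs a mask constant plus three logic instructions, versus the single movss of the casting approach.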

I tried the casting approach shown above and it does not seem to cause any problems, but I note in particular that there is no intrinsic that performs the same operation for integer data. It seems that Intel would have provided one, like _mm_move_epi32 or similar, if this were fine for integer data. So I am skeptical about whether this is a good idea.

I did some searches, for example "can the movss instruction raise a floating-point exception", but did not find any information that could answer my question.

Thanks in advance for the knowledge you are willing to share.

-Noel

2 answers

Yes, it's fine to use FP shuffles like movss xmm, xmm on integer data. The instruction reference manual states that it cannot raise FP numeric exceptions; only actual FP math instructions do that. So go ahead and cast.
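As a rough way to check this yourself, the sketch below pushes "special" float bit patterns (a signaling NaN, a quiet NaN, a denormal, infinities) through the cast + _mm_move_ss idiom, then checks that every bit survives and that no sticky exception flag was set in MXCSR. The function names are hypothetical; _mm_getcsr, _mm_setcsr and _MM_EXCEPT_MASK come from the SSE header that emmintrin.h pulls in.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in the SSE header) */
#include <stdint.h>
#include <string.h>

/* Hypothetical wrapper around the cast + _mm_move_ss idiom from the question. */
static __m128i replace_low_movss(__m128i keep3, __m128i take1)
{
    return _mm_castps_si128(
        _mm_move_ss(_mm_castsi128_ps(keep3), _mm_castsi128_ps(take1)));
}

/* Returns 1 if all bits are preserved and no MXCSR exception flag is set. */
static int movss_is_bit_exact_and_quiet(void)
{
    /* +inf, quiet NaN, smallest denormal, -inf */
    const uint32_t keep[4]   = { 0x7F800000u, 0x7FC00000u, 0x00000001u, 0xFF800000u };
    /* signaling NaN in element 0 */
    const uint32_t take[4]   = { 0x7FA00000u, 0u, 0u, 0u };
    const uint32_t expect[4] = { 0x7FA00000u, 0x7FC00000u, 0x00000001u, 0xFF800000u };
    uint32_t out[4];

    /* Clear the sticky exception flags before the operation. */
    _mm_setcsr(_mm_getcsr() & ~(unsigned)_MM_EXCEPT_MASK);

    __m128i r = replace_low_movss(_mm_loadu_si128((const __m128i *)keep),
                                  _mm_loadu_si128((const __m128i *)take));
    _mm_storeu_si128((__m128i *)out, r);

    unsigned flags = _mm_getcsr() & (unsigned)_MM_EXCEPT_MASK;
    return memcmp(out, expect, sizeof out) == 0 && flags == 0;
}
```

Note that an optimizing compiler may fold the whole thing at compile time, so this is a sanity check rather than proof; the instruction reference is the authoritative source.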

There isn't even a bypass delay for using FP shuffles on integer data in most uarches (but there is extra latency for using integer shuffles between FP math instructions).

Agner Fog's optimization guide contains an excellent section on which instructions are useful for different types of data movement (broadcast, merging, etc.). See also the tag wiki for more good links.


The reason there is no integer equivalent is that the SSE2 integer movd zeros the upper bytes of the destination, like movss used as a load, but unlike movss between registers.
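To illustrate the difference, here is a small sketch with hypothetical helper names. _mm_cvtsi32_si128 is the intrinsic that normally maps to movd from a general-purpose register: it zeros elements 1..3. The register-to-register movss form, reached through the casts, keeps elements 1..3 of its first operand.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* movd-style: element 0 = x, elements 1..3 are zeroed. */
static __m128i low_from_scalar(int32_t x)
{
    return _mm_cvtsi32_si128(x);
}

/* movss reg-reg style: element 0 from take1, elements 1..3 kept from keep3. */
static __m128i merge_low(__m128i keep3, __m128i take1)
{
    return _mm_castps_si128(
        _mm_move_ss(_mm_castsi128_ps(keep3), _mm_castsi128_ps(take1)));
}

/* Hypothetical helper for inspecting results: read element i (0..3). */
static int32_t lane(__m128i v, int i)
{
    int32_t out[4];
    _mm_storeu_si128((__m128i *)out, v);
    return out[i];
}
```

Combining the two, merge_low(v, low_from_scalar(x)) inserts a scalar into element 0 of v using only SSE/SSE2.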

Intel's vector instruction sets are known for their inconsistency and non-orthogonality, especially the earliest versions (e.g. SSE1). SSE4.1 filled in many gaps, but there are still obvious missing pieces.


The types __m128 and __m128i are interchangeable. The main reason for the casts is to make your intentions clearer (and keep your compiler happy). The cast itself does not generate any additional assembly.

The _mm_move_ss operation is described directly in terms of which bits end up in your result.

If you end up with an invalid bit combination for a single-precision float, this will only be a problem if you try to use the resulting value in floating-point calculations.
