I'm trying to optimize my code as far as I can using Microsoft's built-in tools. One of the biggest problems I run into is the load-hit-store (LHS) that occurs whenever I want to use a constant. There is some information about generating certain constants on the fly (here and here, section 13.4), but it is all in assembly, which I would prefer to avoid.
The problem is that when I try to implement the same thing with intrinsics, MSVC complains about incompatible types. Does anyone know of equivalent tricks that use intrinsics?
Example - Create {1.0,1.0,1.0,1.0}
    // pcmpeqw xmm0,xmm0
    __m128 t = _mm_cmpeq_epi16( t, t );
    // pslld xmm0,25
    _mm_slli_epi32(t, 25);
    // psrld xmm0,2
    return _mm_srli_epi32(t, 2);
This generates a bunch of errors about incompatible types (__m128 vs __m128i). I'm new to this, so I'm pretty sure I'm missing something obvious. Can anyone help?
TL;DR: How can I generate a __m128 vector filled with a single-precision floating-point constant using MSVC intrinsics?
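For reference, here is a minimal sketch of how the type mismatch might be resolved, assuming SSE2 is available. The idea is to keep the value in a __m128i for the integer operations and reinterpret it as __m128 only at the end with _mm_castsi128_ps, which changes the type without converting the bits. The helper name make_ones_ps is made up for illustration, and the register starts from _mm_setzero_si128() rather than an uninitialized variable.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (also pulls in the SSE header)

// Hypothetical helper: builds {1.0f, 1.0f, 1.0f, 1.0f} without a memory load.
static inline __m128 make_ones_ps()
{
    __m128i t = _mm_setzero_si128();   // defined starting value
    t = _mm_cmpeq_epi16(t, t);         // pcmpeqw: every bit set (0xFFFFFFFF per lane)
    t = _mm_slli_epi32(t, 25);         // pslld 25: 0xFE000000 per lane
    t = _mm_srli_epi32(t, 2);          // psrld 2:  0x3F800000 per lane, the bits of 1.0f
    return _mm_castsi128_ps(t);        // reinterpret as four floats, no conversion
}
```

Note that the shift intrinsics return their result instead of modifying the argument, so each result has to be assigned back. The plain alternative is _mm_set1_ps(1.0f); whether either form ends up as a memory load or as the shift sequence depends on the compiler and its optimization settings, so it is worth checking the generated assembly.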
Thanks for reading :)
Jbefat