SSE rounds down when it should round up

I am working on an application that converts float samples in the range -1.0 to 1.0 to signed 16-bit. To ensure that the output of the optimized (SSE) routines is correct, I wrote a test suite that runs the non-optimized version against the SSE version and compares their output.

Before starting, I confirmed that the SSE rounding mode is set to round to nearest.

In my test example, the formula is:

    ratio  = 65536 / 2
    output = round(input * ratio)

For the most part, the results are accurate, but on one particular input I see a failure: -0.8499908447265625.

 -0.8499908447265625 * (65536 / 2) = -27852.5 

The normal code correctly rounds this value to -27853, but the SSE code rounds it to -27852.

Here is the SSE code in use:

    void Float_S16(const float *in, int16_t *out, const unsigned int samples)
    {
      static float  ratio = 65536.0f / 2.0f;
      static __m128 mul   = _mm_set_ps1(ratio);

      for(unsigned int i = 0; i < samples; i += 4, in += 4, out += 4)
      {
        __m128  xin;
        __m128i con;

        xin = _mm_load_ps(in);
        xin = _mm_mul_ps(xin, mul);
        con = _mm_cvtps_epi32(xin);

        out[0] = _mm_extract_epi16(con, 0);
        out[1] = _mm_extract_epi16(con, 2);
        out[2] = _mm_extract_epi16(con, 4);
        out[3] = _mm_extract_epi16(con, 6);
      }
    }

Self-contained example, as requested:

    /* standard math */
    float   ratio  = 65536.0f / 2.0f;
    float   in [4] = {-1.0, -0.8499908447265625, 0.0, 1.0};
    int16_t out[4];
    for(int i = 0; i < 4; ++i)
      out[i] = round(in[i] * ratio);

    /* sse math */
    static __m128 mul = _mm_set_ps1(ratio);
    __m128  xin;
    __m128i con;

    xin = _mm_load_ps(in);
    xin = _mm_mul_ps(xin, mul);
    con = _mm_cvtps_epi32(xin);

    int16_t outSSE[4];
    outSSE[0] = _mm_extract_epi16(con, 0);
    outSSE[1] = _mm_extract_epi16(con, 2);
    outSSE[2] = _mm_extract_epi16(con, 4);
    outSSE[3] = _mm_extract_epi16(con, 6);

    printf("Standard = %d, SSE = %d\n", out[1], outSSE[1]);
+7
Tags: c++, x86, sse, intrinsics, rounding-error
2 answers

Although the SSE rounding mode defaults to "round to nearest", this is not the familiar rounding method we all learned at school, but a slightly more modern variation known as banker's rounding (also known as unbiased rounding, convergent rounding, statistician's rounding, Dutch rounding, Gaussian rounding or odd-even rounding), which rounds ties to the nearest even integer value. This rounding method is arguably better than the more traditional method from a statistical point of view. You will see the same behavior with functions such as rint(), and it is also the default rounding mode for IEEE-754.

Note also that while the standard library function round() uses the traditional rounding method (ties away from zero), the SSE instruction ROUNDPS (_mm_round_ps), in its round-to-nearest mode, uses banker's rounding.
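
To make the difference concrete, here is a minimal sketch that rounds a single value three ways. The helper names (round_half_away, round_half_even, round_sse) are made up for this illustration and are not from the question. The sketch assumes an x86 target with SSE2, and that neither the C++ floating-point environment nor MXCSR has been changed from the default round-to-nearest mode.

```cpp
#include <cmath>
#include <emmintrin.h>  // SSE2: _mm_set1_ps, _mm_cvtps_epi32, _mm_cvtsi128_si32

// Hypothetical helper: ties go away from zero, as std::round does.
int round_half_away(float x)
{
    return static_cast<int>(std::round(x));
}

// Hypothetical helper: std::nearbyint follows the current rounding mode,
// which is round-half-to-even unless changed with fesetround().
int round_half_even(float x)
{
    return static_cast<int>(std::nearbyint(x));
}

// Hypothetical helper: the same conversion instruction as in the question,
// applied to one value. Assumes MXCSR is still set to round-to-nearest.
int round_sse(float x)
{
    __m128i v = _mm_cvtps_epi32(_mm_set1_ps(x));
    return _mm_cvtsi128_si32(v);  // read back lane 0
}
```

Under those assumptions, round_half_away(-27852.5f) gives -27853, while round_half_even(-27852.5f) and round_sse(-27852.5f) both give -27852. For -27853.5f all three agree on -27854, because the nearest even integer happens to be the one away from zero.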

+13

This is the default behavior for all floating-point processing, not just SSE. Round half to even, or banker's rounding, is the default rounding mode according to IEEE 754.

The reason for this is that consistently rounding halves up (or down) leads to a half-point error that accumulates when applied over even a moderate number of operations. Half-points can lead to some fairly significant errors - significant enough to become a plot point in Superman 3.

Round half to even (or to odd), however, leads to both negative and positive half-point errors that cancel each other out when applied over many operations.

This is also desirable in SSE operations. SSE operations are typically used in signal processing (audio, images), engineering and statistical scenarios, where a consistent rounding error would show up as noise and require additional processing to remove (if possible). Banker's rounding helps this error cancel out instead of building up.
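
The cancellation can be seen in a small experiment. The sketch below is only an illustration: the Bias struct and accumulated_bias function are hypothetical names, and it assumes the default round-to-nearest floating-point environment. It rounds the ties 0.5, 1.5, 2.5, ... with both methods and reports how far each rounded sum drifts from the exact sum. Feeding in nothing but ties is the worst case, so real data would show a smaller effect.

```cpp
#include <cmath>

// Hypothetical result type: drift of each rounded sum from the exact sum.
struct Bias
{
    double half_away;  // ties rounded away from zero (std::round)
    double half_even;  // ties rounded to even (std::nearbyint, default mode)
};

// Sum n tie values (0.5, 1.5, ..., n - 0.5) exactly and after rounding.
Bias accumulated_bias(int n)
{
    double sum_exact = 0.0, sum_away = 0.0, sum_even = 0.0;
    for (int i = 0; i < n; ++i)
    {
        double x = i + 0.5;             // exactly representable tie
        sum_exact += x;
        sum_away  += std::round(x);     // always +0.5 error for positive ties
        sum_even  += std::nearbyint(x); // error alternates -0.5, +0.5
    }
    return { sum_away - sum_exact, sum_even - sum_exact };
}
```

For accumulated_bias(1000), the half-away sum drifts by 500 (0.5 per sample), while the half-to-even sum does not drift at all, because the -0.5 and +0.5 errors alternate and cancel.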

+7