How do I multiply a float by an int so that the result reflects only the float's significant digits?

I have code that converts a float (representing seconds) to an int64 (representing nanoseconds), keeping 6 decimal places of the float:

 int64_t nanos = f * 1000000000LL; 

However, many decimal values cannot be represented exactly in a binary float, so I get results like 14199999488 when my float is 14.2f. I am currently solving this by calculating the number of significant digits after the decimal point:

    const float logOfSecs = std::log10(f);
    int precommaPlaces = 0;
    if (logOfSecs > 0) {
        precommaPlaces = std::ceil(logOfSecs);
    }
    int postcommaPlaces = 7 - precommaPlaces;
    if (postcommaPlaces < 0) {
        postcommaPlaces = 0;
    }

I then print the float into a string so that Qt rounds it correctly. Finally, I parse the string into a pre-comma and a post-comma integer and combine them with integer arithmetic:

    const QString valueStr = QString::number(f, 'f', postcommaPlaces);
    qint64 nanos = 0;
    nanos += valueStr.section(".", 0, 0).toLongLong() * 1000000000LL;
    if (postcommaPlaces) {
        nanos += valueStr.section(".", 1).toLongLong() * std::pow(10.0, 9 - postcommaPlaces);
    }

This works great, but I was wondering if there is a better, perhaps faster way to do this?

+6
3 answers

If you want to round to one decimal place, for example:

    #include <iostream>

    int main()
    {
        float f = 14.2f;
        long long n = f * 1000000000LL;
        std::cout << "float: " << n << '\n';
        n = (f + 0.05) * 10;
        n *= 100000000LL;
        std::cout << "rounded: " << n << '\n';
        return 0;
    }

With two decimal places, it is (f + 0.005) * 100, and so on; with six decimal places:

 n = ((long long)((f + 0.0000005) * 1000000)) * 1000LL; 

If you want to work with significant digits (all digits, not just those after the decimal point), you must first take log10(f) and then adjust the number of decimal places you round to accordingly.
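A minimal sketch of that adjustment, under stated assumptions: the helper names `pow10i`, `toNanosFixed` and `toNanosSignificant` are hypothetical, the count of 7 significant digits is taken from the question, and `std::llround` on a `double` is swapped in for the add-half-and-truncate trick above so that negative values also round correctly. It assumes the result fits in an `int64_t`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: 10^n for 0 <= n <= 9, using integers only.
std::int64_t pow10i(int n) {
    assert(n >= 0 && n <= 9);
    std::int64_t result = 1;
    for (int i = 0; i < n; ++i) result *= 10;
    return result;
}

// Round `seconds` to `decimals` decimal places and return nanoseconds.
std::int64_t toNanosFixed(float seconds, int decimals) {
    // Widen to double first: a 24-bit float significand times 10^decimals
    // (at most 10^9) is exact in a 53-bit double significand.
    const double scaled =
        static_cast<double>(seconds) * static_cast<double>(pow10i(decimals));
    return std::llround(scaled) * pow10i(9 - decimals);
}

// Keep about 7 significant digits (assumption taken from the question).
std::int64_t toNanosSignificant(float seconds) {
    if (seconds == 0.0f) return 0;  // log10(0) is not finite
    // Number of digits before the decimal point (<= 0 for values below 1).
    const int intDigits =
        static_cast<int>(std::floor(std::log10(std::fabs(seconds)))) + 1;
    int decimals = 7 - intDigits;
    if (decimals < 0) decimals = 0;
    if (decimals > 9) decimals = 9;  // nanoseconds have only 9 decimals
    return toNanosFixed(seconds, decimals);
}
```

For 14.2f there are two digits before the point, so the value is rounded to five decimal places before being scaled up to nanoseconds.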

But as @MarkB already said, if you use int64_t in the first place, you do not need any of this.

+2

By storing the value in a float, the damage has already been done: you have lost the original number, whatever it was. You can guess the value that was probably intended and round to it, or, if you are only trying to display a value to the user, you can round it to a smaller number of decimal places.

Instead, you can avoid all of these problems by using your fixed-point int64_t representation throughout your code base, never converting to or from float, so that no precision is thrown away in a conversion.
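As an illustration only, one way to keep everything in integer nanoseconds is `std::chrono::nanoseconds`, which wraps a signed 64-bit (or wider) count. Using `<chrono>` here is an assumption, not something the answer prescribes; a plain `int64_t` works the same way. The helper `makeDuration` is a hypothetical name.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

using Nanos = std::chrono::nanoseconds;

// Hypothetical helper: build a duration from whole seconds plus a
// nanosecond remainder, without ever going through a float.
Nanos makeDuration(std::int64_t wholeSeconds, std::int64_t nanosRemainder) {
    assert(nanosRemainder >= 0 && nanosRemainder < 1000000000LL);
    return std::chrono::seconds(wholeSeconds) + Nanos(nanosRemainder);
}
```

With this, `makeDuration(14, 200000000)` holds exactly 14.2 seconds, and adding `std::chrono::milliseconds(300)` to it stays exact because all arithmetic is done on integers.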

+3

As noted in other answers, rounding to an arbitrary number of decimal digits is closely related to float printing. Since the algorithms that correctly round are quite complex, the easiest way to do this correctly is to use printf.

Note that you do not have to specify an arbitrary number of digits: an alternative is to use the shortest decimal representation that converts back to the same base-2 float unchanged. Such algorithms are used to print floats in Scheme, Java, Python, Squeak/Pharo, etc. Unfortunately, as far as I know, neither the libm printf nor any other standard C library function provides this.

Scheme does even better: when you request a fixed number of digits, it prints * in place of the digits that are not significant (* means that any digit in that position would give the same float when converted back to base 2).

In this issue http://code.google.com/p/pharo/issues/detail?id=4957 there is an attachment named Float-asMinimalDecimalFraction.st containing a Smalltalk implementation of a printing algorithm similar to Scheme's, except that it produces a fraction (a ratio of two arbitrary-precision integers) rather than an ASCII string.

So, for example, even though 14.2f is represented internally as exactly 14.19999980926513671875, it is not too late: you can still recover (142/10) as the shortest decimal fraction that rounds to that float.
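The same idea can be sketched in C++ without a dedicated shortest-digits algorithm, by brute force: print with an increasing number of significant digits until the text parses back to the same float. This is a rough illustration, not the algorithm from the attachment. The names `exactDecimal` and `shortestRoundTrip` are hypothetical, and the sketch assumes the C library's `printf` and `strtof` round correctly.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Decimal expansion of the stored value to 20 decimal places
// (enough to show a value near 14.2 exactly).
std::string exactDecimal(float value) {
    char buf[64];
    const int written =
        std::snprintf(buf, sizeof buf, "%.20f", static_cast<double>(value));
    assert(written > 0 && written < static_cast<int>(sizeof buf));
    return std::string(buf);
}

// Shortest "%g" rendering that converts back to the same float.
// 9 significant digits are always enough for a 32-bit float.
std::string shortestRoundTrip(float value) {
    char buf[32];
    for (int digits = 1; digits <= 9; ++digits) {
        const int written = std::snprintf(buf, sizeof buf, "%.*g", digits,
                                          static_cast<double>(value));
        assert(written > 0 && written < static_cast<int>(sizeof buf));
        if (std::strtof(buf, nullptr) == value) break;  // round-trips
    }
    return std::string(buf);
}
```

For 14.2f, the first rendering that round-trips is the three-digit one, "14.2"; the one- and two-digit renderings parse back to 10 and 14.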

With such code, solving your problem in Smalltalk becomes trivial:

 nanos := (floatingPointSeconds asMinimalDecimalFraction * 1e9) rounded. 

Note, however, that the code above relies on exact arithmetic (1e9 is an integer) and on arbitrary-precision integers under the hood.
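A rough C++ counterpart of that one-liner, as a sketch under stated assumptions: the shortest decimal is found by brute force with `snprintf`/`strtof` rather than by the attachment's algorithm, the function name `secondsToNanosShortest` is hypothetical, the input is finite, and the result fits in an `int64_t` (roughly |seconds| below 9.2e9). The scaling by 10^9 is done on integers only.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

std::int64_t secondsToNanosShortest(float seconds) {
    assert(std::isfinite(seconds));

    // 1. Find the fewest digits whose decimal form parses back to the
    //    same float; 9 significant digits (fracDigits == 8) always do.
    char buf[32];
    int fracDigits = 0;  // digits after the point in d.ddd...e+xx
    for (;; ++fracDigits) {
        std::snprintf(buf, sizeof buf, "%.*e", fracDigits,
                      static_cast<double>(seconds));
        if (fracDigits == 8 || std::strtof(buf, nullptr) == seconds) break;
    }

    // 2. Split the text into sign, integer mantissa and decimal exponent.
    bool negative = false;
    std::int64_t mantissa = 0;
    const char* cursor = buf;
    for (; *cursor != '\0' && *cursor != 'e'; ++cursor) {
        if (*cursor == '-') negative = true;
        else if (*cursor >= '0' && *cursor <= '9')
            mantissa = mantissa * 10 + (*cursor - '0');
    }
    const int exponent =
        (*cursor == 'e') ? static_cast<int>(std::strtol(cursor + 1, nullptr, 10)) : 0;

    // 3. seconds == mantissa * 10^(exponent - fracDigits), therefore
    //    nanos == mantissa * 10^power with the power below.
    const int power = exponent - fracDigits + 9;
    std::int64_t nanos = mantissa;
    if (power >= 0) {
        for (int i = 0; i < power; ++i) nanos *= 10;  // assumes no overflow
    } else if (power < -9) {
        nanos = 0;  // mantissa < 10^9, so the value is below half a nanosecond
    } else {
        std::int64_t divisor = 1;
        for (int i = 0; i < -power; ++i) divisor *= 10;
        nanos = (mantissa + divisor / 2) / divisor;  // round half up
    }
    return negative ? -nanos : nanos;
}
```

For 14.2f the shortest round-tripping form is 1.42e+01, so the mantissa is 142 and it is multiplied by 10^8.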

Note that doing multiplication in a float would be bad:

 nanos := (aFloat * 1e9) asMinimalDecimalFraction rounded. 

Indeed, although the conversion 1e9 asFloat is exact, its significand spans 21 bits. The exact product of a 24-bit significand and a 21-bit one can need up to 45 bits, so the floating-point multiplication will almost certainly introduce another rounding error and make it harder to recover a short fraction.
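The effect is easy to reproduce in C++. The sketch below uses two hypothetical helper names and assumes IEEE 754 single and double precision with float expressions evaluated in float (as on typical x86-64 builds). It compares a product rounded to float with the same product computed in double, where 45 bits fit into the 53-bit significand.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Multiply in float: the product is rounded to 24 bits before conversion.
std::int64_t nanosViaFloatProduct(float seconds) {
    assert(std::isfinite(seconds));
    const float product = seconds * 1e9f;  // second rounding happens here
    return static_cast<std::int64_t>(product);
}

// Multiply in double: the product of a 24-bit and a 21-bit significand
// is exact, so only the float's own representation error remains.
std::int64_t nanosViaDoubleProduct(float seconds) {
    assert(std::isfinite(seconds));
    return std::llround(static_cast<double>(seconds) * 1e9);
}
```

For 14.2f the float product yields 14199999488 (the value from the question), while the double product yields 14199999809. Neither is 14200000000, but the float product has drifted further from it, which is why the minimal-fraction step has to come before the multiplication.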

Although this technically answers the question, I would personally consider the above algorithm impractical for these reasons:

  • implementing it in low-level C/C++ without the help of an arbitrary-precision arithmetic library is not the quickest route to a result

  • it is very limited, since it does not apply to the results of computations that have accumulated several rounding errors (statistically, those require many digits)

  • it is unnecessary if you can simply avoid float altogether and work with integer nanoseconds

However, it is always nice to know that it exists ...

0
