Numpy: check if float array contains integers

In Python, you can check whether a float holds an integer value with n.is_integer(), as described in this Q&A: How to check if a float value is an integer number.
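For example, on scalar floats:

 >>> (3.0).is_integer()
 True
 >>> (3.9).is_integer()
 False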

Does numpy have a similar operation that can be applied to arrays? Something that will allow the following:

 >>> x = np.array([1.0, 2.1, 3.0, 3.9])
 >>> mask = np.is_integer(x)
 >>> mask
 array([ True, False,  True, False], dtype=bool)

You can do something like

 >>> mask = (x == np.floor(x)) 

or

 >>> mask = (x == np.round(x)) 

but these require extra function calls and create temporary arrays that could be avoided.

Does numpy have a vectorized function that checks for the fractional part of a float, the same way Python's float.is_integer does?

2 answers

As far as I can tell, there is no function that returns a boolean array indicating whether each float has a fractional part. The closest I can find is np.modf, which returns the fractional and integral parts, but that creates two floating-point arrays (at least temporarily), so it may not be ideal memory-wise.
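For completeness, a minimal sketch of the np.modf approach (note the two temporary float arrays it allocates):

 >>> frac, _ = np.modf(x)   # fractional and integral parts as two float arrays
 >>> mask = (frac == 0)     # True where the value is integral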

If you are happy to work in place, you can try something like:

 >>> np.mod(x, 1, out=x)
 >>> mask = (x == 0)

This should save memory compared to using round or floor (where you have to keep x around), but of course you lose the original x.
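If you need to keep x intact, one possible variation (the preallocated scratch buffer here is my own addition, not part of the original answer) is:

 >>> scratch = np.empty_like(x)   # one reusable buffer instead of overwriting x
 >>> np.mod(x, 1, out=scratch)    # fractional parts written into scratch
 >>> mask = (scratch == 0)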

Another option is to request this feature in numpy, or to implement it yourself.


I needed an answer to this question for a slightly different reason: checking when I can convert the entire array of floating point numbers to integers without data loss.

Hunse's answer almost works for me, except that I obviously can't use the in-place trick, since I still need the original values afterwards:

 if np.all(np.mod(x, 1) == 0):
     x = x.astype(int)

From there, I thought of the following option, which is probably faster in many situations:

 x_int = x.astype(int)
 if np.all((x - x_int) == 0):
     x = x_int

The reasoning is that the modulo operation is slower than subtraction. However, this version casts to integers up front, and I don't know how fast that operation is, relatively speaking. But if most of your arrays are integer-valued (as they are in my case), the latter version is almost certainly faster.
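If you want to verify that claim on your own data, a rough timing sketch might look like this (the array size and contents here are arbitrary, so your numbers will differ):

 import timeit
 import numpy as np

 x = np.random.rand(1_000_000)  # mostly non-integer floats

 print(timeit.timeit(lambda: np.all(np.mod(x, 1) == 0), number=100))
 print(timeit.timeit(lambda: np.all((x - x.astype(int)) == 0), number=100))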

Another advantage is that you can replace the subtraction with something like np.isclose to check within a certain tolerance (of course, you have to be careful here, since truncation is not proper rounding!).

 x_int = x.astype(int)
 if np.all(np.isclose(x, x_int, 0.0001)):
     x = x_int

EDIT: Slower, but possibly worth it depending on your use case: this variant also converts individual integers, if there are any.

 x_int = x.astype(int)
 safe_conversion = (x - x_int) == 0
 # if we can convert the whole array to integers, do that
 if np.all(safe_conversion):
     x = x_int.tolist()
 else:
     x = x.tolist()
     # if there are _some_ integers, convert them individually
     if np.any(safe_conversion):
         for i in range(len(x)):
             if safe_conversion[i]:
                 x[i] = int(x[i])

As an example of where this matters: it works for me because I have sparse data (i.e. mostly zeros), which I convert to JSON once and reuse later on the server. For floats, ujson renders them as [...,0.0,0.0,0.0,...], while for ints the result is [...,0,0,0,...], saving up to half the number of characters per string. This reduces overhead both on the server (shorter strings) and on the client (shorter strings also seem to parse slightly faster as JSON).
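As a rough illustration of the size difference (using the standard library json here instead of ujson):

 import json
 import numpy as np

 x = np.zeros(5)
 print(json.dumps(x.tolist()))               # [0.0, 0.0, 0.0, 0.0, 0.0]
 print(json.dumps(x.astype(int).tolist()))   # [0, 0, 0, 0, 0]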

