Integer overflow in numpy arrays

    import numpy as np
    a = np.arange(1000000).reshape(1000, 1000)
    print(a**2)

With this code, I get the output below. Why am I getting negative values?

    [[ 0 1 4 ..., 994009 996004 998001]
     [ 1000000 1002001 1004004 ..., 3988009 3992004 3996001]
     [ 4000000 4004001 4008004 ..., 8982009 8988004 8994001]
     ...,
     [1871554624 1873548625 1875542628 ..., -434400663 -432404668 -430408671]
     [-428412672 -426416671 -424420668 ..., 1562593337 1564591332 1566589329]
     [1568587328 1570585329 1572583332 ..., -733379959 -731379964 -729379967]]
python numpy
4 answers

On your platform, np.arange returns an array of dtype 'int32':

    In [1]: np.arange(1000000).dtype
    Out[1]: dtype('int32')

Each element of the array is a 32-bit integer. Squaring produces a result that does not fit in 32 bits. The result is truncated to 32 bits and is still interpreted as a signed 32-bit integer, which is why you see negative numbers.
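To make the truncation concrete, here is a minimal sketch that reproduces the last value in your output by hand. The helper name wrap_int32 is my own and not part of numpy; it only mimics keeping the low 32 bits and reading them back as a signed number.

```python
import numpy as np

def wrap_int32(value):
    """Reduce an exact Python int to what a signed 32-bit integer would hold.

    Hypothetical helper for illustration only (not a numpy function).
    """
    return (value + 2**31) % 2**32 - 2**31

exact = 999999 ** 2                                 # exact Python int
wrapped = np.array([999999], dtype=np.int32) ** 2   # fixed-width array arithmetic

print(exact)              # the mathematically correct square
print(wrap_int32(exact))  # the same value reduced to 32 bits
print(wrapped[0])         # what the int32 array actually stores
```

The last two printed values agree with the final entry of the array shown in the question.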

Edit: In this case, you can avoid the integer overflow by building an array of dtype 'int64' before squaring:

    a = np.arange(1000000, dtype='int64').reshape(1000, 1000)

Please note that the problem you have found is an inherent hazard when working with numpy. You need to choose your dtypes carefully and know in advance that your code will not cause arithmetic overflow. For the sake of speed, numpy cannot and will not warn you when this happens.
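Since numpy will not check for you, one option is to check the bounds yourself before the operation. The following is only a sketch under my own assumptions: the helper square_would_overflow is a made-up name, it handles only squaring of non-empty integer arrays, and it does its arithmetic in exact Python integers so the check itself cannot overflow.

```python
import numpy as np

def square_would_overflow(arr):
    """Return True if squaring any element would exceed the dtype's maximum.

    Hypothetical helper for illustration; assumes a non-empty integer array.
    """
    limit = np.iinfo(arr.dtype).max
    # Convert to Python ints so the comparison below is exact.
    largest = max(abs(int(arr.min())), abs(int(arr.max())))
    return largest * largest > limit

a32 = np.arange(1000000, dtype=np.int32)
a64 = a32.astype(np.int64)

print(square_would_overflow(a32))  # int32 cannot hold 999999**2
print(square_would_overflow(a64))  # int64 can
```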

See http://mail.scipy.org/pipermail/numpy-discussion/2009-April/041691.html for a discussion of this issue on the numpy mailing list.


Python integers do not have this problem, since they are automatically promoted to arbitrary-precision Python long integers on overflow.

So if you manage to overflow int64, one solution is to store Python ints in the numpy array by using dtype=object:

    import numpy
    a = numpy.arange(1000, dtype=object)
    a**20
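As a quick check of the difference, this sketch compares the object-dtype result against an int64 array for the same power. The variable names are mine; note that object arrays fall back to Python-level arithmetic, so they are slower than native integer arrays.

```python
import numpy

exact_value = 999 ** 20   # exact Python int, far larger than int64 can hold

as_object = numpy.arange(1000, dtype=object) ** 20
as_int64 = numpy.arange(1000, dtype=numpy.int64) ** 20

print(as_object[-1] == exact_value)       # object array keeps the exact value
print(int(as_int64[-1]) == exact_value)   # int64 array has silently wrapped
```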

numpy integers have a fixed width, and you are seeing the results of integer overflow.


The solution to this problem is as follows (taken from here):

... change the StringConverter._mapper class attribute (numpy/lib/_iotools.py) from:

    _mapper = [(nx.bool_, str2bool, False),
               (nx.integer, int, -1),
               (nx.floating, float, nx.nan),
               (complex, _bytes_to_complex, nx.nan + 0j),
               (nx.string_, bytes, asbytes('???'))]

to

    _mapper = [(nx.bool_, str2bool, False),
               (nx.int64, int, -1),
               (nx.floating, float, nx.nan),
               (complex, _bytes_to_complex, nx.nan + 0j),
               (nx.string_, bytes, asbytes('???'))]

This resolved a similar problem that I had with numpy.genfromtxt.

Please note that the author describes this as a “temporary” and “not optimal” solution. However, I have had no side effects so far using v2.7.
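If patching numpy itself is not an option, it may be enough to ask numpy.genfromtxt for 64-bit integers explicitly through its dtype argument. This is a sketch of a different approach from the patch above, not the author's method, and I am assuming your file contains a single integer column; the sample data is made up.

```python
import io
import numpy as np

# Hypothetical sample data: values too large for a signed 32-bit integer.
text = io.StringIO("3000000000\n4000000000\n")

# Request a 64-bit integer column explicitly instead of relying on the
# platform's default integer type.
values = np.genfromtxt(text, dtype=np.int64)

print(values.dtype, values)
```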
