Multiplying two positive numbers gives a negative result in Python 3

I have a DataFrame df1:

 df1.head() =
                wght  num_links
 id_y id_x
 3    133   0.000203          2
      186   0.000203          2
 5    6     0.000203          2
      98    0.000203          2
      184   0.000203          2

I need to compute a variable called thr ,

 thr = N*(N-1)*2, 

where N is the number of rows in df1.

The problem is that when I calculate thr, Python returns a negative value (although all inputs are positive):

 ipdb> df1['wght'].count()*(df1['wght'].count()-1)*2
 -712569744

Possible hint

The number of rows N is

 ipdb> df1['wght'].count()
 137736

hence,

 ipdb> 137736*137735*2
 37942135920

Since the maximum value an int32 can hold is 2147483647, I suspect that NumPy treats the result as int32 when it should be int64. Does this make sense?

Please note that I have not included the code that generates df1, because the DataFrame is large:

 ipdb> df1['wght'].count()
 137736

However, if it is needed to reproduce the error, let me know.

Thanks in advance.

3 answers

You are experiencing np.int32 overflow, so use len(df) instead of df.column.count(). len() returns a built-in Python int, which does not overflow.

Here is a small demo:

 In [149]: x = pd.DataFrame(np.random.randint(0,100,size=(137736, 3)), columns=list('ABC'))

 In [150]: x.A.count() * (x.A.count() - 1) * 2
 Out[150]: -712569744

 In [151]: len(x) * (len(x) - 1) * 2
 Out[151]: 37942135920

 In [153]: type(x.A.count())
 Out[153]: numpy.int32

 In [154]: type(len(x))
 Out[154]: int
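
The negative number is exactly what 32-bit two's-complement wraparound predicts. As a rough sketch in plain Python (no NumPy needed; wrap_to_int32 is a hypothetical helper written for this illustration, not a NumPy function), you can reproduce it by keeping only the low 32 bits of the exact product:

```python
def wrap_to_int32(value: int) -> int:
    """Reduce an exact integer to what a signed 32-bit integer would hold."""
    value &= 0xFFFFFFFF        # keep only the low 32 bits
    if value >= 0x80000000:    # top bit set means negative in two's complement
        value -= 0x100000000
    return value


n = 137736                     # row count from the question
exact = n * (n - 1) * 2        # Python ints have arbitrary precision
wrapped = wrap_to_int32(exact)

print(exact)    # 37942135920
print(wrapped)  # -712569744
```

The exact product is larger than 2147483647, so the bits that do not fit are dropped and the remainder is read as a negative number.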

If you check the type of the count() result (i.e. type(df1['wght'].count())), you will get:

 <class 'numpy.int32'> 

So try your calculations with:

 n = df1['wght'].count().astype(np.int64)
 n*(n-1)*2
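
Casting to int64 is enough here because the result fits comfortably in 64 bits. A small sketch in plain Python to check this (fits_signed is a hypothetical helper for illustration, not part of NumPy or pandas):

```python
def fits_signed(value: int, bits: int) -> bool:
    """Return True if value is representable as a signed integer of this width."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return low <= value <= high


n = 137736                 # row count from the question
thr = n * (n - 1) * 2      # exact value, computed with Python ints

print(thr.bit_length())      # 36
print(fits_signed(thr, 32))  # False
print(fits_signed(thr, 64))  # True
```

Keep in mind that int64 is still a fixed-width type, so a much larger N would overflow it as well; a built-in Python int has no such limit.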

You can pass df1['wght'].count() to the int constructor (long in Python 2; long no longer exists in Python 3) so that the result is an arbitrary-precision integer.

 N = int(df1['wght'].count())

Note that simply storing the value in a variable

 N = df1['wght'].count()

is not enough on its own: N is still a numpy.int32, and only the built-in int class has a __mul__ method (which implements *) that grows the result as needed.

In Python 3.x, int and long are unified, so once the value is a built-in int the multiplication cannot overflow.
