Pandas DataFrame apply

I have a Pandas DataFrame with four columns: A, B, C, D. It turns out that sometimes the values of B and C can be 0. Therefore, I want to get the following:

    B[i] = B[i] if B[i] else min(A[i], D[i])
    C[i] = C[i] if C[i] else max(A[i], D[i])

where I used i to indicate a run over all rows of the frame. Pandas makes it easy to find the rows that contain a zero in these columns:

    df[df.B == 0]
    df[df.C == 0]

However, I have no idea how to perform the above transformation easily. I can think of various inefficient and inelegant methods (for loops over the whole frame), but nothing simple.

2 answers

Combining boolean indexing with apply can do the trick. The following is an example of replacing a zero entry in column C.

    In [22]: df
    Out[22]:
       A  B  C  D
    0  8  3  5  8
    1  9  4  0  4
    2  5  4  3  8
    3  4  8  5  1

    In [23]: bi = df.C==0

    In [24]: df.ix[bi, 'C'] = df[bi][['A', 'D']].apply(max, axis=1)

    In [25]: df
    Out[25]:
       A  B  C  D
    0  8  3  5  8
    1  9  4  9  4
    2  5  4  3  8
    3  4  8  5  1
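
Note that `.ix` was deprecated in pandas 0.20 and removed in pandas 1.0, so the session above will not run on recent releases. The following is a minimal sketch of the same boolean-indexing idea using `.loc`. It is not the answerer's original code: it rebuilds the example frame by hand, the name `zero_c` is introduced here for illustration, and it swaps `apply(max, axis=1)` for the built-in row-wise `max(axis=1)`.

```python
import pandas as pd

# Same example frame as in the session above.
df = pd.DataFrame({
    "A": [8, 9, 5, 4],
    "B": [3, 4, 4, 8],
    "C": [5, 0, 3, 5],
    "D": [8, 4, 8, 1],
})

# Boolean mask of the rows where C is zero.
zero_c = df["C"] == 0

# Assign the row-wise maximum of A and D to exactly those rows.
df.loc[zero_c, "C"] = df.loc[zero_c, ["A", "D"]].max(axis=1)

print(df)
```

Column B could presumably be handled the same way, with a mask on `df["B"] == 0` and `min(axis=1)` in place of `max(axis=1)`.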

Try the DataFrame iterrows method to iterate over the rows of the DataFrame. See chapter 6.7.2 of the pandas 0.8.1 manual.

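    import pandas as pd

    df = pd.DataFrame({'A': [5, 6, 3], 'B': [0, 0, 0], 'C': [0, 0, 0], 'D': [3, 4, 5]})

    # iterrows yields a copy of each row, so write the result back through df.loc
    for idx, row in df.iterrows():
        if row['B'] == 0:
            df.loc[idx, 'B'] = min(row['A'], row['D'])
        if row['C'] == 0:
            # the question asks for max, not min, in column C
            df.loc[idx, 'C'] = max(row['A'], row['D'])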
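
For comparison, the same replacement can also be written without an explicit Python loop, which is what the question asks for. The sketch below is an alternative to the `iterrows` approach, not part of this answer: it assumes the same three-row frame, uses `Series.mask` to replace values where a condition holds, and introduces the names `row_min` and `row_max` purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [5, 6, 3], "B": [0, 0, 0], "C": [0, 0, 0], "D": [3, 4, 5]})

# Row-wise minimum and maximum of columns A and D.
row_min = df[["A", "D"]].min(axis=1)
row_max = df[["A", "D"]].max(axis=1)

# Series.mask replaces the values where the condition is True
# and keeps the original values everywhere else.
df["B"] = df["B"].mask(df["B"] == 0, row_min)
df["C"] = df["C"].mask(df["C"] == 0, row_max)

print(df)
```

Because the minimum and maximum are computed for every row up front, this does slightly more work than strictly needed, but it avoids per-row Python calls.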
