Use ternary operator in apply function in pandas DataFrame, without grouping columns

Question

How can I use the ternary operator in a lambda function passed to apply on a pandas DataFrame?

First, here is the R / plyr code that produces the result I want:

ddply(mtcars, .(cyl), summarise, sum(ifelse(carb==4,1,0))/sum(ifelse(carb %in% c(4,1),1,0)))

In the code above, I can use the ifelse function, R's ternary operator, to compute the resulting data frame.

However, when I try the same thing in Python / pandas with the following code:

mtcars.groupby(["cyl"]).apply(lambda x: sum(1 if x["carb"] == 4 else 0) / sum(1 if x["carb"] in (4, 1) else 0))

The following error occurs:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

So, how can I calculate and get the same data frame as in R / plyr?

For your information, if I use the ternary operator without grouping by a column, for example:

mtcars.apply(lambda x: sum(1 if x["carb"] == 4 else 0) / sum(1 if x["carb"] in (4, 1) else 0), axis=1)

I can get the resulting data frame for some reason (but that’s not what I wanted to do).

Thanks.

[Update]

Also, what if I want to use values other than 1 and 0? For example, in R/plyr:

ddply(mtcars, .(cyl), summarise, sum(ifelse(carb==4,6,3))/sum(ifelse(carb %in% c(4,1),8,4)))

How can I do this in pandas?

python pandas

Blaszard, Nov 15 '13 at 4:23

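The problem is that x['carb'] is a numpy-like array (a pandas Series). x['carb'] == 4 compares element by element and returns an array of booleans: True where the element equals 4, and False otherwise. As in numpy, such an array has no single truth value, so it cannot be used directly as the condition of a ternary expression.

You could use .all():

(x['carb'] == 4).all()

This returns True only if every element of (x['carb'] == 4) is True.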

mgilson, Nov 15 '13 at 4:32
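To make the comment above concrete, here is a minimal sketch of why the condition fails and how a reduction resolves it. The Series values are made up for illustration and the variable names are assumptions, not part of the original question.

```python
import pandas as pd

# Illustrative values only; not the real mtcars data.
carb = pd.Series([4, 3, 1, 5, 4, 1])

# Elementwise comparison yields a boolean Series, not a single bool.
mask = carb == 4

# Using the mask as an `if` condition asks for one truth value and fails.
error_message = None
try:
    result = 1 if mask else 0
except ValueError as exc:
    error_message = str(exc)

# Reductions collapse the mask to a single value.
any_four = bool(mask.any())    # at least one element equals 4
all_four = bool(mask.all())    # every element equals 4
count_four = int(mask.sum())   # number of elements equal to 4
```

Note that .all() and .any() answer a yes/no question about the whole group, whereas the R code counts matching rows, which corresponds to summing the mask.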

Roman Pekar · Accepted Answer · 2013-11-15T04:54:35+0000

I think you can do it like this:

mtcars.groupby(["cyl"])['carb'].apply(lambda x: sum((x == 4).astype(float)) / sum(x.isin((4, 1))))

Example:

>>> import pandas as pd
>>> mtcars = pd.DataFrame({'cyl':[8,8,6,6,6,4], 'carb':[4,3,1,5,4,1]})
>>> mtcars
   carb  cyl
0     4    8
1     3    8
2     1    6
3     5    6
4     4    6
5     1    4
>>> mtcars.groupby(["cyl"])['carb'].apply(lambda x: sum((x == 4).astype(float)) / sum(x.isin((4, 1))))
cyl
4      0.0
6      0.5
8      1.0
dtype: float64
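As a possible alternative to the lambda, the same ratio can be sketched without apply by building the boolean masks once and summing them per group. This is only one way to write it, using the same toy data as above; the variable names are assumptions.

```python
import pandas as pd

mtcars = pd.DataFrame({'cyl': [8, 8, 6, 6, 6, 4], 'carb': [4, 3, 1, 5, 4, 1]})

# Boolean masks computed once for the whole column.
is_four = mtcars['carb'] == 4
is_four_or_one = mtcars['carb'].isin([4, 1])

# Summing a boolean Series per group counts the True values in that group.
numerator = is_four.groupby(mtcars['cyl']).sum()
denominator = is_four_or_one.groupby(mtcars['cyl']).sum()

# Elementwise division, aligned on the cyl index.
ratio = numerator / denominator
```

A group with no carb value of 4 or 1 would have a zero denominator, so the result for that group would not be a finite number.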

For values other than 1 and 0, you can use numpy.where():

>>> import numpy as np
>>> mtcars.groupby(["cyl"])['carb'].apply(lambda x: sum(np.where(x == 4,6,3).astype(float)) / sum(np.where(x.isin((4,1)),8,4)))
cyl
4      0.375
6      0.600
8      0.750
dtype: float64
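If the replacement values change often, the numpy.where() expression could be wrapped in a small function. The sketch below assumes a hypothetical helper named weighted_ratio with made-up parameter names; it is not part of pandas or numpy.

```python
import numpy as np
import pandas as pd

def weighted_ratio(carb, num_values=(6, 3), den_values=(8, 4)):
    """Hypothetical helper mirroring the R expression
    sum(ifelse(carb == 4, a, b)) / sum(ifelse(carb %in% c(4, 1), c, d))."""
    # np.where picks the first value where the condition holds, else the second.
    numerator = np.where(carb == 4, num_values[0], num_values[1]).sum()
    denominator = np.where(carb.isin([4, 1]), den_values[0], den_values[1]).sum()
    return numerator / denominator

mtcars = pd.DataFrame({'cyl': [8, 8, 6, 6, 6, 4], 'carb': [4, 3, 1, 5, 4, 1]})

# apply passes each group's carb Series as the first argument.
result = mtcars.groupby('cyl')['carb'].apply(weighted_ratio)
```

With the default values this should reproduce the 0.375, 0.600 and 0.750 shown above for the toy data.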
