Calling functions with multiple arguments when using Groupby

When writing functions to be used with groupby.apply or groupby.transform in pandas, if functions have several arguments, then when calling a function as part of a group, the arguments follow a comma, and not in parentheses. An example is:

def Transfunc(df, arg1, arg2, arg2):
     return something

GroupedData.transform(Transfunc, arg1, arg2, arg3)

Where the df argument is automatically passed as the first argument.

However, the same syntax is not possible when using a function to group data. Take the following example:

people = DataFrame(np.random.randn(5, 5), columns=['a', 'b', 'c', 'd', 'e'], index=['Joe', 'Steve', 'Wes', 'Jim', 'Travis'])
people.ix[2:3, ['b', 'c']] = NA

def MeanPosition(Ind, df, Column):
    if df[Column][Ind] >= np.mean(df[Column]):
        return 'Greater Group'
    else:
        return 'Lesser Group'
# This function compares each data point in column 'a' to the mean of column 'a' and return a group name based on whether it is greater than or less than the mean

people.groupby(lambda x: MeanPosition(x, people, 'a')).mean()

The above works fine, but I can't figure out why I need to wrap a function in lambda. Based on the syntax used in the conversion and application, it seems to me that the following should work just fine:

people.groupby(MeanPosition, people, 'a').mean()

- , , ?

EDIT: , , , . , , . :

def MeanPositionList(df, Column):
    return ['Greater Group' if df[Column][row] >= np.mean(df[Column]) else 'Lesser Group' for row in df.index]

Grouped = people.groupby(np.array(MeanPositionList(people, 'a')))
Grouped.mean()

, , .

+4
1

apply , apply .

groupby , . , ; / .

, , , ( , )

In [22]: def f(x):
   ....:     result = Series('Greater',index=x.index)
   ....:     result[x<x.mean()] = 'Lesser'
   ....:     return result
   ....: 

In [25]: df = DataFrame(np.random.randn(5, 5), columns=['a', 'b', 'c', 'd', 'e'], index=['Joe', 'Joe', 'Wes', 'Wes', 'Travis'])

In [26]: df
Out[26]: 
               a         b         c         d         e
Joe    -0.293926  1.006531  0.289749 -0.186993 -0.009843
Joe    -0.228721 -0.071503  0.293486  1.126972 -0.808444
Wes     0.022887 -1.813960  1.195457  0.216040  0.287745
Wes    -1.520738 -0.303487  0.484829  1.644879  1.253210
Travis -0.061281 -0.517140  0.504645 -1.844633  0.683103

In [27]: df.groupby(df.index.values).transform(f)
Out[27]: 
              a        b        c        d        e
Joe      Lesser  Greater   Lesser   Lesser  Greater
Joe     Greater   Lesser  Greater  Greater   Lesser
Travis  Greater  Greater  Greater  Greater  Greater
Wes     Greater   Lesser  Greater   Lesser   Lesser
Wes      Lesser  Greater   Lesser  Greater  Greater
+3

All Articles