How to use argmin with groupby in pandas

Suppose I have a pandas dataframe:

  cat  val
0   a    1
1   a    6
2   a   12
3   b    2
4   b    5
5   b   11
6   c    4
7   c   22

For each category (each value of "cat"), I want to know the position where the value is closest to a given value, say 5.5. I can subtract the target value and take the absolute value, which gives me something like this:

  cat  val  val_delt
0   a    1       4.5
1   a    6       0.5
2   a   12       6.5
3   b    2       3.5
4   b    5       0.5
5   b   11       5.5
6   c    4       1.5
7   c   22      16.5
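
For reference, that step looks roughly like this. It is only a minimal sketch: the frame is rebuilt by hand, and the name target is just for illustration.

```python
import pandas as pd

# Rebuild the example frame from the question by hand.
df = pd.DataFrame({"cat": list("aaabbbcc"), "val": [1, 6, 12, 2, 5, 11, 4, 22]})

target = 5.5  # the value each category should get close to (illustrative name)
# Absolute distance of every val from the target.
df["val_delt"] = (df["val"] - target).abs()
print(df)
```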

But I'm stuck on where to go next. My first thought was to use argmin() with groupby(), but this gives an error:

In [375]: df.groupby('cat').val_delt.argmin()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-375-a2c3dbc43c50> in <module>()
----> 1 df.groupby('cat').val_delt.argmin()

TypeError: 'Series' object is not callable

Of course, I could come up with some horrible hack in plain Python, where I iterate over all the cat values, select the subset of my data corresponding to each value, perform the argmin operation, and then figure out where that row was in the original dataframe. But there must be a more elegant way to do this.

What I would like is output like this:

  cat  val
1   a    6      
4   b    5       
6   c    4  

Or at least some structure that contains the same information (for example, {'a': 1, 'b': 4, 'c': 6}).


argmin() is not an agg function, but you can use apply to get the index of the closest value in each group:

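import io
import pandas as pd

txt = """  cat  val
0   a    1
1   a    6
2   a   12
3   b    2
4   b    5
5   b   11
6   c    4
7   c   22"""

# Parse the whitespace-separated table; the first column becomes the index.
df = pd.read_csv(io.StringIO(txt), sep=r"\s+", index_col=0)
df["val_delt"] = (df.val - 5.5).abs()
# In current pandas, Series.argmin() returns a position within the group,
# so map it back to the index label before selecting rows.
idx = df.groupby("cat").apply(lambda g: g.index[g.val_delt.argmin()])
df.loc[idx, :]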

The output is:

  cat  val  val_delt
1   a    6       0.5
4   b    5       0.5
6   c    4       1.5

Adding to HYRY's answer, using the newer idxmin. For example:

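import io
import pandas as pd

txt = """  cat  val
0   a    1
1   a    6
2   a   12
3   b    2
4   b    5
5   b   11
6   c    4
7   c   22"""
# Parse the whitespace-separated table; the first column becomes the index.
df = pd.read_csv(io.BytesIO(txt.encode()), sep=r"\s+", index_col=0)
df["val_delt"] = (df.val - 5.5).abs()
# idxmin() returns the index label of the smallest val_delt in each group.
idx = df.groupby("cat").apply(lambda g: g.val_delt.idxmin())
df.loc[idx, :]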
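
The difference between the two calls is easiest to see on a small Series whose labels are not 0, 1, 2. This is a minimal sketch, assuming a recent pandas in which Series.argmin returns an integer position and Series.idxmin returns an index label; the Series itself is made up for illustration.

```python
import pandas as pd

# Hypothetical Series with non-default labels, for illustration only.
s = pd.Series([4.5, 0.5, 6.5], index=[10, 11, 12])

pos = s.argmin()    # integer position of the minimum
label = s.idxmin()  # index label of the minimum

print(pos, label)
```

Because df.loc selects by label, idxmin is the one that can be passed to it directly.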

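There is no need for apply: idxmin can be called on the groupby object directly. If val is set as the index, the result is the closest value itself: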

>>> df['val_delt'] = (df.val - 5.5).abs()
>>> df.set_index('val').groupby('cat').idxmin()
     val_delt
cat          
a           6
b           5
c           4

Alternatively, take idxmin on the grouped val_delt column and pass the resulting labels to loc:

>>> indx = df.groupby('cat')['val_delt'].idxmin()
>>> df.loc[indx]

  cat  val  val_delt
1   a    6       0.5
4   b    5       0.5
6   c    4       1.5
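
If a plain mapping from category to row label is wanted, like the dict mentioned in the question, the idxmin result converts directly. This is a minimal sketch that rebuilds the example frame by hand; the name closest is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({"cat": list("aaabbbcc"), "val": [1, 6, 12, 2, 5, 11, 4, 22]})
df["val_delt"] = (df["val"] - 5.5).abs()

# Row label of the smallest val_delt within each category.
indx = df.groupby("cat")["val_delt"].idxmin()

# Series -> dict mapping category to row label.
closest = indx.to_dict()
print(closest)  # {'a': 1, 'b': 4, 'c': 6}
```

When two rows in a group are equally close, idxmin keeps the first one it encounters.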