In your limited dataset, the following works:
    In [125]: df.groupby('positions')['r vals'].filter(lambda x: len(x) >= 3)
    Out[125]:
    0    1.2
    2    2.3
    3    1.8
    6    1.9
    Name: r vals, dtype: float64
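You can assign the result of this filter and use its index to select the matching rows from your original df. Select by index rather than with isin on the values: isin would also pick up rows from other groups that happen to share a value, such as row 1 here (1.8, position 2):

    In [129]: filtered = df.groupby('positions')['r vals'].filter(lambda x: len(x) >= 3)
              df.loc[filtered.index]
    Out[129]:
       r vals  positions
    0     1.2          1
    2     2.3          1
    3     1.8          1
    6     1.9          1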
In your case you just need to change 3 to 20.
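For reference, here is a minimal self-contained sketch of the series-filter approach with the threshold pulled out into a variable. The sample frame is reconstructed from the outputs in this answer; the 'r vals' for rows 4 and 5 are not shown anywhere, so the values used for them are placeholders, and min_count is just an illustrative name.

```python
import pandas as pd

# Sample frame reconstructed from the outputs in this answer. The 'r vals'
# for rows 4 and 5 are not shown, so 2.0 and 2.5 are placeholder values.
df = pd.DataFrame({
    'r vals': [1.2, 1.8, 2.3, 1.8, 2.0, 2.5, 1.9],
    'positions': [1, 2, 1, 1, 3, 3, 1],
})

min_count = 3  # illustrative name; set this to 20 for the real data

# Keep only the 'r vals' that belong to groups with at least min_count rows
filtered = df.groupby('positions')['r vals'].filter(lambda x: len(x) >= min_count)

# Select by index, so rows from small groups that merely share a value
# with a large group are not picked up
result = df.loc[filtered.index]
print(result)
```

Selecting with the filtered index keeps the row order of the original df.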
Another approach is to use value_counts to build a series of group sizes, which you can then use to filter your df:
    In [136]: counts = df['positions'].value_counts()
              counts
    Out[136]:
    1    4
    3    2
    2    1
    dtype: int64

    In [137]: counts[counts > 3]
    Out[137]:
    1    4
    dtype: int64

    In [135]: df[df['positions'].isin(counts[counts > 3].index)]
    Out[135]:
       r vals  positions
    0     1.2          1
    2     2.3          1
    3     1.8          1
    6     1.9          1
EDIT
If you want to filter the groupby object on the data frame, not the series, you can directly call filter on the groupby object:
    In [139]: filtered = df.groupby('positions').filter(lambda x: len(x) >= 3)
              filtered
    Out[139]:
       r vals  positions
    0     1.2          1
    2     2.3          1
    3     1.8          1
    6     1.9          1
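As a quick sanity check, the sketch below runs the data frame filter and the value_counts approach side by side on the same sample data and compares the results. It assumes the same reconstructed frame (the 'r vals' for rows 4 and 5 are placeholders, since the answer does not show them) and uses the same >= threshold for both, with min_count as an illustrative name.

```python
import pandas as pd

# Sample frame reconstructed from the outputs in this answer. The 'r vals'
# for rows 4 and 5 are not shown, so 2.0 and 2.5 are placeholder values.
df = pd.DataFrame({
    'r vals': [1.2, 1.8, 2.3, 1.8, 2.0, 2.5, 1.9],
    'positions': [1, 2, 1, 1, 3, 3, 1],
})

min_count = 3  # illustrative name; set this to 20 for the real data

# Approach 1: filter the groupby object on the whole data frame
by_filter = df.groupby('positions').filter(lambda g: len(g) >= min_count)

# Approach 2: count rows per position, then keep rows whose position
# appears at least min_count times
counts = df['positions'].value_counts()
by_counts = df[df['positions'].isin(counts[counts >= min_count].index)]

# Both should select the same rows in the same order
print(by_filter.equals(by_counts))
```

The value_counts version avoids calling a Python lambda once per group, which may matter when there are many distinct positions.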
Edchum