Python: checking which cell the value belongs to.

Question

I have a list of values and a list of bin edges. Now I need to check, for every value, which bin it belongs to. Is there a more pythonic way than iterating over the values, then over the bins, and checking whether the value belongs to the current bin, for example:

 my_list = [3, 2, 56, 4, 32, 4, 7, 88, 4, 3, 4]
 bins = [0, 20, 40, 60, 80, 100]
 for i in my_list:
     for j in range(len(bins) - 1):
         if bins[j] < i < bins[j+1]:
             pass  # DO SOMETHING

It doesn't look very pretty to me. Thanks!

+6

python range binning

frixhax Feb 19 '13 at 0:41

3 answers

First of all, your code will fail when a value is exactly equal to a bin edge. Change

 if bins[j] < i < bins[j+1]:

so that one of the comparisons is <=, for example bins[j] <= i < bins[j+1].

After that, use the bisect module:

 import bisect
 bisect.bisect(bins, x)

Here bisect.bisect is an alias for bisect.bisect_right; use bisect.bisect_left instead, depending on whether you want the upper or lower bin when the value lies exactly on a bin edge.
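
As a minimal sketch of the bisect approach (the helper name bin_index is made up for illustration, and half-open bins [lo, hi) are assumed), note that the sorted edge list is the first argument and the value is the second:

```python
import bisect

my_list = [3, 2, 56, 4, 32, 4, 7, 88, 4, 3, 4]
bins = [0, 20, 40, 60, 80, 100]  # edges must be sorted


def bin_index(value, edges):
    """Hypothetical helper: return i with edges[i] <= value < edges[i + 1].

    Returns None when the value lies outside the edges.
    """
    i = bisect.bisect_right(edges, value) - 1
    if i < 0 or i >= len(edges) - 1:
        return None
    return i


indices = [bin_index(v, bins) for v in my_list]
print(indices)
```

Using bisect.bisect_left in the same helper would instead send a value that sits exactly on an edge to the bin below it, so the bins become (lo, hi].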

+3

gbronner Feb 19 '13 at 0:46

Perhaps this will help you on the right track:

 >>> import itertools
 >>> my_list = [3,2,56,4,32,4,7,88,4,3,4]
 >>> for k, g in itertools.groupby(sorted(my_list), lambda x: x // 20 * 20):
 ...     print(k, list(g))
 ...
 0 [2, 3, 3, 4, 4, 4, 4, 7]
 20 [32]
 40 [56]
 80 [88]
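
The key function above relies on every bin being 20 wide. For unevenly spaced edges, one possible variation (a sketch, not part of the original answer; the bins list and the bin_start helper are assumed here purely for illustration) is to key the grouping on the left edge found with bisect:

```python
import bisect
import itertools

my_list = [3, 2, 56, 4, 32, 4, 7, 88, 4, 3, 4]
bins = [0, 10, 50, 100]  # hypothetical, unevenly spaced edges (sorted)


def bin_start(value):
    """Left edge of the bin holding value, assuming bins[0] <= value < bins[-1]."""
    return bins[bisect.bisect_right(bins, value) - 1]


# groupby only merges adjacent items, so the input has to be sorted first.
grouped = {k: list(g) for k, g in itertools.groupby(sorted(my_list), key=bin_start)}
for start, values in grouped.items():
    print(start, values)
```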

+2

sberry Feb 19 '13 at 0:45

Dirk · Accepted Answer · 2013-06-02T11:36:46+0000

Maybe too late, but for future reference, numpy has a function that does just that:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.digitize.html

 >>> import numpy as np
 >>> my_list = [3,2,56,4,32,4,7,88,4,3,4]
 >>> bins = [0,20,40,60,80,100]
 >>> np.digitize(my_list,bins)
 array([1, 1, 3, 1, 2, 1, 1, 5, 1, 1, 1])

The result is an array of indices giving the bin that each item of my_list belongs to. Note that the function also assigns indices to values that fall outside your first and last bin edges:

 >>> my_list = [-5,200]
 >>> np.digitize(my_list,bins)
 array([0, 6])
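
For illustration, here is a short sketch of how the edge handling and the out-of-range indices of np.digitize might be dealt with. The right keyword is part of NumPy's documented signature; the sample values and the masking step are assumptions chosen for this example:

```python
import numpy as np

bins = np.array([0, 20, 40, 60, 80, 100])
values = np.array([-5, 0, 20, 56, 100, 200])  # hypothetical sample values

# Default (right=False): bins[i-1] <= x < bins[i], so 20 gets index 2.
idx_left_closed = np.digitize(values, bins)

# right=True: bins[i-1] < x <= bins[i], so 20 gets index 1.
idx_right_closed = np.digitize(values, bins, right=True)

# With the default, index 0 means "below the first edge" and
# len(bins) means "at or beyond the last edge".
in_range = (idx_left_closed > 0) & (idx_left_closed < len(bins))

print(idx_left_closed)
print(idx_right_closed)
print(values[in_range])
```

Filtering with a boolean mask like this is only one option; clipping the indices or extending the edges with -inf and inf would be alternatives, depending on what should happen to outliers.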

Pandas also has something similar:

http://pandas.pydata.org/pandas-docs/dev/basics.html#discretization-and-quantiling

 >>> pd.cut(my_list, bins)
 Categorical:
 array(['(0, 20]', '(0, 20]', '(40, 60]', '(0, 20]', '(20, 40]', '(0, 20]',
        '(0, 20]', '(80, 100]', '(0, 20]', '(0, 20]', '(0, 20]'], dtype=object)
 Levels (5): Index(['(0, 20]', '(20, 40]', '(40, 60]', '(60, 80]', '(80, 100]'], dtype=object)
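
Note that the intervals reported by pd.cut are closed on the right, (0, 20], whereas np.digitize by default treats bins as closed on the left. A small sketch, assuming a reasonably recent pandas, of asking pd.cut for left-closed bins and plain integer codes through its right and labels parameters:

```python
import pandas as pd

my_list = [3, 2, 56, 4, 32, 4, 7, 88, 4, 3, 4]
bins = [0, 20, 40, 60, 80, 100]

# right=False makes the intervals [0, 20), [20, 40), ...
# labels=False returns 0-based bin numbers instead of interval labels.
codes = pd.cut(my_list, bins, right=False, labels=False)
print(list(codes))
```

These codes are 0-based, so for in-range values they are one less than the indices np.digitize returns; values outside the edges come back as NaN.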
