Pandas binning a list based on qcut of another list

Question

Say I have a list:

a = [3, 5, 1, 1, 3, 2, 4, 1, 6, 4, 8]

and another list b:

 b = [5, 2, 6, 8]

I would like to get bins using pd.qcut(a, 2) and count the number of values from list b that fall in each bin. That is:

 In[84]: pd.qcut(a,2)
 Out[84]:
 Categorical:
 [[1, 3], (3, 8], [1, 3], [1, 3], [1, 3], [1, 3], (3, 8], [1, 3], (3, 8], (3, 8], (3, 8]]
 Levels (2): Index(['[1, 3]', '(3, 8]'], dtype=object)

Now I know that the bins are [1, 3] and (3, 8], and I would like to know how many values of list "b" fall in each bin. I can do this by hand when the number of bins is small, but what is the best approach when the number of bins is large?

python pandas binning

user2921752 Jan 2 '14 at 10:42

2 answers

alko · Answer 1 · 2014-01-02T23:17:09+0000

You can use the retbins parameter to make qcut return the bin edges as well:

 >>> q, bins = pd.qcut(a, 2, retbins=True)

Then use pd.cut to get the bin index of each value in b relative to those bins:

 >>> b = np.array(b)
 >>> # Note: in later pandas versions .labels was renamed to .codes
 >>> hist = pd.cut(b, bins, right=True).labels
 >>> hist[b==bins[0]] = 0
 >>> hist
 array([1, 0, 1, 1])

Please note that you need to handle the corner case bins[0] separately, because with right=True cut does not include the lowest edge in the leftmost bin.
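As a rough sketch of an alternative to patching the corner case by hand, pd.cut accepts include_lowest=True, which makes the first interval closed on the left. The extra value 1 appended to b below is a hypothetical addition, chosen only so that the corner case actually occurs; .codes is used in place of .labels on the assumption of a recent pandas version.

```python
import pandas as pd

a = [3, 5, 1, 1, 3, 2, 4, 1, 6, 4, 8]
b = [5, 2, 6, 8, 1]  # the trailing 1 is a hypothetical value equal to the lowest edge

# Bin edges derived from a: two quantile bins
_, bins = pd.qcut(a, 2, retbins=True)

# Default behaviour: the lowest edge is excluded, so 1 lands in no bin (code -1)
codes_without = pd.cut(b, bins, right=True).codes

# With include_lowest=True the first bin also contains its left edge
codes_with = pd.cut(b, bins, right=True, include_lowest=True).codes

print(codes_without.tolist())
print(codes_with.tolist())
```

A code of -1 marks a value that fell outside every bin, so comparing the two results shows which values sit exactly on the lowest edge.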

feetwet · Answer 2 · 2017-01-18T19:58:49+0000

As shown in an earlier answer, you can get the bin edges from qcut using the retbins parameter:

 q, bins = pd.qcut(a, 2, retbins=True)

You can then use cut to put values from another list into these “bins”. For instance:

 myList = np.random.random(100)
 # Define bin bounds that cover the range returned by random()
 bins = [0, .1, .9, 1]
 # Now we can get the "bin number" of each value in myList:
 binNum = pd.cut(myList, bins, labels=False, include_lowest=True)
 # And then we can count the number of values in each bin number:
 np.bincount(binNum)
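Applied to the lists from the question, the same steps might look like the sketch below. The minlength argument is an addition that is not part of the answer; it is there so that empty bins at the top of the range still show up as zero counts.

```python
import numpy as np
import pandas as pd

a = [3, 5, 1, 1, 3, 2, 4, 1, 6, 4, 8]
b = [5, 2, 6, 8]

# Take the quantile bin edges from a
_, bins = pd.qcut(a, 2, retbins=True)

# Integer bin number for each value in b (all values of b lie inside a's range)
bin_num = pd.cut(b, bins, labels=False, include_lowest=True)

# One count per bin; minlength keeps trailing empty bins in the result
counts = np.bincount(bin_num, minlength=len(bins) - 1)
print(counts)
```

Here one value of b (2) falls in the first bin and three values (5, 6, 8) fall in the second. This only works while every value of b lies inside the bins, because a value outside them becomes NaN and np.bincount rejects it.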

Make sure that your bin edges cover the entire range of values in the second list. To ensure this, you can pad the bin edges with a minimum and a maximum value. For instance:

 cutBins = [float('-inf')] + bins.tolist() + [float('inf')]
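To illustrate the padding, here is a small sketch with the question's list a and a version of b extended by two hypothetical out-of-range values (0 and 10). The snake_case names are illustrative only.

```python
import numpy as np
import pandas as pd

a = [3, 5, 1, 1, 3, 2, 4, 1, 6, 4, 8]
b = [5, 2, 6, 8, 0, 10]  # 0 and 10 are hypothetical values outside a's range

_, bins = pd.qcut(a, 2, retbins=True)

# Pad the quantile edges with open-ended bins on both sides
cut_bins = [float('-inf')] + bins.tolist() + [float('inf')]

# No value can fall outside the padded edges, so no NaN is produced
bin_num = pd.cut(b, cut_bins, labels=False)
counts = np.bincount(bin_num, minlength=len(cut_bins) - 1)
print(counts)
```

Keep in mind that padding shifts the bin numbers by one: bin 0 is now everything up to and including the original lowest edge, and the last bin holds everything above the original highest edge.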
