Range-based bin values with pandas

Question

I have several CSV files in a folder, with values similar to this:

GroupID.csv is the name of the file. There are several such files, but the value ranges for all of them are defined in a single XML file. I am trying to group the values into those ranges. How can I do this?

Update 1: Based on BobHaffner's comments, I did this:

    import glob
    import os

    import pandas as pd

    path = r'path/to/files'
    allFiles = glob.glob(path + "/*.csv")

    list_ = []
    for file_ in allFiles:
        df = pd.read_csv(file_, index_col=None, header=None)
        # Record which file each row came from
        df['file'] = os.path.basename(file_)
        list_.append(df)

    # Stack the per-file frames into one DataFrame
    frame = pd.concat(list_)
    print(frame)

to get something like this:

I need to group the values based on the bins from the XML file. I would really appreciate any help.

python numpy pandas csv

pam Jul 31 '15 at 1:23

1 answer

firelynx · Accepted Answer · 2015-07-31T07:50:51+0000

To bin a series, use the pd.cut() function, for example:

    df['bin'] = pd.cut(df['1'], [0, 50, 100, 200])

             0    1        file         bin
    0  person1   24     age.csv     (0, 50]
    1  person2   17     age.csv     (0, 50]
    2  person3   98     age.csv   (50, 100]
    3  person4    6     age.csv     (0, 50]
    4  person2  166  Height.csv  (100, 200]
    5  person3  125  Height.csv  (100, 200]
    6  person5  172  Height.csv  (100, 200]
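For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the frame from the rows shown above instead of reading the CSV files. It assumes the columns came from read_csv(..., header=None), in which case the labels are the integers 0 and 1, so the column is selected as df[1] rather than df['1']. The extra `outside` series is only an illustration I added to show the default interval behaviour.

```python
import pandas as pd

# Rows copied from the output above. With header=None the column
# labels are the integers 0 and 1 (an assumption about how df was built).
df = pd.DataFrame({
    0: ["person1", "person2", "person3", "person4",
        "person2", "person3", "person5"],
    1: [24, 17, 98, 6, 166, 125, 172],
    "file": ["age.csv"] * 4 + ["Height.csv"] * 3,
})

# Edges 0, 50, 100, 200 define the intervals (0, 50], (50, 100], (100, 200].
# By default each interval is open on the left and closed on the right.
df["bin"] = pd.cut(df[1], [0, 50, 100, 200])

# Illustration only: values that fall outside every interval become NaN.
# 0 is excluded (left edge is open), 50 is included, 250 is too large.
outside = pd.cut(pd.Series([0, 50, 250]), [0, 50, 100, 200])

print(df)
print(outside)
```

If the smallest edge itself should count as part of the first bin, pd.cut() accepts include_lowest=True.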

If you want to name the bins yourself, you can use the labels= argument, for example:

    df['bin'] = pd.cut(df['1'], [0, 50, 100, 200], labels=['0-50', '50-100', '100-200'])

             0    1        file      bin
    0  person1   24     age.csv     0-50
    1  person2   17     age.csv     0-50
    2  person3   98     age.csv   50-100
    3  person4    6     age.csv     0-50
    4  person2  166  Height.csv  100-200
    5  person3  125  Height.csv  100-200
    6  person5  172  Height.csv  100-200
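Since the question mentions that each file has its own ranges, the same idea can be applied per file. The sketch below is one possible way to do that, not the only one. The edges_by_file dictionary and its numbers are hypothetical: they stand in for whatever ranges you read from your XML file, whose layout is not shown in the question. The frame is again rebuilt from the rows above, assuming integer column labels from header=None.

```python
import pandas as pd

df = pd.DataFrame({
    0: ["person1", "person2", "person3", "person4",
        "person2", "person3", "person5"],
    1: [24, 17, 98, 6, 166, 125, 172],
    "file": ["age.csv"] * 4 + ["Height.csv"] * 3,
})

# HYPOTHETICAL bin edges per file, standing in for the ranges
# that would be parsed from the XML file.
edges_by_file = {
    "age.csv": [0, 18, 65, 120],
    "Height.csv": [100, 150, 200],
}

parts = []
for file_name, group in df.groupby("file"):
    edges = edges_by_file[file_name]
    # Build one label per interval, e.g. "0-18", "18-65", "65-120".
    labels = [f"{lo}-{hi}" for lo, hi in zip(edges[:-1], edges[1:])]
    binned = pd.cut(group[1], edges, labels=labels)
    # Convert to plain strings so bins from different files can share a column.
    parts.append(binned.astype(str))

# The pieces keep the original row index, so assignment lines them up.
df["bin"] = pd.concat(parts)

# Number of rows per file and bin.
counts = df.groupby(["file", "bin"]).size()

print(df)
print(counts)
```

Converting each piece to strings is a deliberate simplification: every file has a different set of categories, and plain strings avoid having to merge them into one categorical type.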
