Pandas: use iterrows on a subset of a DataFrame

What is the best way to use iterrows on a subset of a DataFrame?

Take the following simple example:

    import datetime as DT
    import pandas as pd

    df = pd.DataFrame({
        'Product': list('AAAABBAA'),
        'Quantity': [5, 2, 5, 10, 1, 5, 2, 3],
        'Start': [
            DT.datetime(2013, 1, 1, 9, 0),
            DT.datetime(2013, 1, 1, 8, 5),
            DT.datetime(2013, 2, 5, 14, 0),
            DT.datetime(2013, 2, 5, 16, 0),
            DT.datetime(2013, 2, 8, 20, 0),
            DT.datetime(2013, 2, 8, 16, 50),
            DT.datetime(2013, 2, 8, 7, 0),
            DT.datetime(2013, 7, 4, 8, 0)]})
    df = df.set_index(['Start'])

Now, I would like to change a subset of this DataFrame using the iterrows function, for example:

    for i, row_i in df[df.Product == 'A'].iterrows():
        row_i['Product'] = 'A1'  # actually a more complex calculation

However, the changes are not saved.

Is it possible (other than a manual lookup using the index "i") to make permanent changes to the original DataFrame?

2 answers

Why do we need iterrows() for this? I think it is almost always better to use vectorized operations in pandas (or numpy):

    df.loc[df['Product'] == 'A', 'Product'] = 'A1'
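If the real calculation is more involved than a constant, the same masked assignment still applies. The following is only a sketch: the relabel function and its 'A_big'/'A_small' rule are made up for illustration, and the timestamps are written as strings for brevity. It also assumes the index is unique, since the result is aligned back to the original rows by index label.

```python
import pandas as pd

# Same data as in the question, with the timestamps written as strings.
df = pd.DataFrame({
    'Product': list('AAAABBAA'),
    'Quantity': [5, 2, 5, 10, 1, 5, 2, 3],
    'Start': pd.to_datetime([
        '2013-01-01 09:00', '2013-01-01 08:05', '2013-02-05 14:00',
        '2013-02-05 16:00', '2013-02-08 20:00', '2013-02-08 16:50',
        '2013-02-08 07:00', '2013-07-04 08:00']),
}).set_index('Start')


def relabel(row):
    # Hypothetical rule, standing in for the "more complex calculation".
    return 'A_big' if row['Quantity'] >= 5 else 'A_small'


mask = df['Product'] == 'A'
# apply() runs on the subset only; .loc writes the result into the original df.
df.loc[mask, 'Product'] = df.loc[mask].apply(relabel, axis=1)
```

Rows where the mask is False (the 'B' products here) are left untouched, and the change lands in df itself rather than in a temporary copy.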

I think the best approach is to build a new vector holding the desired result, using whatever operation or loop you need, and then assign it back to the column:

    # make a copy of the column
    P = df.Product.copy()
    # do the operation, or loop if you really must
    P[P == "A"] = "A1"
    # reassign to the original df
    df["Product"] = P
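For the "loop if you really must" case, one possible sketch is to collect the per-row results in a list and write them into the copied column through the same boolean mask. The new labels (product name plus quantity) are invented purely for illustration, and the timestamps are again written as strings. Because the values are filled in row order, this variant does not need to look anything up by the index "i".

```python
import pandas as pd

# Same data as in the question, with the timestamps written as strings.
df = pd.DataFrame({
    'Product': list('AAAABBAA'),
    'Quantity': [5, 2, 5, 10, 1, 5, 2, 3],
    'Start': pd.to_datetime([
        '2013-01-01 09:00', '2013-01-01 08:05', '2013-02-05 14:00',
        '2013-02-05 16:00', '2013-02-08 20:00', '2013-02-08 16:50',
        '2013-02-08 07:00', '2013-07-04 08:00']),
}).set_index('Start')

mask = df['Product'] == 'A'

# Loop over the subset and collect one new value per row.
# The label below is a hypothetical stand-in for the real calculation.
new_values = []
for _, row_i in df[mask].iterrows():
    new_values.append(f"A{row_i['Quantity']}")

# Work on a copy of the column, fill the masked positions in row order,
# then assign the whole column back to the original DataFrame.
P = df['Product'].copy()
P.loc[mask] = new_values
df['Product'] = P
```

The loop only reads from the rows that iterrows yields; all writing happens on P and on df directly, which is why the result persists.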