Can I set DataFrame values without using iterrows()?

Source DataSet

    In [2]: import pandas as pd
       ...:
       ...: # Original DataSet
       ...: d = {'A': [1,1,1,1,2,2,2,2,3],
       ...:      'B': ['a','a','a','x','b','b','b','x','c'],
       ...:      'C': [11,22,33,44,55,66,77,88,99],}
       ...:
       ...: df = pd.DataFrame(d)
       ...: df
    Out[2]:
       A  B   C
    0  1  a  11
    1  1  a  22
    2  1  a  33
    3  1  x  44
    4  2  b  55
    5  2  b  66
    6  2  b  77
    7  2  x  88
    8  3  c  99

Given a DataFrame, I would like a flexible, efficient way to reset specific values based on conditions in two columns.

Conditions:

  • in column B: for any row with a value of "x",
  • in column C: set the value of these row items to the value of the next row.

Desired Result

    Out[3]:
       A  B   C
    0  1  a  11
    1  1  a  22
    2  1  a  33
    3  1  x  55
    4  2  b  55
    5  2  b  66
    6  2  b  77
    7  2  x  99
    8  3  c  99

I found out that I can accomplish this using iterrows() (see below),

    # Code that produces the above outcome
    for idx, x_row in df[df['B'] == 'x'].iterrows():
        df.loc[idx, 'C'] = df.loc[idx+1, 'C']
    df

but I need to do this many times, and I understand that iterrows() is slow. Is there a better, more pandas-y, vectorized way to get the desired result more efficiently?

1 answer

This should do what you want:

    df.loc[df.B == 'x', 'C'] = df.C.shift(-1)
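A self-contained sketch of the same idea on the question's data may help show what each piece does. It is written with .loc rather than chained indexing, on the assumption that this is the safer form: a chained assignment such as df.C[mask] = ... can raise SettingWithCopyWarning and does not modify df when pandas Copy-on-Write is enabled. The names mask and next_c are only illustrative.

```python
import pandas as pd

# Data from the question
df = pd.DataFrame({
    'A': [1, 1, 1, 1, 2, 2, 2, 2, 3],
    'B': ['a', 'a', 'a', 'x', 'b', 'b', 'b', 'x', 'c'],
    'C': [11, 22, 33, 44, 55, 66, 77, 88, 99],
})

# Boolean mask selecting the rows to overwrite (hypothetical name)
mask = df['B'] == 'x'

# shift(-1) moves every value of C up by one row, so position i holds
# the value from row i + 1; the last position becomes NaN
next_c = df['C'].shift(-1)

# .loc aligns next_c on the index and writes only the masked rows
df.loc[mask, 'C'] = next_c

print(df)
```

One caveat: if the last row had B == 'x', there would be no next row, so that cell would receive NaN and column C would become float. Passing fill_value to shift is one way to choose a different default for that case.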