I get better performance using the ne method instead of comparing with the != operator:
df['changed'] = df['ColumnB'].ne(df['ColumnB'].shift().bfill()).astype(int)
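To make it concrete what this line computes, here is a minimal sketch on made-up data. The original df isn't shown in this answer, so the sample values below are an assumption; only the column name ColumnB comes from the code above.

```python
import pandas as pd

# Hypothetical sample data; the real DataFrame is not shown in the answer.
df = pd.DataFrame({"ColumnB": ["a", "a", "b", "b", "a"]})

# Value from the previous row. shift() leaves NaN in the first row;
# bfill() replaces it with the first real value, so row 0 compares equal to itself.
previous = df["ColumnB"].shift().bfill()

# 1 where the value differs from the previous row, 0 otherwise.
df["changed"] = df["ColumnB"].ne(previous).astype(int)
print(df["changed"].tolist())  # [0, 0, 1, 0, 1]

# Without bfill(), the first row is compared against NaN and gets flagged.
no_bfill = df["ColumnB"].ne(df["ColumnB"].shift()).astype(int)
print(no_bfill.tolist())  # [1, 0, 1, 0, 1]
```

So the two variants only differ in the first row: with bfill() it is never marked as changed.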
Timings
Using the following setup to create a larger DataFrame:
df = pd.concat([df]*10**5, ignore_index=True)
I get the following timings:
%timeit df['ColumnB'].ne(df['ColumnB'].shift().bfill()).astype(int)
10 loops, best of 3: 38.1 ms per loop

%timeit (df.ColumnB != df.ColumnB.shift()).astype(int)
10 loops, best of 3: 77.7 ms per loop

%timeit df['ColumnB'] == df['ColumnB'].shift(1).fillna(df['ColumnB'])
10 loops, best of 3: 99.6 ms per loop

%timeit (df.ColumnB.ne(df.ColumnB.shift())).astype(int)
10 loops, best of 3: 19.3 ms per loop
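If you want to rerun the comparison outside IPython, something like the following rough sketch should work with the standard timeit module. The starting data is made up (an assumption, since the original df isn't shown), and the numbers will differ by machine and pandas version, so treat it as a way to check the relative ordering rather than to reproduce the figures above.

```python
import timeit

import pandas as pd

# Hypothetical starting data, enlarged the same way as in the answer.
df = pd.DataFrame({"ColumnB": ["a", "a", "b", "b", "a"]})
df = pd.concat([df] * 10**5, ignore_index=True)

# The variants being compared, wrapped so timeit can call them.
candidates = {
    "ne + bfill": lambda: df["ColumnB"].ne(df["ColumnB"].shift().bfill()).astype(int),
    "!=": lambda: (df["ColumnB"] != df["ColumnB"].shift()).astype(int),
    "ne": lambda: df["ColumnB"].ne(df["ColumnB"].shift()).astype(int),
}

results = {}
for name, func in candidates.items():
    # Best of 3 repeats, 5 calls per repeat, reported per call in milliseconds.
    best = min(timeit.repeat(func, number=5, repeat=3)) / 5
    results[name] = best * 1e3
    print(f"{name}: {results[name]:.1f} ms per loop")
```

Note that the != and ne variants without bfill() return the same result, so any timing difference between them is purely overhead.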
root