- Determine where the cumulative sum is greater than or equal to 30
- Mask (set to 0) the rows where it is not
- Reassign the first row that reaches 30 as its cumulative sum less 30
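    c = df.vals.cumsum()         # running total
    m = c.ge(30)                 # True where the running total is >= 30
    i = m.idxmax()               # index label of the first True
    n = df.vals.where(m, 0)      # keep values where m is True, else 0
    n.loc[i] = c.loc[i] - 30     # the crossing row keeps only the remainder
    df.assign(vals=n)

       vals
    0     0
    1     0
    2     0
    3     5
    4    20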
The same approach, NumPy-fied:
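    v = df.vals.values           # underlying NumPy array
    c = v.cumsum()               # running total
    m = c >= 30                  # True where the running total is >= 30
    i = m.argmax()               # position of the first True
    n = np.where(m, v, 0)        # new array: values where m is True, else 0
    n[i] = c[i] - 30             # the crossing row keeps only the remainder
    df.assign(vals=n)

       vals
    0     0
    1     0
    2     0
    3     5
    4    20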
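For anyone who wants to run this end to end, here is a minimal self-contained sketch of the NumPy version wrapped in a function. The function name `subtract_from_cumsum`, its parameters, and the sample data are my own assumptions (the input frame is chosen only so that the result matches the output above). It also adds a guard that is not in the snippets above: if the running total never reaches the amount, `argmax` on an all-False mask returns 0 and the first row would go negative, so in that case I assume every row should simply become 0.

```python
import numpy as np
import pandas as pd


def subtract_from_cumsum(df, amount=30, col="vals"):
    """Zero rows until the running total reaches `amount`; keep the remainder."""
    v = df[col].to_numpy()
    c = v.cumsum()
    m = c >= amount
    if not m.any():
        # Assumption: if the total never reaches `amount`, everything is consumed.
        return df.assign(**{col: np.zeros_like(v)})
    i = m.argmax()          # first position where the running total reaches `amount`
    n = np.where(m, v, 0)   # keep values at and after that position, zero the rest
    n[i] = c[i] - amount    # the crossing row keeps only what is left over
    return df.assign(**{col: n})


# Hypothetical input, chosen so the result matches the output shown above.
df = pd.DataFrame({"vals": [10, 10, 5, 10, 20]})
result = subtract_from_cumsum(df)
print(result)
```

With this input the running total is 10, 20, 25, 35, 55, so the row at index 3 is the first to reach 30 and keeps 35 - 30 = 5.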
The timing
    %%timeit
    v = df.vals.values
    c = v.cumsum()
    m = c >= 30
    i = m.argmax()
    n = np.where(m, v, 0)
    n[i] = c[i] - 30
    df.assign(vals=n)

    10000 loops, best of 3: 168 µs per loop

    %%timeit
    c = df.vals.cumsum()
    m = c.ge(30)
    i = m.idxmax()
    n = df.vals.where(m, 0)
    n.loc[i] = c.loc[i] - 30
    df.assign(vals=n)

    1000 loops, best of 3: 853 µs per loop
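`%%timeit` is an IPython cell magic, so it only works in IPython or a notebook. Outside of one, a rough equivalent can be sketched with the standard `timeit` module. The function names and the sample frame below are assumptions (the frame used for the timings above is not shown), and the numbers will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data; the frame used for the timings above is not shown.
df = pd.DataFrame({"vals": [10, 10, 5, 10, 20]})


def pandas_version(df):
    c = df.vals.cumsum()
    m = c.ge(30)
    i = m.idxmax()
    n = df.vals.where(m, 0)
    n.loc[i] = c.loc[i] - 30
    return df.assign(vals=n)


def numpy_version(df):
    v = df.vals.values
    c = v.cumsum()
    m = c >= 30
    i = m.argmax()
    n = np.where(m, v, 0)
    n[i] = c[i] - 30
    return df.assign(vals=n)


# Both versions should agree before their speed is compared.
assert pandas_version(df).equals(numpy_version(df))

for fn in (pandas_version, numpy_version):
    # Best of 3 repeats of 200 calls each, reported per call in microseconds.
    best = min(timeit.repeat(lambda: fn(df), number=200, repeat=3)) / 200
    print(f"{fn.__name__}: {best * 1e6:.0f} µs per loop")
```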