I have good news and bad news. The good news: I have a vectorized version that is about 300 times faster. The bad news: I could not reproduce your exact results with it. Still, the principles here should let you speed up your code significantly, even if this exact code does not replicate your output yet.
df['result'] = np.where(df['A'] > 0,
                        df.shift(365).rolling(10).B.mean(),
                        df.shift(365).rolling(20).B.mean())
The hard (slow) part of your code is this:
df2 = df[df.dte < lastyear].head(depth)
However, as long as last year's date is always exactly 365 rows back, you can use an expression that is vectorized and much faster:
df.shift(365).rolling(10).B.mean()
shift(365) replaces df.dte < lastyear, and rolling().mean() replaces head().mean(). It will be much faster and use much less memory.
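To make the idea concrete, here is a minimal, self-contained sketch of the vectorized version on synthetic data (the columns A and B and the random values are stand-ins, not your actual data):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the original DataFrame: columns A and B are assumed.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "A": rng.normal(size=1000),
    "B": rng.normal(size=1000),
})

# shift(365) looks 365 rows back; rolling(n).mean() then averages the
# n shifted values ending at each row. The first 365 + n - 1 rows are NaN.
short_mean = df["B"].shift(365).rolling(10).mean()
long_mean = df["B"].shift(365).rolling(20).mean()

# np.where picks the 10-row or 20-row mean depending on the sign of A,
# all in one vectorized pass instead of a Python loop.
df["result"] = np.where(df["A"] > 0, short_mean, long_mean)
```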
And in fact, even if your dates are not perfectly regular, you can probably rework your data to fit this pattern. Somewhat equivalently, if you make the date your index, shift can work by frequency rather than by row count (for example, shifting by 365 days even when that is not 365 rows). Making dte your index would probably be worthwhile here regardless.
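A hedged sketch of the frequency-based variant, assuming dte has been made a DatetimeIndex (the dates and values below are invented for illustration):

```python
import pandas as pd

# Toy data with a DatetimeIndex standing in for a "dte" index.
df = pd.DataFrame(
    {"B": range(400)},
    index=pd.date_range("2020-01-01", periods=400, freq="D"),
)
# Drop a row so the dates are irregular (no longer 365 rows per year).
df = df.drop(df.index[3])

# shift(freq="365D") moves each value forward by 365 calendar days;
# reindexing back onto the current index then gives, at each date, the
# value from exactly one year earlier -- or NaN if that date is missing.
last_year = df["B"].shift(freq="365D").reindex(df.index)
```

Dates whose year-ago counterpart was dropped come back as NaN, which is usually the behavior you want with gappy data.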
John