First, your last line of code:

    hdf = pd.DataFrame(prtns)/(pd.DataFrame(h).cummax()[1:len(h)]-1))[1:len(h)]]

may not be right. Perhaps this matches your R code:

    hdf = (pd.DataFrame(prtns)/(pd.DataFrame(h).cummax()[1:len(h)])-1)[1:len(h)]
Secondly, c(1,p.rtns) can be replaced with np.hstack((1, prtns)) instead of converting the np.array to a list. Note that np.hstack takes a single sequence of arrays, hence the double parentheses.
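A minimal sketch of that replacement, assuming `prtns` is a 1-D NumPy array of cumulative returns (the values below are made up for illustration). `np.hstack` takes one sequence of arrays, so the scalar and the array are passed together as a tuple:

```python
import numpy as np

# Hypothetical cumulative returns, for illustration only
prtns = np.array([1.02, 0.99, 1.05])

# Equivalent of R's c(1, p.rtns): prepend the starting value 1
h = np.hstack((1, prtns))
print(h)
```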
Thirdly, it looks like you are using pandas only for cummax(). That is easy to implement yourself, for example:

    def cummax(a):
        ac = a.copy()
        if a.size > 0:
            max_idx = np.argmax(a)
            ac[max_idx:] = np.max(ac)
            ac[:max_idx] = cummax(ac[:max_idx])
        return ac

and

    >>> a=np.random.randint(0,20,size=10)
    >>> a
    array([15, 15, 15,  8,  5, 14,  6, 18,  9,  1])
    >>> cummax(a)
    array([15, 15, 15, 15, 15, 15, 15, 18, 18, 18])

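As an alternative to the recursive version, NumPy's `np.maximum.accumulate` computes the same running maximum without recursion. A short sketch on the same sample values as the session above:

```python
import numpy as np

a = np.array([15, 15, 15, 8, 5, 14, 6, 18, 9, 1])

# Running maximum: each element is the largest value seen so far
running_max = np.maximum.accumulate(a)
print(running_max)  # [15 15 15 15 15 15 15 18 18 18]
```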
Putting it all together:
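
    def run_simulation(mu, sigma, days, n):
        result = []
        for i in range(n):
            rtns = np.random.normal(loc=1.*mu/days, scale=(((1./days)**0.5)*sigma), size=days)
            p_rtns = (rtns+1).cumprod()
            tot_rtn = p_rtns[-1]-1
            # Reconstructed from the hdf expression above: drawdown relative
            # to the running maximum of (1, p_rtns)
            h = cummax(np.hstack((1, p_rtns)))
            result.append((tot_rtn, np.min(p_rtns/h[1:]-1)))
        return result
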
and

    >>> run_simulation(0.06, 0.2, 250,10)
    [(0.096077511394818016, -0.16621830496112056),
     (0.73729333554192, -0.13566124517484235),
     (0.087761655465907973, -0.17862916081223446),
     (0.07434851091082928, -0.15972961033789046),
     (-0.094464694393288307, -0.2317397117033817),
     (-0.090720761054686627, -0.1454002204893271),
     (0.02221364097529932, -0.15606214341947877),
     (-0.12362835704696629, -0.24323096421682033),
     (0.023089144896788261, -0.16916790589553599),
     (0.39777037782177493, -0.10524624505023494)]

The loop is not strictly needed: we can work in two dimensions by generating a 2D array of Gaussian random variables (changing size=days to size=(days, n)). Avoiding the loop will most likely be faster. However, that requires a different cummax() function, because the one shown here is limited to 1D. cummax() in R is effectively limited to 1D as well (if you pass it a 2D array, the array is flattened). So, to keep things simple and comparable between Python and R, this version keeps the loop.
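For reference, here is a sketch of what the 2D variant described above could look like. It is not part of the comparison with R. The function name `run_simulation_2d` is hypothetical, and `np.maximum.accumulate` with `axis=0` stands in for a column-wise cummax(). It assumes the second value in each result tuple is the minimum of the drawdown series, as in the hdf expression:

```python
import numpy as np

def run_simulation_2d(mu, sigma, days, n):
    # One column per simulated path, one row per day
    rtns = np.random.normal(loc=mu / days,
                            scale=sigma * (1.0 / days) ** 0.5,
                            size=(days, n))
    p_rtns = (rtns + 1).cumprod(axis=0)
    tot_rtn = p_rtns[-1] - 1
    # Prepend a row of ones (the starting value), then take the
    # running maximum down each column
    h = np.maximum.accumulate(np.vstack((np.ones((1, n)), p_rtns)), axis=0)
    # Worst drawdown per path, relative to the running maximum
    max_dd = (p_rtns / h[1:] - 1).min(axis=0)
    return list(zip(tot_rtn, max_dd))

result = run_simulation_2d(0.06, 0.2, 250, 10)
```

Each entry of `result` is a `(total return, worst drawdown)` pair for one path, in the same layout as the looped version's output.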