Implementing classic martingale using Python and Pandas

I want to implement classic martingale using Python and Pandas in a betting system.

Say this DataFrame is defined as

df = pd.DataFrame(np.random.randint(0,2,100)*2-1, columns=['TossResults']) 

so it contains a series of toss results (-1 = lose, 1 = win).

I would like to set the stake (the size of each bet) using the classic martingale.

The initial stake is 1.

If I lose a bet, the next stake is 2 times the previous stake (multiplier = 2).

If I win, the next stake resets to the initial stake.

I made a function

    def stake_martingale_classical(stake_previous, result_previous, multiplier, stake_initial):
        if result_previous == -1:  # lose
            stake = stake_previous * multiplier
        elif result_previous == 1:  # win
            stake = stake_initial
        else:
            raise Exception('Error result_previous must be equal to 1 (win) or -1 (lose)')
        return stake

but I don’t know how to use it effectively with Pandas. I tried this:

    initial_stake = 1
    df['Stake'] = None
    df.loc[0, 'Stake'] = initial_stake
    df['TossResultsPrevious'] = df['TossResults'].shift(1)  # shifting-lagging
    df['StakePrevious'] = df['Stake'].shift(1)  # shifting-lagging

but now I need to apply this (multi-parameter) function along axis 0.

I don't know how to proceed.

I have seen the pandas.DataFrame.applymap function, but it seems to accept only single-parameter functions.

Maybe I'm wrong and using the shift function is not a good idea.

2 answers

One small change in interpretation: you need to encode a loss as 1 and a win as 0.

The first step is to find the edges of the losing streaks (steps + edges). Then you take the difference in step sizes at those edges and insert these values back into a copy of the original data (toss2). The cumsum of toss2 then gives you the current length of your losing streak, so your next bet is 2 ** cumsum(toss2).

The numpy version is faster than the pandas version, but the speedup factor depends on N (about 8x for N=100 and about 2x for N > 10000).


pandas

Using pandas.Series :

    import numpy as np
    import pandas as pd

    toss = np.random.randint(0, 2, 100)
    toss = pd.Series(toss)

    steps = (toss.cumsum() * toss).diff()  # mask out the cumsum where we won [0 1 2 3 0 0 4 5 6 ... ]
    edges = steps < 0                      # find where the cumsum steps down -> where we won
    dsteps = steps[edges].diff()           # find the (negative) length of each losing streak
    dsteps[steps[edges].index[0]] = steps[edges].iloc[0]  # fix length of the first run, which is now NaN
    toss2 = toss.copy()                    # get a copy of the toss series
    toss2[edges] = dsteps.astype(int)      # insert the length of the losing streaks into the copy of the toss results
    bets = 2 ** (toss2).cumsum()           # compute the wagers

    res = pd.DataFrame({'toss': toss, 'toss2': toss2, 'runs': toss2.cumsum(), 'next_bet': bets})
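As a cross-check of the streak logic above, here is a shorter sketch that gets the running losing-streak length with a groupby instead of the steps/edges bookkeeping. This is a different technique from the one in this answer, not a restatement of it. The fixed `toss` values and the names `streak` and `next_bet` are made up for illustration, with 1 = loss and 0 = win as above.

```python
import pandas as pd

# Hand-written tosses so the result can be checked by eye (1 = loss, 0 = win).
toss = pd.Series([0, 1, 1, 1, 0, 1, 0, 0, 1, 1])

# Every win starts a new group; summing the losses inside a group gives
# the length of the losing streak that is running at each row.
streak = toss.groupby((toss == 0).cumsum()).cumsum()

# Stake for the next toss: the initial stake of 1, doubled once per consecutive loss.
next_bet = 2 ** streak

print(pd.DataFrame({'toss': toss, 'streak': streak, 'next_bet': next_bet}))
```

On this input `streak` should match the `runs` column that the steps/edges approach produces for the same tosses, which makes it a convenient sanity check.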

Numpy

This is a pure numpy version (my native language, as it were). It needs a little manual array alignment that pandas otherwise does for you.

    import numpy as np

    toss = np.random.randint(0, 2, 100)

    steps = np.diff(np.cumsum(toss) * toss)
    edges = steps < 0
    edges_shift = np.append(False, edges)  # np.diff drops one element, so realign the mask with toss
    init_step = steps[edges][0]
    toss2 = np.array(toss)
    toss2[edges_shift] = np.append(init_step, np.diff(steps[edges]))
    bets = 2 ** np.cumsum(toss2)

    fmt_dict = {1: 'l', 0: 'w'}
    for t, b in zip(toss, bets):
        print(fmt_dict[t] + '-> {0:d}'.format(b))

pandas output

    In [65]: res
    Out[65]:
        next_bet  runs  toss  toss2
    0          1     0     0      0
    1          2     1     1      1
    2          4     2     1      1
    3          8     3     1      1
    4         16     4     1      1
    5          1     0     0     -4
    6          1     0     0      0
    7          2     1     1      1
    8          4     2     1      1
    9          1     0     0     -2
    10         1     0     0      0
    11         2     1     1      1
    12         4     2     1      1
    13         1     0     0     -2
    14         1     0     0      0
    15         2     1     1      1
    16         1     0     0     -1
    17         1     0     0      0
    18         2     1     1      1
    19         1     0     0     -1
    20         1     0     0      0
    21         1     0     0      0
    22         2     1     1      1
    23         1     0     0     -1
    24         2     1     1      1
    25         1     0     0     -1
    26         1     0     0      0
    27         1     0     0      0
    28         2     1     1      1
    29         4     2     1      1
    30         1     0     0     -2
    31         2     1     1      1
    32         4     2     1      1
    33         1     0     0     -2
    34         1     0     0      0
    35         1     0     0      0
    36         1     0     0      0
    37         2     1     1      1
    38         4     2     1      1
    39         1     0     0     -2
    40         2     1     1      1
    41         4     2     1      1
    42         8     3     1      1
    43         1     0     0     -3
    44         1     0     0      0
    45         1     0     0      0
    46         1     0     0      0
    47         2     1     1      1
    48         1     0     0     -1
    49         2     1     1      1
    50         1     0     0     -1
    51         1     0     0      0
    52         1     0     0      0
    53         1     0     0      0
    54         1     0     0      0
    55         2     1     1      1
    56         1     0     0     -1
    57         1     0     0      0
    58         1     0     0      0
    59         1     0     0      0
    60         1     0     0      0
    61         2     1     1      1
    62         1     0     0     -1
    63         2     1     1      1
    64         4     2     1      1
    65         8     3     1      1
    66        16     4     1      1
    67        32     5     1      1
    68         1     0     0     -5
    69         2     1     1      1
    70         1     0     0     -1
    71         2     1     1      1
    72         4     2     1      1
    73         1     0     0     -2
    74         2     1     1      1
    75         1     0     0     -1
    76         1     0     0      0
    77         2     1     1      1
    78         4     2     1      1
    79         1     0     0     -2
    80         1     0     0      0
    81         2     1     1      1
    82         1     0     0     -1
    83         1     0     0      0
    84         1     0     0      0
    85         1     0     0      0
    86         2     1     1      1
    87         4     2     1      1
    88         8     3     1      1
    89        16     4     1      1
    90        32     5     1      1
    91        64     6     1      1
    92         1     0     0     -6
    93         1     0     0      0
    94         1     0     0      0
    95         1     0     0      0
    96         2     1     1      1
    97         1     0     0     -1
    98         1     0     0      0
    99         1     0     0      0

numpy output

(different seed than the pandas results)

 (result -> next bet): w-> 1 l-> 2 w-> 1 w-> 1 l-> 2 w-> 1 l-> 2 w-> 1 l-> 2 l-> 4 w-> 1 l-> 2 w-> 1 l-> 2 l-> 4 w-> 1 w-> 1 w-> 1 l-> 2 l-> 4 l-> 8 w-> 1 l-> 2 l-> 4 w-> 1 l-> 2 l-> 4 w-> 1 w-> 1 l-> 2 w-> 1 w-> 1 w-> 1 w-> 1 l-> 2 l-> 4 w-> 1 w-> 1 l-> 2 l-> 4 l-> 8 w-> 1 w-> 1 l-> 2 l-> 4 w-> 1 w-> 1 w-> 1 w-> 1 w-> 1 w-> 1 l-> 2 w-> 1 l-> 2 w-> 1 l-> 2 w-> 1 w-> 1 w-> 1 w-> 1 w-> 1 w-> 1 l-> 2 l-> 4 l-> 8 l-> 16 w-> 1 l-> 2 l-> 4 w-> 1 w-> 1 w-> 1 w-> 1 l-> 2 w-> 1 w-> 1 l-> 2 w-> 1 w-> 1 w-> 1 l-> 2 w-> 1 w-> 1 w-> 1 w-> 1 w-> 1 w-> 1 l-> 2 l-> 4 l-> 8 w-> 1 w-> 1 l-> 2 l-> 4 l-> 8 w-> 1 l-> 2 l-> 4 w-> 1 l-> 2 

Pandas is most efficient when you can use vectorized operations, but I think this problem requires iteration. A solution using pandas:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame(np.random.randint(0, 2, 100) * 2 - 1, columns=['TossResults'])
    initial_stake = 1
    df['Stake'] = initial_stake

    for i in range(1, df.shape[0]):
        # after a loss, double the previous stake; otherwise keep the initial stake
        if df.loc[i - 1, 'TossResults'] == -1:
            df.loc[i, 'Stake'] = 2 * df.loc[i - 1, 'Stake']
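If you would rather keep the multi-parameter function from the question, one possible sketch is to iterate over the results as a plain Python list and assign the whole column once at the end. The helper name `add_stakes` and the fixed six-toss frame are hypothetical and only there for illustration; the stake function itself is the one from the question.

```python
import pandas as pd


def stake_martingale_classical(stake_previous, result_previous, multiplier, stake_initial):
    """Stake function from the question: -1 = lose, 1 = win."""
    if result_previous == -1:  # lose: scale up the previous stake
        return stake_previous * multiplier
    elif result_previous == 1:  # win: reset to the initial stake
        return stake_initial
    raise ValueError('result_previous must be equal to 1 (win) or -1 (lose)')


def add_stakes(df, multiplier=2, stake_initial=1):
    """Hypothetical helper: return a copy of df with a 'Stake' column added."""
    stakes = []
    stake = stake_initial
    for result in df['TossResults'].tolist():
        stakes.append(stake)  # stake placed on this toss
        # stake for the next toss depends on how this one turned out
        stake = stake_martingale_classical(stake, result, multiplier, stake_initial)
    out = df.copy()
    out['Stake'] = stakes
    return out


# Small fixed example instead of random tosses, so it can be checked by hand.
df = pd.DataFrame({'TossResults': [-1, -1, 1, -1, 1, 1]})
result = add_stakes(df)
print(result)
```

Building the list first and assigning it in one go avoids writing to the DataFrame cell by cell inside the loop, which is usually the slow part.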