Pandas rolling window iterator

If I want one row at a time, I can get an iterator as follows:

    import pandas as pd
    import numpy as np

    a = np.zeros((100, 40))
    X = pd.DataFrame(a)
    for index, row in X.iterrows():
        print(index)
        print(row)

Now I want each iteration to return a subset of X instead: X[0:9, :], X[5:14, :], X[10:19, :], etc. How do I achieve this with rolling (pandas.DataFrame.rolling)?

2 answers

I will experiment with the following DataFrame.

Setup

    import pandas as pd
    import numpy as np
    from string import ascii_uppercase  # Python 3 name for `uppercase`

    def generic_portfolio_df(start, end, freq, num_port, num_sec, seed=314):
        np.random.seed(seed)
        portfolios = pd.Index(['Portfolio {}'.format(i) for i in ascii_uppercase[:num_port]],
                              name='Portfolio')
        securities = ['s{:02d}'.format(i) for i in range(num_sec)]
        dates = pd.date_range(start, end, freq=freq)
        # Normalize each date's block so every portfolio's weights sum to 1;
        # group_keys=False keeps the original (Date, Id) index on newer pandas.
        return pd.DataFrame(np.random.rand(len(dates) * num_sec, num_port),
                            index=pd.MultiIndex.from_product([dates, securities],
                                                             names=['Date', 'Id']),
                            columns=portfolios
                            ).groupby(level=0, group_keys=False).apply(lambda x: x / x.sum())

    # 'BM' is business month end; recent pandas versions spell it 'BME'
    df = generic_portfolio_df('2014-12-31', '2015-05-30', 'BM', 3, 5)
    df.head(10)


Now I will introduce a function that rolls over a number of rows and concatenates them into a single DataFrame, adding a top level to the column index that indicates the position in the roll.

Solution Step 1

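    def rolled(df, n):
        # Column levels to stack into the index, and the same levels
        # addressed from the end of the index for unstacking again.
        k = list(range(df.columns.nlevels))
        _k = [i - len(k) for i in k]
        # Shift by 0..n-1 rows, stack the columns, and concatenate side by side,
        # keyed by the position in the roll.
        myroll = pd.concat([df.shift(i).stack(level=k) for i in range(n)],
                           axis=1, keys=range(n)).unstack(level=_k)
        return [(i, row.unstack(0)) for i, row in myroll.iterrows()]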

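Although it is hidden inside the function, myroll is a wide DataFrame: it keeps the original row index, and its columns have two levels, the position in the roll (0 to n - 1) on top and the original Portfolio column below.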
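To make the shape of that intermediate frame easier to picture, here is a minimal sketch of the same shift-and-concatenate idea on a tiny frame with a plain index. The names `small`, `n` and `wide` are made up for this illustration, and the stack/unstack steps of `rolled` are left out because there is only one column level and no MultiIndex to deal with.

```python
import pandas as pd

# Hypothetical toy frame, not the portfolio data from the answer.
small = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0]})
n = 2  # number of rows in each roll

# shift(i) moves every value i rows down; keys=range(n) adds a top column
# level recording how far each copy was shifted (its position in the roll).
wide = pd.concat([small.shift(i) for i in range(n)], axis=1, keys=range(n))

print(wide)
```

Each row of `wide` now holds the current row under key 0 and the previous row under key 1, with NaN where no earlier row exists.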

Now we can use it just like an iterator.

Solution Step 2

    for i, roll in rolled(df.head(5), 3):
        print(roll)
        print()

                        0         1         2
    Portfolio
    Portfolio A  0.326164       NaN       NaN
    Portfolio B  0.201597       NaN       NaN
    Portfolio C  0.085340       NaN       NaN

                        0         1         2
    Portfolio
    Portfolio A  0.278614  0.326164       NaN
    Portfolio B  0.314448  0.201597       NaN
    Portfolio C  0.266392  0.085340       NaN

                        0         1         2
    Portfolio
    Portfolio A  0.258958  0.278614  0.326164
    Portfolio B  0.089224  0.314448  0.201597
    Portfolio C  0.293570  0.266392  0.085340

                        0         1         2
    Portfolio
    Portfolio A  0.092760  0.258958  0.278614
    Portfolio B  0.262511  0.089224  0.314448
    Portfolio C  0.084208  0.293570  0.266392

                        0         1         2
    Portfolio
    Portfolio A  0.043503  0.092760  0.258958
    Portfolio B  0.132221  0.262511  0.089224
    Portfolio C  0.270490  0.084208  0.293570

That is not how rolling works. It "provides rolling transformations" (from the docs).
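As a small illustration of that point, the sketch below applies `rolling` with an aggregation to a toy frame. The frame and the window size of 3 are assumptions chosen for the example. The result has one row per input row rather than a sequence of sub-frames.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: column "a" is 0, 2, 4, 6, 8, 10.
X = pd.DataFrame(np.arange(12, dtype=float).reshape(6, 2), columns=["a", "b"])

# rolling(3) builds a window object; calling an aggregation such as mean()
# transforms it into a frame with the same shape as X.
rolled_mean = X.rolling(3).mean()

print(rolled_mean)
```

The first two rows are NaN because a full window of three rows is not yet available there.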

You can just loop and use pandas indexing:

    for i in range((X.shape[0] + 9) // 10):
        X_subset = X.iloc[i * 10: (i + 1) * 10]
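Note that the loop above produces consecutive, non-overlapping blocks of ten rows, while the question asks for windows that start every five rows. A minimal sketch of an overlapping variant follows. The generator name `iter_windows` and its `size` and `step` parameters are hypothetical, and it assumes each window should hold ten rows (rows 0 to 9, 5 to 14, and so on) and that a trailing partial window should be skipped.

```python
import numpy as np
import pandas as pd


def iter_windows(frame, size=10, step=5):
    """Yield (start, sub-frame) pairs for overlapping row windows.

    Hypothetical helper for this sketch: only full windows are yielded,
    so a trailing partial window is dropped.
    """
    for start in range(0, len(frame) - size + 1, step):
        # iloc slicing excludes the end position, so this selects
        # rows start .. start + size - 1.
        yield start, frame.iloc[start:start + size]


X = pd.DataFrame(np.zeros((100, 40)))
windows = list(iter_windows(X))

for start, X_subset in windows[:3]:
    print(start, X_subset.shape)
```

Because it is a generator, each sub-frame is only sliced when the loop asks for it, which keeps the usage close to `iterrows()`.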