Using Pandas resample, then populating the original data frame

I study market statistics based on the closing price of one week and the opening price of the next. For this I use resample in pandas. To give an example, I use the pandas DataReader below.

    import numpy as np  # needed for the np.where call further down
    from pandas.io.data import DataReader

First, to get daily market data:

    SP = DataReader("^GSPC", "yahoo")
    del SP['Adj Close']
    del SP['Volume']
    SP.head()

                       Open         High          Low        Close
    Date
    2010-01-04  1116.560059  1133.869995  1116.560059  1132.989990
    2010-01-05  1132.660034  1136.630005  1129.660034  1136.520020

Now resample to the weekly timeframe:

    ohlc_dict = {'Open': 'first', 'Close': 'last'}
    w1_resamp = SP.resample('1w', how=ohlc_dict, closed='left', label='left')
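For readers on a recent pandas release, where the how keyword of resample no longer appears to be accepted, the same weekly aggregation can be sketched with .agg and the same dictionary. The daily frame below is synthetic stand-in data (an assumption for illustration), not the Yahoo download.

```python
import pandas as pd

# Synthetic daily data standing in for SP: ten business days from 2010-01-04.
dates = pd.bdate_range("2010-01-04", periods=10)
daily = pd.DataFrame(
    {
        "Open": [100.0 + i for i in range(10)],
        "Close": [100.5 + i for i in range(10)],
    },
    index=dates,
)

ohlc_dict = {"Open": "first", "Close": "last"}

# Same binning as in the question: weeks are closed and labelled on the left,
# so Monday to Friday of a week share the label of the preceding Sunday.
weekly = daily.resample("W", closed="left", label="left").agg(ohlc_dict)
print(weekly)
```

With this binning the first weekly row is labelled 2010-01-03, matching the labels in the weekly output of the question.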

This gives me weekly open and close data. Next, I flag the gap between last week's close and this week's open with np.where.

    w1_resamp['distance'] = np.where(w1_resamp['Open'] < w1_resamp['Close'].shift(),
                                     (w1_resamp["Close"].shift() - w1_resamp["Open"]),
                                     'np.nan')

                      Close         Open  distance
    Date
    2010-01-03  1144.979980  1116.560059
    2010-01-10  1136.030029  1145.959961
    2010-01-17  1091.760010  1136.030029
    2010-01-24  1073.869995  1092.400024
    2010-01-31  1066.189941  1073.890015
    2010-02-07  1075.510010  1065.510010  0.6799310000001242
    2010-02-14  1109.170044  1079.130005
    2010-02-21  1104.489990  1110.000000
    2010-02-28  1138.699951  1105.359985
    2010-03-07  1149.989990  1138.400024  0.29992700000002515
    2010-03-14  1159.900024  1148.530029  1.4599610000000212
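One detail in the np.where call: the fallback value is the quoted string 'np.nan' rather than the constant np.nan, which appears to turn the whole column into strings and would explain the long unrounded values. A minimal sketch with the unquoted constant, using the first seven weekly rows copied from the table (the frame name w1 is only for illustration):

```python
import numpy as np
import pandas as pd

# First seven weekly rows copied from the output in the question.
w1 = pd.DataFrame(
    {
        "Close": [1144.979980, 1136.030029, 1091.760010, 1073.869995,
                  1066.189941, 1075.510010, 1109.170044],
        "Open": [1116.560059, 1145.959961, 1136.030029, 1092.400024,
                 1073.890015, 1065.510010, 1079.130005],
    },
    index=pd.to_datetime(
        ["2010-01-03", "2010-01-10", "2010-01-17", "2010-01-24",
         "2010-01-31", "2010-02-07", "2010-02-14"]
    ),
)

prev_close = w1["Close"].shift()

# np.nan (not the string 'np.nan') keeps the column as float64.
w1["distance"] = np.where(w1["Open"] < prev_close, prev_close - w1["Open"], np.nan)
print(w1)
```

Only the week labelled 2010-02-07 opens below the previous close in this slice, so it is the only row with a non-missing distance.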

Now I want to add a new column to the original data frame showing the date and time at which each gap (as flagged in w1_resamp['distance']) was closed, but I have no idea how to do it. Can anyone help?

As requested in the comments, I have added an image showing the desired result in the SP data frame:

[Image: desired result]

1 answer

I don't fully follow what you want for the "Gap Closed" field, but try the approach below and see whether you can apply it to the index to work out the date.

FYI, the "how" argument of resample appears to be deprecated; pandas displays a warning suggesting .apply() instead.

    import pandas as pd
    import numpy as np

    idx = pd.date_range("2018-01-01", "2018-12-31")
    columns = ['open', 'close']
    data = np.random.normal(365, 2)  # not used below
    df = pd.DataFrame(np.random.random((len(idx), len(columns))),
                      columns=columns, index=idx)
    df['high'] = df['open'] * (1 + np.random.uniform(.05, .20))  # bull market...

    # Resample each column separately instead of passing how=
    func = {
        'open': df['open'].resample('1w').first(),
        'close': df['close'].resample('1w').last(),
        'high': df['high'].resample('1w').max()
    }
    df_w = pd.DataFrame(func)

    # This week's open minus last week's close
    df_w['oc_diff'] = df_w['open'] - df_w['close'].shift()
    df_w.head(10)

                    open     close      high   oc_diff
    2018-01-07  0.268054  0.352703  1.186531       NaN
    2018-01-14  0.340011  0.907513  1.127548 -0.012693
    2018-01-21  0.764949  0.907459  0.915084 -0.142564
    2018-01-28  0.346734  0.703151  1.027472 -0.560725
    2018-02-04  0.231348  0.960882  0.911420 -0.471803
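To get from the weekly difference to a "gap closed" date on the daily frame, one possible sketch follows. It rests on assumptions that the question does not confirm: a gap counts as closed on the first trading day, from the start of the gap week onwards, whose daily High reaches the previous week's close; only downward gaps are handled, as in the question; and the date is written on the first trading day of the gap week. The function name add_gap_closed and the sample prices are hypothetical.

```python
import pandas as pd


def add_gap_closed(daily: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `daily` with a 'gap_closed' column.

    Assumes a sorted DatetimeIndex and 'Open', 'High', 'Close' columns.
    """
    out = daily.copy()
    out["gap_closed"] = pd.NaT

    weekly = (
        out.resample("W", closed="left", label="left")
        .agg({"Open": "first", "Close": "last"})
        .dropna()  # drop weeks without any trading day
    )
    prev_close = weekly["Close"].shift()
    gap_weeks = weekly.index[(weekly["Open"] < prev_close).to_numpy()]

    for week_start in gap_weeks:
        target = prev_close.loc[week_start]
        later = out.loc[week_start:]  # daily rows from the gap week onwards
        hits = later.index[(later["High"] >= target).to_numpy()]
        if len(hits) > 0:
            # Record the fill date on the first trading day of the gap week.
            out.loc[later.index[0], "gap_closed"] = hits[0]
    return out


# Made-up prices: week 1 closes at 104.0, week 2 opens at 102.0 (a gap down).
dates = pd.bdate_range("2010-01-04", periods=10)
daily = pd.DataFrame(
    {
        "Open": [100, 101, 102, 103, 104, 102, 102.5, 103, 104, 105],
        "High": [101, 102, 103, 104, 105, 103, 103.5, 104.5, 105, 106],
        "Close": [100.5, 101.5, 102.5, 103.5, 104.0, 102.5, 103, 104, 104.5, 105.5],
    },
    index=dates,
)
result = add_gap_closed(daily)
print(result)
```

In the sample data the High first reaches 104.0 on Wednesday 2010-01-13, so that date is stored on Monday 2010-01-11. The loop is kept for readability; if a different definition of "closed" is intended (for example the Close rather than the High reaching the target), only the comparison inside the loop needs to change.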