Convert daily stock data to weekly data with pandas in Python

I have a DataFrame containing the daily data listed below:

    Date        Open       High       Low        Close      Volume
    2010-01-04  38.660000  39.299999  38.509998  39.279999  1293400
    2010-01-05  39.389999  39.520000  39.029999  39.430000  1261400
    2010-01-06  39.549999  40.700001  39.020000  40.250000  1879800
    2010-01-07  40.090000  40.349998  39.910000  40.090000   836400
    2010-01-08  40.139999  40.310001  39.720001  40.290001   654600
    2010-01-11  40.209999  40.520000  40.040001  40.290001   963600
    2010-01-12  40.160000  40.340000  39.279999  39.980000  1012800
    2010-01-13  39.930000  40.669998  39.709999  40.560001  1773400
    2010-01-14  40.490002  40.970001  40.189999  40.520000  1240600
    2010-01-15  40.570000  40.939999  40.099998  40.450001  1244200

I want to aggregate it into weekly data. After grouping:

  • Date should be the Monday of each week (holidays must be taken into account: when Monday is not a trading day, the first trading day of that week should be used as the date).
  • Open should be the Open of Monday (or of the first trading day of the week).
  • Close should be the Close of Friday (or of the last trading day of the week).
  • High should be the highest High of the trading days in the week.
  • Low should be the lowest Low of the trading days in the week.
  • Volume should be the sum of the volumes of all trading days in the week.

which should look like this:

    Date        Open       High       Low        Close      Volume
    2010-01-04  38.660000  40.700001  38.509998  40.290001  5925600
    2010-01-11  40.209999  40.970001  39.279999  40.450001  6234600

My current code snippet is below. What function should I use to convert the daily data into the expected weekly data? Thank you very much!

    import datetime
    import pandas_datareader.data as web

    start = datetime.datetime(2010, 1, 1)
    end = datetime.datetime(2016, 12, 31)
    # `session` is created elsewhere (not shown in this snippet)
    f = web.DataReader("MNST", "yahoo", start, end, session=session)
    print(f)
2 answers

You can resample (to weekly), offset (shift the labels), and apply aggregation rules as follows:

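    import pandas as pd

    # Aggregation rule for each column
    logic = {'Open'  : 'first',
             'High'  : 'max',
             'Low'   : 'min',
             'Close' : 'last',
             'Volume': 'sum'}

    # Shift the default Sunday labels back six days, to Monday
    offset = pd.Timedelta(days=-6)

    f = pd.read_clipboard(parse_dates=['Date'], index_col=['Date'])
    # loffset requires an older pandas; resample() dropped it in pandas 2.0
    f.resample('W', loffset=offset).apply(logic)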

to get:

                     Open       High        Low      Close   Volume
    Date
    2010-01-04  38.660000  40.700001  38.509998  40.290001  5925600
    2010-01-11  40.209999  40.970001  39.279999  40.450001  6234600
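
On recent pandas releases, where resample() no longer accepts loffset, the same weekly bars can be obtained by anchoring the bins on Monday instead of shifting the labels afterwards. The following is a minimal sketch under that assumption, not part of the original answer. The names daily and weekly are illustrative, and the rows are the ten sample rows from the question.

```python
import pandas as pd

# The ten daily rows from the question (names `daily`/`weekly` are illustrative)
daily = pd.DataFrame(
    {
        "Open": [38.66, 39.389999, 39.549999, 40.09, 40.139999,
                 40.209999, 40.16, 39.93, 40.490002, 40.57],
        "High": [39.299999, 39.52, 40.700001, 40.349998, 40.310001,
                 40.52, 40.34, 40.669998, 40.970001, 40.939999],
        "Low": [38.509998, 39.029999, 39.02, 39.91, 39.720001,
                40.040001, 39.279999, 39.709999, 40.189999, 40.099998],
        "Close": [39.279999, 39.43, 40.25, 40.09, 40.290001,
                  40.290001, 39.98, 40.560001, 40.52, 40.450001],
        "Volume": [1293400, 1261400, 1879800, 836400, 654600,
                   963600, 1012800, 1773400, 1240600, 1244200],
    },
    index=pd.to_datetime([
        "2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07", "2010-01-08",
        "2010-01-11", "2010-01-12", "2010-01-13", "2010-01-14", "2010-01-15",
    ]),
)
daily.index.name = "Date"

logic = {"Open": "first", "High": "max", "Low": "min",
         "Close": "last", "Volume": "sum"}

# 'W-MON' anchors the weekly bins on Monday. closed='left' makes each bin
# [Monday, next Monday), and label='left' labels the bin with its Monday.
weekly = daily.resample("W-MON", closed="left", label="left").agg(logic)
print(weekly)
```

Note that this labels each week with its calendar Monday, even when that Monday has no trading data.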

In general, assuming your DataFrame is in the form you specified, you need to do the following steps:

  1. put Date into the index
  2. resample the index.

What you have is a case of applying a different function to each column.

You can resample in various ways. For example, you can take the mean of the values, or the count, and so on. Check the pandas resample documentation.
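
As a rough illustration of the two steps and of a few built-in aggregations, here is a small sketch. The frame df and its single Close column are illustrative; the values are the closing prices from the question.

```python
import pandas as pd

# Illustrative frame: Date as a plain string column, Close from the question
df = pd.DataFrame(
    {
        "Date": ["2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07",
                 "2010-01-08", "2010-01-11", "2010-01-12", "2010-01-13",
                 "2010-01-14", "2010-01-15"],
        "Close": [39.279999, 39.43, 40.25, 40.09, 40.290001,
                  40.290001, 39.98, 40.560001, 40.52, 40.450001],
    }
)

# Step 1: put Date into the index
df["Date"] = pd.to_datetime(df["Date"])
df = df.set_index("Date").sort_index()

# Step 2: resample the index; nothing is computed until an aggregation is chosen
weekly = df["Close"].resample("W")

weekly_mean = weekly.mean()                # average close per week
weekly_count = weekly.count()              # number of trading days per week
weekly_range = weekly.agg(["min", "max"])  # several aggregations at once
print(weekly_mean, weekly_count, weekly_range, sep="\n\n")
```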

You can also apply custom aggregators (check the same documentation). With this in mind, the code snippet for your case can be written as:

    import pandas as pd

    f['Date'] = pd.to_datetime(f['Date'])
    f.set_index('Date', inplace=True)
    f.sort_index(inplace=True)

    def take_first(array_like):
        return array_like[0]

    def take_last(array_like):
        return array_like[-1]

    # The how and loffset arguments require an older pandas;
    # recent releases removed both from resample()
    output = f.resample('W',                                # Weekly resample
                        how={'Open': take_first,
                             'High': 'max',
                             'Low': 'min',
                             'Close': take_last,
                             'Volume': 'sum'},
                        loffset=pd.Timedelta(days=-6))      # to put the labels to Monday

    output = output[['Open', 'High', 'Low', 'Close', 'Volume']]

Here, 'W' means weekly resampling, which by default spans Monday through Sunday. To put the labels on Monday, loffset is used. There are several predefined day specifiers; take a look at the pandas offsets documentation. You can even define custom offsets.
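
To make the anchoring concrete, here is a small sketch of how the weekly labels move with the day specifier. The series volume is illustrative (the Volume column from the question), and subtracting six days from the index is shown only as one possible way to move Sunday labels to Monday.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Illustrative series: the Volume column from the question on its ten business days
idx = pd.bdate_range("2010-01-04", periods=10)
volume = pd.Series(
    [1293400, 1261400, 1879800, 836400, 654600,
     963600, 1012800, 1773400, 1240600, 1244200],
    index=idx,
    name="Volume",
)

# 'W' is shorthand for 'W-SUN': weeks end, and are labeled, on Sunday
same_anchor = to_offset("W") == to_offset("W-SUN")

sunday_labels = volume.resample("W").sum()      # labeled with each week's Sunday
friday_labels = volume.resample("W-FRI").sum()  # labeled with each week's Friday

# Moving the Sunday labels back six days puts them on Monday
monday_labels = sunday_labels.copy()
monday_labels.index = monday_labels.index - pd.Timedelta(days=6)
print(sunday_labels, friday_labels, monday_labels, sep="\n\n")
```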

Coming back to the resampling method: for Open and Close you can specify custom functions that return the first and last value of each week, and pass the function handles to the how argument.
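
On recent pandas releases, where how is no longer accepted, callables like these can be passed to agg() instead. This is a minimal sketch, assuming each function receives a pandas Series for one weekly bin (hence .iloc) and that no bin is empty. The frame prices is illustrative and holds the Open and Close values from the question.

```python
import pandas as pd

def take_first(values):
    """Return the first observation of a weekly bin (assumes the bin is not empty)."""
    return values.iloc[0]

def take_last(values):
    """Return the last observation of a weekly bin (assumes the bin is not empty)."""
    return values.iloc[-1]

# Illustrative frame: Open and Close from the question on its ten business days
prices = pd.DataFrame(
    {
        "Open": [38.66, 39.389999, 39.549999, 40.09, 40.139999,
                 40.209999, 40.16, 39.93, 40.490002, 40.57],
        "Close": [39.279999, 39.43, 40.25, 40.09, 40.290001,
                  40.290001, 39.98, 40.560001, 40.52, 40.450001],
    },
    index=pd.bdate_range("2010-01-04", periods=10),
)

# Custom callables and the built-in 'first'/'last' names give the same result here
custom = prices.resample("W").agg({"Open": take_first, "Close": take_last})
builtin = prices.resample("W").agg({"Open": "first", "Close": "last"})
print(custom)
```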

This answer assumes that the data is daily, i.e. that there is only one record per day, and that there is no data for non-business days (Saturday and Sunday). Taking the last data point of the week as Friday's is therefore fine. If you prefer, you can use a business-week frequency instead of 'W'. For more complex data, you can also use groupby to group the weekly data and then work on the time indices within each group.
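
One way to read the groupby suggestion, sketched here as an assumption rather than as the answerer's code: group the rows by the calendar week they fall in and take the first Date in each group as the label, so a week whose Monday is a holiday is labeled with its first trading day. The first week below comes from the question; the second week (with Monday 2010-01-18 missing) uses made-up prices purely for illustration.

```python
import pandas as pd

daily = pd.DataFrame(
    {
        "Date": pd.to_datetime([
            "2010-01-11", "2010-01-12", "2010-01-13", "2010-01-14", "2010-01-15",
            # Hypothetical rows: Monday 2010-01-18 has no data (holiday)
            "2010-01-19", "2010-01-20", "2010-01-21", "2010-01-22",
        ]),
        "Open": [40.209999, 40.16, 39.93, 40.490002, 40.57, 10.0, 11.0, 12.0, 13.0],
        "High": [40.52, 40.34, 40.669998, 40.970001, 40.939999, 10.5, 11.5, 12.5, 13.5],
        "Low": [40.040001, 39.279999, 39.709999, 40.189999, 40.099998, 9.5, 10.5, 11.5, 12.5],
        "Close": [40.290001, 39.98, 40.560001, 40.52, 40.450001, 10.2, 11.2, 12.2, 13.2],
        "Volume": [963600, 1012800, 1773400, 1240600, 1244200, 100, 200, 300, 400],
    }
).sort_values("Date")

# Monday-to-Sunday week that each row belongs to
daily["Week"] = daily["Date"].dt.to_period("W")

weekly = (
    daily.groupby("Week")
    .agg(
        Date=("Date", "first"),    # first trading day of the week becomes the label
        Open=("Open", "first"),
        High=("High", "max"),
        Low=("Low", "min"),
        Close=("Close", "last"),
        Volume=("Volume", "sum"),
    )
    .set_index("Date")
)
print(weekly)
```

Here the second week is labeled 2010-01-19 (Tuesday), because no Monday row exists for it.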

By the way, a gist with this solution can be found at: https://gist.github.com/prithwi/339f87bf9c3c37bb3188
