Seasonal groupings using Python and pandas

I want to use pandas and Python to iterate through my .csv files, group the data by season, and calculate the average for each season of the year. Currently, my quarterly script groups Jan-Mar, Apr-Jun, etc. I want the seasons to correspond to the months as follows: 11: 'Winter', 12: 'Winter', 1: 'Winter', 2: 'Spring', 3: 'Spring', 4: 'Spring', 5: 'Summer', 6: 'Summer', 7: 'Summer', 8: 'Autumn', 9: 'Autumn', 10: 'Autumn'.

I have the following data:

    Date,HAD
    01/01/1951,1
    02/01/1951,-0.13161201
    03/01/1951,-0.271796132
    04/01/1951,-0.258977158
    05/01/1951,-0.198823057
    06/01/1951,0.167794502
    07/01/1951,0.046093808
    08/01/1951,-0.122396694
    09/01/1951,-0.121824587
    10/01/1951,-0.013002463

This is my code:

    # Iterate through a list of files in a folder looking for .csv files
    for csvfilename in glob.glob("C:/Users/n-jones/testdir/output/*.csv"):

        # Allocate a new file name for each file and create a new .csv file
        csvfilenameonly = "RBI-Seasons-Year" + path_leaf(csvfilename)
        with open("C:/Users/n-jones/testdir/season/" + csvfilenameonly, "wb") as outfile:

            # Open the input csv file and allow the script to read it
            with open(csvfilename, "rb") as infile:

                # Create a pandas dataframe to summarise the data
                df = pd.read_csv(infile, parse_dates=[0], index_col=[0], dayfirst=True)
                mean = df.resample('Q-SEP', how='mean')

            # Output to new csv file
            mean.to_csv(outfile)

Hope this makes sense.

Thank you in advance!

1 answer

It sounds like you just need a dictionary lookup and a groupby. The code below should work.

    import os
    import re

    import pandas as pd

    # Map each month number to its season
    lookup = {
        11: 'Winter', 12: 'Winter', 1: 'Winter',
        2: 'Spring', 3: 'Spring', 4: 'Spring',
        5: 'Summer', 6: 'Summer', 7: 'Summer',
        8: 'Autumn', 9: 'Autumn', 10: 'Autumn'
    }

    os.chdir('C:/Users/n-jones/testdir/output/')

    for fname in os.listdir('.'):
        if re.match(".*csv$", fname):
            data = pd.read_csv(fname, parse_dates=[0], dayfirst=True)
            # Label each row with the season of its date
            data['Season'] = data['Date'].apply(lambda x: lookup[x.month])
            data['count'] = 1
            # Sum the values and the row counts per season
            # (select the columns with a list; recent pandas rejects a bare tuple)
            data = data.groupby(['Season'])[['HAD', 'count']].sum()
            data['mean'] = data['HAD'] / data['count']
            data.to_csv('C:/Users/n-jones/testdir/season/' + fname)
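For reference, the same lookup-and-groupby idea can probably be written more compactly by mapping the month number straight to a season and letting pandas compute the mean. This is only a sketch under a few assumptions: the sample rows from the question are inlined so it runs without any files, the dates are treated as day/month/year (as `dayfirst=True` in the question implies), and `seasonal_mean` is a hypothetical helper name, not part of pandas.

```python
import io

import pandas as pd

# Month-to-season mapping taken from the question.
LOOKUP = {
    11: "Winter", 12: "Winter", 1: "Winter",
    2: "Spring", 3: "Spring", 4: "Spring",
    5: "Summer", 6: "Summer", 7: "Summer",
    8: "Autumn", 9: "Autumn", 10: "Autumn",
}

# Sample rows copied from the question, inlined so no file is needed.
CSV_TEXT = """Date,HAD
01/01/1951,1
02/01/1951,-0.13161201
03/01/1951,-0.271796132
04/01/1951,-0.258977158
05/01/1951,-0.198823057
06/01/1951,0.167794502
07/01/1951,0.046093808
08/01/1951,-0.122396694
09/01/1951,-0.121824587
10/01/1951,-0.013002463
"""


def seasonal_mean(df, date_col="Date", value_col="HAD"):
    """Return the mean of value_col per season (hypothetical helper)."""
    # Translate each date's month number into a season label.
    season = df[date_col].dt.month.map(LOOKUP).rename("Season")
    # Group by that label and average the values.
    return df.groupby(season)[value_col].mean()


data = pd.read_csv(io.StringIO(CSV_TEXT))
# Assumption: dates are day/month/year, matching dayfirst=True in the question.
data["Date"] = pd.to_datetime(data["Date"], format="%d/%m/%Y")

result = seasonal_mean(data)
print(result)
```

With the ten sample rows, which all fall in January when parsed day-first, the result contains a single Winter row. On a full file you would get one row per season, and `result.to_csv(...)` could replace the `print` inside the loop over files.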

Source: https://habr.com/ru/post/1213851/

