Python rolling average pairwise correlation

I have daily returns from three markets (GLD, SPY and USO). My goal is to calculate the average pairwise correlation from the correlation matrix over a rolling 130-day window.

My starting point:

import datetime

import numpy as np
import pandas as pd
# pandas.io.data was removed from later pandas releases in favour of
# the separate pandas-datareader package
import pandas.io.data as web

stocks = ['spy', 'gld', 'uso']
start = datetime.datetime(2010,1,1)
end = datetime.datetime(2016,1,1)

df = web.DataReader(stocks, 'yahoo', start, end)
adj_close_df = df['Adj Close']

returns = adj_close_df.pct_change(1).dropna()

rollingcor = returns.rolling(130).corr()

This creates a Panel of correlation matrices. However, extracting the lower (or upper) triangle, dropping the diagonal, and then calculating the average for each observation is where I've drawn a blank. Ideally, I would like the result for each date to be in a Series that I can then index by date.

Maybe I started from the wrong place, but any help would be appreciated.

+4
2 answers

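Sum the whole correlation matrix, subtract n (the ones on the diagonal), divide by 2 (the matrix is symmetric), and finally divide by the number of distinct pairs, n * (n - 1) / 2, to get the average. With three stocks that number is 3, which happens to equal n. For example:

>>> n = len(stocks)
>>> n_pairs = n * (n - 1) / 2.0
>>> ((rollingcor.sum(skipna=False).sum(skipna=False) - n) / 2) / n_pairs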
Date
2010-01-05         NaN
2010-01-06         NaN
2010-01-07         NaN
                ...   
2015-12-29    0.164356
2015-12-30    0.168102
2015-12-31    0.166462
dtype: float64
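
To sanity-check the arithmetic above on a single matrix, here is a minimal sketch in plain numpy. The helper name avg_pairwise_corr and the 3x3 matrix are made up for illustration and are not part of the original data:

```python
import numpy as np

def avg_pairwise_corr(corr):
    """Average off-diagonal correlation of one square correlation matrix."""
    corr = np.asarray(corr, dtype=float)
    n = corr.shape[0]
    n_pairs = n * (n - 1) / 2.0
    # The full sum counts every pair twice, plus n ones on the diagonal.
    return (corr.sum() - n) / 2.0 / n_pairs

# Hypothetical correlation matrix, for illustration only.
example = np.array([[1.0, 0.2, 0.4],
                    [0.2, 1.0, 0.6],
                    [0.4, 0.6, 1.0]])

result = avg_pairwise_corr(example)
print(result)  # mean of 0.2, 0.4 and 0.6
```

Dividing by the number of pairs rather than by n keeps the helper correct when there are more than three assets.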
+3

You can use numpy's tril to extract the lower triangle of each matrix.

def tril_sum(df):
    # -1 ensures we skip the diagonal
    return np.tril(df.unstack().values, -1).sum()

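The function is applied to one date's correlations at a time. unstack() reshapes that flattened column back into a square matrix. Then, with n stocks, divide the sum by the number of pairs, (n ** 2 - n) / 2.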

n = len(stocks)
avg_cor = rollingcor.dropna().to_frame().apply(tril_sum) / ((n ** 2 - n) / 2)

The result:

print(avg_cor.head())

Date
2010-07-12    0.398973
2010-07-13    0.403664
2010-07-14    0.402483
2010-07-15    0.403252
2010-07-16    0.407769
dtype: float64
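
In more recent pandas versions Panel no longer exists, and rolling(...).corr() on a DataFrame returns a DataFrame with a (date, ticker) MultiIndex instead. The same lower-triangle idea might then look roughly like the sketch below. It uses randomly generated returns in place of the Yahoo download, and the helper name tril_mean is an assumption for illustration:

```python
import numpy as np
import pandas as pd

# Synthetic daily returns standing in for the downloaded data (assumption).
rng = np.random.default_rng(0)
dates = pd.date_range("2010-01-04", periods=300, freq="B")
returns = pd.DataFrame(rng.normal(0, 0.01, size=(300, 3)),
                       index=dates, columns=["gld", "spy", "uso"])

window = 130
# One correlation matrix per date, stacked under a (date, ticker) MultiIndex.
rolling_corr = returns.rolling(window).corr()

n = returns.shape[1]
# Index pairs strictly below the diagonal (k=-1 skips the diagonal).
rows, cols = np.tril_indices(n, k=-1)

def tril_mean(block):
    """Mean of the strictly lower triangle of one date's matrix."""
    return block.to_numpy()[rows, cols].mean()

# Drop the incomplete windows, then average the pairs date by date.
avg_cor = rolling_corr.dropna().groupby(level=0).apply(tril_mean)
print(avg_cor.head())
```

Taking the mean over the selected entries avoids having to divide by the number of pairs by hand.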


+1
