Correlation heatmap

Question

I want to represent a correlation matrix as a heatmap. R has something called a correlogram, but I don't think there is an equivalent in Python.

How can I do this? The values range from -1 to 1, for example:

    [[ 1.          0.00279981  0.95173379  0.02486161 -0.00324926 -0.00432099]
     [ 0.00279981  1.          0.17728303  0.64425774  0.30735071  0.37379443]
     [ 0.95173379  0.17728303  1.          0.27072266  0.02549031  0.03324756]
     [ 0.02486161  0.64425774  0.27072266  1.          0.18336236  0.18913512]
     [-0.00324926  0.30735071  0.02549031  0.18336236  1.          0.77678274]
     [-0.00432099  0.37379443  0.03324756  0.18913512  0.77678274  1.        ]]

I was able to create the following heatmap based on another question, but the problem is that my values get "clipped" at 0. I would like a colormap that goes from blue (-1) to red (1), or something like that. With the current one, values below 0 are not represented adequately.

Here is the code for this:

 plt.imshow(correlation_matrix,cmap='hot',interpolation='nearest')

+26

python correlation

Marko Sep 09 '16 at 10:48


5 answers

The code below will create this graph:

    import numpy as np
    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt

    # A list with your data, slightly edited
    l = [1.0, 0.00279981, 0.95173379, 0.02486161, -0.00324926, -0.00432099,
         0.00279981, 1.0, 0.17728303, 0.64425774, 0.30735071, 0.37379443,
         0.95173379, 0.17728303, 1.0, 0.27072266, 0.02549031, 0.03324756,
         0.02486161, 0.64425774, 0.27072266, 1.0, 0.18336236, 0.18913512,
         -0.00324926, 0.30735071, 0.02549031, 0.18336236, 1.0, 0.77678274,
         -0.00432099, 0.37379443, 0.03324756, 0.18913512, 0.77678274, 1.00]

    # Split the list into rows of n values
    n = 6
    data = [l[i:i + n] for i in range(0, len(l), n)]

    # A dataframe
    df = pd.DataFrame(data)

    def CorrMtx(df, dropDuplicates=True):

        # Your dataset is already a correlation matrix.
        # If you have a dataset where you need to include the calculation
        # of a correlation matrix, just uncomment the line below:
        # df = df.corr()

        # Exclude duplicate correlations by masking the upper-right values
        # (np.bool was removed from recent NumPy releases, so use plain bool)
        if dropDuplicates:
            mask = np.zeros_like(df, dtype=bool)
            mask[np.triu_indices_from(mask)] = True

        # Set background color / chart style
        sns.set_style(style='white')

        # Set up matplotlib figure
        f, ax = plt.subplots(figsize=(11, 9))

        # Add diverging colormap from blue (low) to red (high)
        cmap = sns.diverging_palette(250, 10, as_cmap=True)

        # Draw correlation plot with or without duplicates
        if dropDuplicates:
            sns.heatmap(df, mask=mask, cmap=cmap,
                        square=True,
                        linewidths=.5, cbar_kws={"shrink": .5}, ax=ax)
        else:
            sns.heatmap(df, cmap=cmap,
                        square=True,
                        linewidths=.5, cbar_kws={"shrink": .5}, ax=ax)

    CorrMtx(df, dropDuplicates=False)

I put this together after it was announced that seaborn's excellent corrplot was to be deprecated. The snippet above produces a similar correlation plot based on the seaborn heatmap. You can also specify the color range and choose whether to drop duplicate correlations. Note that I used the same numbers as you, but put them in a pandas DataFrame. Regarding the choice of colors, have a look at the docs for sns.diverging_palette. You asked for blue, but with your sample data blue falls outside the part of the color scale that is actually used. Try changing both occurrences of 0.95173379 to -0.95173379 and you will get this:

+5

vestland Jan 16 '18 at 9:45


If the data is in a pandas DataFrame, you can use seaborn's heatmap function to create the desired plot.

    import seaborn as sns

    Var_Corr = df.corr()
    # plot the heatmap and annotate it with the values
    sns.heatmap(Var_Corr, xticklabels=Var_Corr.columns, yticklabels=Var_Corr.columns, annot=True)


From the question, it looks like the data is in a NumPy array. If that array is named numpy_data, you can put it into a pandas DataFrame before using the step above:

    import pandas as pd

    df = pd.DataFrame(numpy_data)
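The two steps above can be combined into a minimal sketch. It assumes that `numpy_data` holds the matrix from the question, which is already a correlation matrix, so `df.corr()` is skipped. The labels `a` to `f` are hypothetical names added only for illustration, and `vmin=-1`, `vmax=1` with the `coolwarm` colormap are one possible way to get the blue-to-red range the question asks for.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# The correlation matrix from the question
numpy_data = np.array([
    [1.0, 0.00279981, 0.95173379, 0.02486161, -0.00324926, -0.00432099],
    [0.00279981, 1.0, 0.17728303, 0.64425774, 0.30735071, 0.37379443],
    [0.95173379, 0.17728303, 1.0, 0.27072266, 0.02549031, 0.03324756],
    [0.02486161, 0.64425774, 0.27072266, 1.0, 0.18336236, 0.18913512],
    [-0.00324926, 0.30735071, 0.02549031, 0.18336236, 1.0, 0.77678274],
    [-0.00432099, 0.37379443, 0.03324756, 0.18913512, 0.77678274, 1.0],
])

# Hypothetical variable names; replace them with your own column names
labels = ["a", "b", "c", "d", "e", "f"]
df = pd.DataFrame(numpy_data, index=labels, columns=labels)

# The data is already a correlation matrix, so it is plotted directly.
# vmin/vmax pin the color scale to the full [-1, 1] range.
fig, ax = plt.subplots(figsize=(7, 6))
sns.heatmap(df, cmap="coolwarm", vmin=-1, vmax=1,
            annot=True, fmt=".2f", square=True, ax=ax)
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the result. Fixing the range keeps 0 at the neutral middle of the colormap even when the data contains few negative values.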

+5

fati heidari Apr 05 '18 at 19:02


You can use matplotlib for this. There is a similar question that shows how you can achieve what you want: Drawing a 2D heat map using Matplotlib
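As a rough illustration of the pure-matplotlib route, the sketch below reuses the `imshow` call from the question but swaps the `hot` colormap for a diverging one and pins the color limits to -1 and 1. The choice of `coolwarm` and the helper name `plot_corr_imshow` are assumptions made for this example, not something taken from the linked question.

```python
import numpy as np
import matplotlib.pyplot as plt

# The correlation matrix from the question
correlation_matrix = np.array([
    [1.0, 0.00279981, 0.95173379, 0.02486161, -0.00324926, -0.00432099],
    [0.00279981, 1.0, 0.17728303, 0.64425774, 0.30735071, 0.37379443],
    [0.95173379, 0.17728303, 1.0, 0.27072266, 0.02549031, 0.03324756],
    [0.02486161, 0.64425774, 0.27072266, 1.0, 0.18336236, 0.18913512],
    [-0.00324926, 0.30735071, 0.02549031, 0.18336236, 1.0, 0.77678274],
    [-0.00432099, 0.37379443, 0.03324756, 0.18913512, 0.77678274, 1.0],
])


def plot_corr_imshow(matrix):
    """Hypothetical helper: draw a matrix with a blue-to-red scale on [-1, 1]."""
    fig, ax = plt.subplots()
    # A diverging colormap plus fixed limits maps -1 to blue and 1 to red
    image = ax.imshow(matrix, cmap="coolwarm", vmin=-1, vmax=1,
                      interpolation="nearest")
    fig.colorbar(image, ax=ax, label="correlation")
    return fig, image


fig, image = plot_corr_imshow(correlation_matrix)
```

Call `plt.show()` or `fig.savefig(...)` to view the figure. Without `vmin` and `vmax`, matplotlib scales the colors to the minimum and maximum of the data, which is why the negative values were hard to see in the original plot.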

0

Bernhard Sep 09 '16 at 10:54


Use the jet colormap for a transition between blue and red.
Use pcolor() with the vmin and vmax parameters.

Details in this answer: fooobar.com/questions/68118 / ...
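A minimal sketch of these two suggestions, assuming the top-left 3x3 block of the matrix from the question as sample data. The axis inversion and the equal aspect ratio are additions for readability and are not part of the advice above.

```python
import numpy as np
import matplotlib.pyplot as plt

# Top-left 3x3 block of the correlation matrix from the question
corr = np.array([
    [1.0, 0.00279981, 0.95173379],
    [0.00279981, 1.0, 0.17728303],
    [0.95173379, 0.17728303, 1.0],
])

fig, ax = plt.subplots()
# vmin/vmax fix the color scale to [-1, 1]; 'jet' runs from blue to red
mesh = ax.pcolor(corr, cmap="jet", vmin=-1, vmax=1)
# pcolor puts row 0 at the bottom; invert the axis to match the matrix layout
ax.invert_yaxis()
ax.set_aspect("equal")
fig.colorbar(mesh, ax=ax)
```

Call `plt.show()` or `fig.savefig(...)` to view the figure.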

0

ypnos Sep 09 '16 at 11:06


mrandrewandrade · Accepted Answer · 2017-02-19T03:05:34+0000

Another alternative is to use the heatmap function in seaborn to plot the correlation matrix. This example uses the Auto dataset from the ISLR package in R (the same as in the example you showed).

    # Note: pandas.rpy was removed in pandas 0.20, so this snippet needs an
    # older pandas release; newer setups use the rpy2 package directly.
    import pandas.rpy.common as com
    import seaborn as sns
    %matplotlib inline

    # load the R package ISLR
    infert = com.importr("ISLR")

    # load the Auto dataset
    auto_df = com.load_data('Auto')

    # calculate the correlation matrix
    corr = auto_df.corr()

    # plot the heatmap
    sns.heatmap(corr,
                xticklabels=corr.columns,
                yticklabels=corr.columns)

If you want to get even fancier, you can use pandas Style, for example:

    cmap = sns.diverging_palette(5, 250, as_cmap=True)

    def magnify():
        # Table styles that enlarge a cell's text while the mouse hovers over it
        return [dict(selector="th",
                     props=[("font-size", "7pt")]),
                dict(selector="td",
                     props=[('padding', "0em 0em")]),
                dict(selector="th:hover",
                     props=[("font-size", "12pt")]),
                dict(selector="tr:hover td:hover",
                     props=[('max-width', '200px'),
                            ('font-size', '12pt')])
                ]

    # .format(precision=2) needs pandas 1.3 or later;
    # on older versions use .set_precision(2) instead
    corr.style.background_gradient(cmap, axis=1)\
        .set_properties(**{'max-width': '80px', 'font-size': '10pt'})\
        .set_caption("Hover to magnify")\
        .format(precision=2)\
        .set_table_styles(magnify())
