How to create a multi-series scatter plot with connected dots using seaborn?

I have a dataset stored in a pandas data frame. I am trying to use seaborn's pointplot() to create a multi-series scatter plot with connected dots. Each series has different (x, y) values, and they are stored as floats in my data frame. Each row has a label that identifies which series it belongs to. I am using Python 2.7, seaborn version 0.5.1 and matplotlib version 1.4.3.

Everything I managed to find tells me that I can achieve this with the following:

    import matplotlib.pyplot as plt
    import seaborn as sns

    # Suppose my dataframe is called 'df', with columns 'x', 'y', and 'label'.
    sns.pointplot(x = 'x', y = 'y', hue = 'label', data = df)

However, this leads to some weird behavior:

  • The colors are assigned correctly, but only a few of the points are connected.
  • The tick labels on the x-axis overlap. It seems that every data point gets its own tick labelled with its value, instead of being placed on a numeric scale (the x data appear to be treated as strings/labels rather than floats).

I tried to get around this by splitting the data frame into parts. This is not ideal, because I can have 10 or more series to plot at the same time, and I would prefer not to split the data manually:

    df1 = df[df.test_type.values == "label 1"]
    df2 = df[df.test_type.values == "label 2"]

    ax = sns.pointplot(x = 'x', y = 'y', color = "blue", data = df1)
    sns.pointplot(x = 'x', y = 'y', data = df2, color = "red", ax = ax)

In this case, all the points are connected and colored accordingly, but the x-axis again behaves very strangely. Even though the x values in each data frame are different, the plot aligns them so that they appear to be the same.

I'm not sure how to post my output, but some of my problems can be reproduced with the following:

    # Import the necessary modules
    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns

    # Here is some sample data. The 'x2' data is slightly offset from 'x1'.
    # list() keeps the concatenation below working on Python 3 as well.
    x1 = list(range(0, 100, 10))
    x2 = list(range(1, 100, 10))
    x = x1 + x2

    # The y-values I generate here mimic the general shape of my actual data
    y1 = x1[::-1]
    y2 = [i + 25 for i in x1[::-1]]
    y = y1 + y2

    # Two levels of labels that will be applied to the data
    z1 = ["1"] * 10
    z2 = ["2"] * 10
    z = z1 + z2

    # A pandas data frame from the above data
    df = pd.DataFrame({'x': x, 'y': y, 'z': z})

    # Pointplot using the above data
    sns.pointplot(x = 'x', y = 'y', data = df, hue = 'z')

Running this code leads to the following:

  • The x values of all series are evenly spaced along the axis. Note that the "x2" values are the same as "x1" shifted by 1, and that they are spaced at intervals of 10 within each series. I did not expect this behavior.
  • The x-axis does not show a "clean" scale. It literally labels every point with its x-value. The points are labelled correctly, but they are not positioned to scale. It looks like the x values are treated as labels, similar to how a histogram can behave.
  • The dots are colored correctly, but they are not connected.

To summarize my question:

Is there an easier / better / more elegant way to build multi-series scatter plots with connected dots from data stored in a pandas data frame? Seaborn's pointplot looked perfect, but it does not work as I expected, and I suspect it may serve a different purpose from what I need. I am open to other solutions that can achieve this (preferably using Python).

Thanks in advance. I will update my question if I can figure out how to upload the output and plots from my code.

I am 100% new to stackoverflow. I would like to clarify my question by posting charts created by my code, but I could not figure it out. Any pointers on how to do this would be much appreciated, so I can update the question.

EDIT: It turns out that seaborn's pointplot uses the x-axis as a categorical axis, which explains the strange behavior I described above. Is there a way to manually change the x-axis from categorical to numerical? This seems like the easiest approach, but I'm not very good at fine-tuning plots in Python.
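A rough way to picture the categorical behavior with the sample data above is the pure-Python sketch below. It is not seaborn's actual code: `categorical_positions` is a made-up helper, and it assumes the distinct x values are ordered by sorted value and then given consecutive integer slots.

```python
def categorical_positions(values):
    """Map each distinct value to an integer slot by sorted order (hypothetical helper)."""
    return {v: i for i, v in enumerate(sorted(set(values)))}

# Same x values as in the sample data: the second series is offset by 1.
x1 = list(range(0, 100, 10))
x2 = list(range(1, 100, 10))

# On a categorical axis, both series share one set of slots.
slots = categorical_positions(x1 + x2)

pos1 = [slots[v] for v in x1]  # slots used by series "1"
pos2 = [slots[v] for v in x2]  # slots used by series "2"
print(pos1)
print(pos2)
```

Under that assumption the two series land on alternating, evenly spaced slots, so the gap between 1 and 10 looks the same as the gap between 0 and 1, which matches the even spacing described above.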

2 answers

With help from @mwaskom and this question, I managed to find a solution to my problem:

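    import matplotlib.pyplot as plt

    # Assuming df is a pandas data frame with columns 'x', 'y', and 'label'
    for key, grp in df.groupby('label'):
        # 'o-' draws a marker at each point and connects them with a line
        plt.plot(grp.x, grp.y, 'o-', label = key)
    plt.legend(loc = 'best')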
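For reference, here is a self-contained sketch that applies the same groupby approach to the sample data from the question. The helper name `plot_connected_series`, the `label` column name for the sample data, the sort by `x`, and the `Agg` backend are my own additions for illustration, not part of the original solution.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Sample data from the question: two series whose x values are offset by 1.
x1 = list(range(0, 100, 10))
x2 = list(range(1, 100, 10))
df = pd.DataFrame({
    "x": x1 + x2,
    "y": x1[::-1] + [i + 25 for i in x1[::-1]],
    "label": ["1"] * 10 + ["2"] * 10,
})

def plot_connected_series(data, ax):
    """Draw one line with markers per label (hypothetical helper)."""
    for key, grp in data.groupby("label"):
        grp = grp.sort_values("x")  # connect the points in x order
        ax.plot(grp["x"], grp["y"], "o-", label=key)
    ax.legend(loc="best")
    return ax

fig, ax = plt.subplots()
plot_connected_series(df, ax)
```

Because `plot` receives the numeric x values directly, the x-axis stays numeric and each series keeps its own spacing. Call `fig.savefig(...)` or switch to an interactive backend and use `plt.show()` to view the result.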

I had a similar problem and finally solved it with seaborn's FacetGrid. I used plt.scatter for the dots and plt.plot for the lines connecting them.

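    # Note: the `size` argument was renamed to `height` in seaborn 0.9.0
    g = sns.FacetGrid(df, hue="z", size=8)
    g.map(plt.scatter, "x", "y")  # draw the dots
    g.map(plt.plot, "x", "y")     # draw the connecting lines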

[Image: Time Series Graphs]

Please note that this was done with seaborn version 0.6.0, not version 0.5.1.
