I found the bug. Technically it is in pandas, not seaborn as I originally thought, although it involves code from pandas, seaborn and matplotlib...
The following code appears in pandas.tools.plotting.ScatterPlot._make_plot to select the colors that will be used in the scatter plot:
    if c is None:
        c_values = self.plt.rcParams['patch.facecolor']
    elif c_is_column:
        c_values = self.data[c].values
    else:
        c_values = c
In your case, c will be None, which is the default value, and therefore c_values will be set to plt.rcParams['patch.facecolor'].
Now, as part of its setup, seaborn changes plt.rcParams['patch.facecolor'] to (0.5725490196078431, 0.7764705882352941, 1.0), which is an RGB tuple. If seaborn is not used, the value is the matplotlib default, which is 'b' (a string indicating the color blue).
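To make the branch above easier to follow, here is a minimal standalone sketch of it. The function name select_c_values, the rc_params dict standing in for plt.rcParams, and the simplified column check are all my own assumptions for illustration; this is not the actual pandas code.

```python
import pandas as pd

# RGB tuple quoted above as seaborn's value for 'patch.facecolor'
SEABORN_FACECOLOR = (0.5725490196078431, 0.7764705882352941, 1.0)


def select_c_values(c, data, rc_params):
    """Hypothetical stand-in for the color selection in ScatterPlot._make_plot."""
    # Simplified assumption: c names a column only if it is a string label.
    c_is_column = isinstance(c, str) and c in data.columns
    if c is None:
        return rc_params['patch.facecolor']
    elif c_is_column:
        return data[c].values
    else:
        return c


df = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6], 'z': [0.1, 0.5, 0.9]})

# Plain matplotlib default described above: a color string
default_mpl = select_c_values(None, df, {'patch.facecolor': 'b'})
# After seaborn's setup: a bare tuple of three floats
with_seaborn = select_c_values(None, df, {'patch.facecolor': SEABORN_FACECOLOR})
# Explicit column name: that column's values
from_column = select_c_values('z', df, {'patch.facecolor': 'b'})

print(default_mpl, with_seaborn, from_column)
```

The point is only that with c left as None, whatever sits in 'patch.facecolor' is forwarded untouched, whether it is a string or a tuple.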
c_values is then used later to actually plot the points inside ax.scatter:
scatter = ax.scatter(data[x].values, data[y].values, c=c_values, label=label, cmap=cmap, **self.kwds)
The problem arises because the keyword argument c can accept several different types of argument. It can be:
- a string (for example, 'b' in the original matplotlib case);
- a sequence of color specifications (for example, a sequence of RGB values);
- a sequence of values to be mapped onto the current colormap.
The matplotlib docs state the following:
c can be a single color format string, or a sequence of color specifications of length N, or a sequence of N numbers to be mapped to colors using the cmap and norm specified via kwargs (see below). Note that c should not be a single numeric RGB or RGBA sequence because that is indistinguishable from an array of values to be colormapped. c can be a 2-D array in which the rows are RGB or RGBA, however.
What basically happens is that matplotlib takes the value of c_values (which is a tuple of three numbers) and maps those numbers onto the current colormap (which pandas sets to Greys by default). Thus, you get three scatter points with different "greyishness". When you have more than 3 scatter points, matplotlib assumes that it must be an RGB tuple, because its length does not match the length of the data arrays (3 != 4), and therefore uses it as a constant RGB color.
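The ambiguity can be reproduced with matplotlib alone. This is a small sketch, assuming a reasonably recent matplotlib; the variable names are mine, and I use get_array() only as a convenient way to see whether the points were colormapped. The last call wraps the tuple in a list, which gives the 2-D form the docs allow, so it should be read as one RGB row rather than as three values.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt

# RGB tuple quoted above as seaborn's 'patch.facecolor'
rgb = (0.5725490196078431, 0.7764705882352941, 1.0)

fig, ax = plt.subplots()

# Three points, three numbers: c is read as values for the colormap.
three = ax.scatter([1, 2, 3], [1, 2, 3], c=rgb, cmap="Greys")

# Four points, three numbers: lengths differ, so c is read as one RGB color.
four = ax.scatter([1, 2, 3, 4], [1, 2, 3, 4], c=rgb)

# A list holding the tuple is a 2-D array with a single RGB row.
fixed = ax.scatter([1, 2, 3], [1, 2, 3], c=[rgb])

# get_array() holds the colormapped values, or None when a fixed color is used.
for name, sc in [("three", three), ("four", four), ("fixed", fixed)]:
    kind = "colormapped" if sc.get_array() is not None else "constant color"
    print(name, kind)

plt.close(fig)
```

Newer matplotlib versions may also log a warning for the four-point call, since a bare RGB sequence passed as c is exactly the case the docs advise against.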
This has been filed as a bug report on the pandas GitHub here .