I am making a graph in ggplot2 consisting of a set of data points plotted as points, with the lines predicted by a fitted model overlaid. The general idea of the graph looks something like this:
    library(ggplot2)

    names <- c(1,1,1,2,2,2,3,3,3)
    xvals <- c(1:9)
    yvals <- c(1,2,3,10,11,12,15,16,17)
    pvals <- c(1.1,2.1,3.1,11,12,13,14,15,16)
    ex_data <- data.frame(names,xvals,yvals,pvals)
    ex_data$names <- factor(ex_data$names)
    graph <- ggplot(data=ex_data, aes(x=xvals, y=yvals, color=names))
    print(graph + geom_point() + geom_line(aes(x=xvals, y=pvals)))
As you can see, both the lines and the points are colored by a categorical variable (in this case, “names”). I would like the legend to contain two entries: a point labeled “data” and a line labeled “fitted” (to indicate that the points are real data and the lines are the fit). However, I cannot get this to work. The (awesome) guide here is great for formatting, but doesn't deal with the actual legend entries, and I tried the technique here to no avail, i.e.
    print(graph +
      scale_colour_manual("", values=c("green", "blue", "red")) +
      scale_shape_manual("", values=c(19,NA,NA)) +
      scale_linetype_manual("", values=c(0,1,1)))
The main problem is that in my actual data there are more than 200 different categories for "names", while I only need the two legend entries mentioned above. Doing this with my actual data simply creates a meaningless legend that runs off the page, because the legend tries to be a key to the colors (of which I have too many).
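To make the goal concrete, here is a minimal sketch of one direction that might produce such a legend. It is not a confirmed solution. It assumes that mapping a constant string inside aes() yields a single legend entry per geom, and that guides(color = "none") is available to drop the color key (older ggplot2 versions use FALSE instead of "none"). The labels "data" and "fitted" are simply the two entries described above.

```r
library(ggplot2)

# Same toy data as in the example above
ex_data <- data.frame(
  names = factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3)),
  xvals = 1:9,
  yvals = c(1, 2, 3, 10, 11, 12, 15, 16, 17),
  pvals = c(1.1, 2.1, 3.1, 11, 12, 13, 14, 15, 16)
)

graph <- ggplot(ex_data, aes(x = xvals, color = names)) +
  # A constant string inside aes() gives this layer one legend entry
  geom_point(aes(y = yvals, shape = "data")) +
  geom_line(aes(y = pvals, linetype = "fitted")) +
  # Pick the actual symbol and line style for those two entries
  scale_shape_manual("", values = c(data = 19)) +
  scale_linetype_manual("", values = c(fitted = 1)) +
  # Suppress the per-category color key, which is what overflows the page
  guides(color = "none")

print(graph)
```

Because the color legend is suppressed rather than restyled, the number of categories in "names" should not affect the size of the legend; the colors are still applied to the points and lines themselves.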
I would be grateful for any help!