R / ggplot2: smooth over the entire dataset while capping the y-axis

UPDATE: I found the answer and have included it below.

I have a dataset containing the following variables and similar values:

 COBSDATE    CITY  RESPONSE_TIME
 2011-11-23  A     1.1
 2011-11-23  A     1.5
 2011-11-23  A     1.2
 2011-11-23  B     2.3
 2011-11-23  B     2.1
 2011-11-23  B     1.8
 2011-11-23  C     1.4
 2011-11-23  C     6.1
 2011-11-23  A     3.1
 2011-11-23  A     1.1

I have successfully created a plot that displays all RESPONSE_TIME values as jittered points, with a smooth geom on top to describe the trend.

The problem is that I want a closer look at the smoothed values, and one of the cities has frequent "outliers". I can control this by adding ylim(0, p99) to the plot, but then the smooth is computed from only the subset of the data that falls within those limits.

Is there a way to use all of the data for the smooth and only a subset for the jitter layer?

My code is below (both versions are identical except for + ylim(0,20)). Truncated:

 ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) +
   geom_jitter(colour=alpha("#007DB1", 1/8)) +
   geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) +
   ylim(0,20) +
   facet_wrap(~CITY)

Full dataset:

 ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) +
   geom_jitter(colour=alpha("#007DB1", 1/8)) +
   geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) +
   facet_wrap(~CITY)
2 answers

If you just want to zoom in, you can use coord_cartesian, which limits the visible range without dropping any data:

 ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) +
   geom_jitter(colour=alpha("#007DB1", 1/8)) +
   geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) +
   coord_cartesian(ylim=c(0,20)) +
   facet_wrap(~CITY)

If you want the jitter geom to use only a subset of the data, override the inherited data for that layer:

 ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) +
   geom_jitter(data=subset(dataRaw, RESPONSE_TIME>=0 & RESPONSE_TIME<=20), colour=alpha("#007DB1", 1/8)) +
   geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) +
   # ylim(0,20) is left out: scale limits drop out-of-range rows before the smooth is computed
   facet_wrap(~CITY)
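As a variation on the snippet above, the cutoff can be computed from the data instead of being hard-coded. The sketch below is only an illustration: the dataRaw built here is made-up stand-in data with the question's column names, and using the 99th percentile as the cutoff is an assumption based on the question's mention of p99. The loess method and formula are spelled out only to avoid ggplot2's method-selection message.

```r
library(ggplot2)

# Hypothetical stand-in for dataRaw (same column names as in the question).
set.seed(1)
dataRaw <- data.frame(
  COBSDATE = rep(seq(as.Date("2011-11-01"), by = "day", length.out = 30), times = 3),
  CITY = rep(c("A", "B", "C"), each = 30),
  RESPONSE_TIME = rexp(90, rate = 0.5)
)

# Assumed cutoff: the 99th percentile of all response times.
p99 <- unname(quantile(dataRaw$RESPONSE_TIME, probs = 0.99, na.rm = TRUE))

# Only the jitter layer gets the trimmed data.
jitterData <- subset(dataRaw, RESPONSE_TIME <= p99)

p <- ggplot(dataRaw, aes(x = COBSDATE, y = RESPONSE_TIME)) +
  geom_jitter(data = jitterData, colour = alpha("#007DB1", 1/8)) +
  # The smooth layer inherits the full dataRaw from ggplot().
  geom_smooth(method = "loess", formula = y ~ x,
              colour = "gray30", fill = alpha("gray40", 0.5)) +
  # Zoom the view without removing rows from any layer.
  coord_cartesian(ylim = c(0, p99)) +
  facet_wrap(~CITY)
```

Printing p draws the plot. Points above the cutoff are not drawn, but they still contribute to the smooth.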

UPDATED ANSWER: I was looking for something completely different and came across the answer I needed.

Instead of ylim(0, yMax), use coord_cartesian(ylim = c(0, yMax)).

It seems that coord_cartesian simply "zooms" the plot instead of truncating the data that goes into it.
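To see the difference concretely, here is a minimal sketch on made-up data. The toy data frame and its five outliers are assumptions for illustration, not the asker's data, and method = "lm" is used only to keep the fit simple. It compares the values the smooth layer actually computes under each approach, using layer_data() from ggplot2.

```r
library(ggplot2)

# Hypothetical toy data: values between 1 and 3, plus five large outliers.
set.seed(42)
toy <- data.frame(x = 1:100, y = runif(100, min = 1, max = 3))
toy$y[96:100] <- 50

base <- ggplot(toy, aes(x, y)) +
  geom_smooth(method = "lm", formula = y ~ x)

# Scale limits: rows with y outside [0, 20] are dropped before the fit
# (ggplot2 warns about the removed rows, so the warning is silenced here).
clipped <- suppressWarnings(layer_data(base + ylim(0, 20)))

# Coordinate limits: the fit uses every row; only the view is restricted.
zoomed <- layer_data(base + coord_cartesian(ylim = c(0, 20)))

# The fitted line is pulled upwards by the outliers only in the zoomed case.
print(range(clipped$y))
print(range(zoomed$y))
```

The two ranges differ because the ylim() version never sees the outliers, while the coord_cartesian() version fits all 100 rows and then crops the view.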

