UPDATE: I found the answer ... included it below.
I have a dataset containing the following variables and similar values:
COBSDATE    CITY  RESPONSE_TIME
2011-11-23  A     1.1
2011-11-23  A     1.5
2011-11-23  A     1.2
2011-11-23  B     2.3
2011-11-23  B     2.1
2011-11-23  B     1.8
2011-11-23  C     1.4
2011-11-23  C     6.1
2011-11-23  A     3.1
2011-11-23  A     1.1
I have successfully created a graph that displays all RESPONSE_TIME values and adds a smooth geom to further describe some of the changes.
The problem I am experiencing is that I want a closer look at the smoothed value, and one of the cities has frequent "outliers". I can control this by adding ylim(0, p99) to the graph, but then the smooth is calculated from only the subset of the data that falls within those limits.
Is there a way to use all of the data for the smooth while showing only a subset of it in the jitter layer?
My code is below (both versions are the same, except for + ylim(0,20)). Truncated:
ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) +
  geom_jitter(colour=alpha("#007DB1", 1/8)) +
  geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) +
  ylim(0,20) +
  facet_wrap(~CITY)
Full dataset:
ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) +
  geom_jitter(colour=alpha("#007DB1", 1/8)) +
  geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) +
  facet_wrap(~CITY)
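One approach that may do what is asked here is to zoom the view with coord_cartesian() instead of setting scale limits with ylim(). As far as I understand ggplot2, ylim() drops out-of-range values before the smooth is computed, whereas coord_cartesian(ylim = ...) only crops what is displayed, so the smooth should still be fitted to all of the data. The sketch below is a minimal illustration under that assumption. The dataRaw built here is a made-up stand-in (simulated values with a few injected outliers), not the real dataset, and the 0 to 20 window is taken from the code above.

```r
library(ggplot2)

# Hypothetical stand-in for dataRaw: 30 days x 3 cities of simulated values.
set.seed(1)
dataRaw <- data.frame(
  COBSDATE = rep(seq(as.Date("2011-11-01"), by = "day", length.out = 30), times = 3),
  CITY = rep(c("A", "B", "C"), each = 30),
  RESPONSE_TIME = rexp(90, rate = 1 / 2)
)
# Inject one large outlier per city (illustrative values only).
dataRaw$RESPONSE_TIME[c(5, 40, 75)] <- c(60, 45, 80)

# Same layers as the original plot, but the y range is restricted with
# coord_cartesian(), which zooms the view without discarding any rows.
p <- ggplot(dataRaw, aes(x = COBSDATE, y = RESPONSE_TIME)) +
  geom_jitter(colour = alpha("#007DB1", 1 / 8)) +
  geom_smooth(colour = "gray30", fill = alpha("gray40", 0.5)) +
  coord_cartesian(ylim = c(0, 20)) +
  facet_wrap(~CITY)

print(p)
```

The outlying points are still part of the plot data; they simply fall outside the visible window, so the jitter layer looks truncated while the smooth reflects every observation.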
Benh