I am trying to get an overview of the hourly frequency of my data with respect to the weekday. To do so, I reduced the different dates to a single day, so that only the time of day differs, and added a column that represents the day of the week as an ordered factor.
The following is an extract of my data:
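A minimal sketch of such a reduction, for readers who want to reproduce the setup. The `raw` vector, the `sketch.log` name, the reference date handling, and the explicit `tz = "UTC"` are assumptions made for illustration; the original log and the exact code used to build `my.log` are not shown here.

```r
# Hypothetical raw timestamps (assumption: the real log is not shown).
raw <- as.POSIXct(c("2012-06-04 09:15:00",
                    "2012-06-05 13:40:00",
                    "2012-06-08 23:27:00"), tz = "UTC")

# Keep only the time of day and attach it to one fixed reference date,
# so that all observations fall on the same day.
prorated <- as.POSIXct(paste("2012-06-12", format(raw, "%H:%M:%S")),
                       tz = "UTC")

# POSIXlt stores the weekday as 0 (Sunday) to 6 (Saturday);
# remap Sunday to 7 so that the week starts on Monday.
wd <- as.POSIXlt(raw)$wday
wd[wd == 0] <- 7

# Day of the week as an ordered factor, Monday first.
wday <- factor(wd, levels = 1:7,
               labels = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"),
               ordered = TRUE)

sketch.log <- data.frame(Prorated = prorated, Wday = wday)
```

Setting the time zone explicitly in both conversions keeps the time of day from being shifted by the local offset when the values are rebuilt.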
```r
my.log <- structure(list(
  Prorated = structure(c(1339535400, 1339536540, 1339524540, 1339480320,
                         1339537920, 1339529580, 1339500780, 1339532820,
                         1339522020, 1339522680, 1339465560, 1339529940,
                         1339472880, 1339508520, 1339519620, 1339536000,
                         1339526580, 1339514940, 1339518060, 1339512420,
                         1339513080, 1339500120, 1339543620, 1339485660,
                         1339496280, 1339526520, 1339514820, 1339531800,
                         1339531860, 1339501320),
                       class = c("POSIXct", "POSIXt"),
                       tzone = "%Y-%m-%d %H:%M:%S"),
  Wday = structure(c(1, 1, 1, 2, 1, 2, 2, 2, 2, 2, 3, 2, 3, 3, 3, 3, 4, 1, 1,
                     3, 3, 4, 4, 5, 5, 5, 1, 2, 2, 2),
                   .Label = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"),
                   class = c("ordered", "factor"))),
  .Names = c("Prorated", "Wday"),
  row.names = c(NA, 30),
  class = "data.frame")

range(my.log$Prorated)
# here (n = 30):
# [1] "2012-06-12 01:46:00" "2012-06-12 23:27:00"
# w/ full data set (n = approx. 75000):
# [1] "2012-06-12 00:00:00" "2012-06-12 23:59:00"
```
When I now try to plot the density with the following code ...
```r
library("ggplot2")
library("scales")

p <- ggplot(my.log) +
  theme_bw() +
  geom_density(aes(Prorated, colour=Wday)) +
  scale_color_brewer("weekday", palette="Dark2") +
  scale_x_datetime("", breaks=date_breaks("4 hours"), labels=date_format("%H:00")) +
  opts(title="Distribution (KDE)")
print(p)
```
... the x-axis does not start at 00:00 with either data set, but at 02:00, and as a result the entire density graph is shifted and spills over into the next day. (I wanted to post the image here, but since I'm new to SO, I am not allowed to. You can find it on ImageShack.)
So my question is: can `ggplot()` be told to start its density graph at 00:00?
I searched SO for related questions (and their answers), but could not find any. The only options that come to mind are `xlim()` or `scale_x_continuous(limits=...)`. But, as I understand it, neither is the right tool here.
The former will discard data points (or not, since all the data in the input data.frame is already in the correct range), while the latter will simply shift the viewport and, as a result, cut the graph off at 23:59 without adding the (now hidden) data back at the beginning. So when I use

```r
scale_x_datetime("", breaks=date_breaks("4 hours"), labels=date_format("%H:00"),
                 limits=c(as.POSIXct("2012-06-12 00:00:00"),
                          as.POSIXct("2012-06-12 23:59:00")))
```

in the above code, the graph does not look right and does not display all the data.