How to adjust x axis in ggplot density graph?

I am trying to get an hourly frequency overview of my data with respect to the weekday. To do that, I reduced the different dates to a single day, so that only the time differs, and added a column that represents the day of the week as an ordered factor.

The following is an extract of my data:

    my.log <- structure(list(
        Prorated = structure(c(
            1339535400, 1339536540, 1339524540, 1339480320, 1339537920,
            1339529580, 1339500780, 1339532820, 1339522020, 1339522680,
            1339465560, 1339529940, 1339472880, 1339508520, 1339519620,
            1339536000, 1339526580, 1339514940, 1339518060, 1339512420,
            1339513080, 1339500120, 1339543620, 1339485660, 1339496280,
            1339526520, 1339514820, 1339531800, 1339531860, 1339501320),
            class = c("POSIXct", "POSIXt"), tzone = "%Y-%m-%d %H:%M:%S"),
        Wday = structure(c(
            1, 1, 1, 2, 1, 2, 2, 2, 2, 2, 3, 2, 3, 3, 3,
            3, 4, 1, 1, 3, 3, 4, 4, 5, 5, 5, 1, 2, 2, 2),
            .Label = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"),
            class = c("ordered", "factor"))),
        .Names = c("Prorated", "Wday"), row.names = c(NA, 30), class = "data.frame")

    range(my.log$Prorated)
    # here (n = 30):
    # [1] "2012-06-12 01:46:00" "2012-06-12 23:27:00"
    # w/ full data set (n = approx. 75000):
    # [1] "2012-06-12 00:00:00" "2012-06-12 23:59:00"

When I now try to plot the density with the following code ...

    library("ggplot2")
    library("scales")

    p <- ggplot(my.log) + theme_bw() +
        geom_density(aes(Prorated, colour=Wday)) +
        scale_color_brewer("weekday", palette="Dark2") +
        scale_x_datetime("", breaks=date_breaks("4 hours"), labels=date_format("%H:00")) +
        opts(title="Distribution (KDE)")
    print(p)

... the x axis does not start at 00:00 with either dataset, but at 02:00, and as a result the entire density graph is shifted into the next day. (I wanted to post the image here, but since I'm new to SO, I am not allowed to do this. You can find it on ImageShack.)

So my question is: can ggplot() be told to start its density graph at 00:00?

I checked SO for related questions (and their answers), but could not find any. The only options that come to mind are xlim() or scale_x_continuous(limits=...). But, as I understand it, neither is the right tool here.

The former will discard data points (or not, since all the data in the input data.frame is already within the correct range), while the latter will simply shift the viewport and, as a result, cut the graph off at 23:59 without adding the (now hidden) data back at the beginning. So when I use

    scale_x_datetime("", breaks=date_breaks("4 hours"), labels=date_format("%H:00"),
        limits=c(as.POSIXct("2012-06-12 00:00:00"), as.POSIXct("2012-06-12 23:59:00")))

in the above code, the graph does not look right / does not display all the data.

1 answer

This is a time zone issue. See this related question: What is the appropriate timezone argument syntax for scale_datetime() in ggplot 0.9.0
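The effect can be reproduced without ggplot. The following is a minimal base-R sketch, assuming the plot was rendered on a machine whose local zone is two hours ahead of UTC; `Europe/Berlin` is only an illustrative stand-in, since the question does not state the actual zone. The same instant gets a different hour label depending on the zone used for formatting:

```r
# One instant in time: midnight UTC on the (collapsed) day used in the question.
midnight_utc <- as.POSIXct("2012-06-12 00:00:00", tz = "UTC")

# Formatting in UTC keeps the label at midnight.
label_utc <- format(midnight_utc, "%H:00", tz = "UTC")

# Formatting in a zone two hours ahead of UTC in June (Europe/Berlin is an
# assumed example zone, not taken from the question) moves the label to 02:00.
label_berlin <- format(midnight_utc, "%H:00", tz = "Europe/Berlin")

print(c(utc = label_utc, berlin = label_berlin))
```

If the axis labels are formatted in the local zone while the data is stored in UTC, every label is offset in the same way, which matches the axis starting at 02:00 instead of 00:00.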

You can get around this by changing the labels argument to function(x) format(x, "%H:00", tz="UTC") (or another suitable time zone). I had to alter your example data because the POSIXt column of the data frame had an incorrect tzone attribute.

    ggplot(my.log) + theme_bw() +
        geom_density(aes(Prorated, colour=Wday)) +
        scale_color_brewer("weekday", palette="Dark2") +
        scale_x_datetime("", breaks=date_breaks("4 hours"),
            labels=function(x) format(x,"%H:00",tz="UTC")) +
        opts(title="Distribution (KDE)")
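The answer does not show how the example data was altered. One plausible way, offered here as an assumption rather than the answerer's actual step, is to overwrite the invalid tzone attribute with a real zone name. The sketch below uses a hypothetical two-value stand-in for `my.log$Prorated` (the minimum and maximum of the sample) so that it runs on its own:

```r
# Hypothetical two-value stand-in for my.log$Prorated, carrying the same
# invalid tzone attribute (a format string, not a zone name) as the question.
prorated <- structure(c(1339465560, 1339543620),
                      class = c("POSIXct", "POSIXt"),
                      tzone = "%Y-%m-%d %H:%M:%S")

# Replace the invalid attribute with a real zone name. The stored instants
# (seconds since the epoch) are unchanged; only the display zone is set.
attr(prorated, "tzone") <- "UTC"

# With a valid tzone, format() uses that zone by default.
times <- format(prorated, "%H:%M")
print(times)
```

Applied to the full data frame, the equivalent line would be `attr(my.log$Prorated, "tzone") <- "UTC"`. The printed times agree with the `range()` output shown in the question.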

