R + ggplot2: how to hide missing dates from the x axis?

Say we have the following simple data frame of date/value pairs in which some dates are missing from the sequence (here, January 12 to January 14). When I plot the points, the missing dates still appear on the x axis, with no points above them. I would like these missing dates not to be displayed on the x axis at all, so that the sequence of points has no gaps. Any suggestions on how to do this? Thank you.

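    library(ggplot2)

    dts <- as.Date(c('2011-01-10', '2011-01-11', '2011-01-15', '2011-01-16'))
    df <- data.frame(dt = dts, val = seq_along(dts))

    # The original post used scale_x_date(format = '%d%b', major = 'days'),
    # the spelling from an older ggplot2; current releases use date_labels / date_breaks.
    ggplot(df, aes(dt, val)) +
      geom_point() +
      scale_x_date(date_labels = '%d%b', date_breaks = '1 day')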


3 answers

Turn the dates into a factor, then. At the moment ggplot2 treats the data exactly as you declared it: as values on a continuous date scale. You do not want that scale; you want a categorical one:

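    library(ggplot2)

    dts <- as.Date(c('2011-01-10', '2011-01-11', '2011-01-15', '2011-01-16'))
    df <- data.frame(dt = dts, val = seq_along(dts))

    # Continuous date scale (written with the date_labels / date_breaks arguments
    # of current ggplot2; the original post used format = and major =)
    ggplot(df, aes(dt, val)) +
      geom_point() +
      scale_x_date(date_labels = '%d%b', date_breaks = '1 day')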

versus

    df <- data.frame(dt = factor(format(dts, format = '%d%b')), val = seq_along(dts))
    ggplot(df, aes(dt, val)) +
      geom_point()

This produces an x axis that shows only the four dates present in the data, evenly spaced.

Is this what you wanted?
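One side effect of the categorical scale is worth knowing if you later want to connect the points: with a factor on the x axis, ggplot2 treats each level as its own group, so `geom_line()` draws nothing unless all rows are placed in one group. A minimal sketch, assuming a current ggplot2 release; the `labs` variable and the explicit `levels =` argument are my additions, not part of the answer above:

```r
library(ggplot2)

dts <- as.Date(c("2011-01-10", "2011-01-11", "2011-01-15", "2011-01-16"))

# Set the level order explicitly so it follows the dates, not the alphabet
labs <- format(dts, "%d%b")
df <- data.frame(dt = factor(labs, levels = labs), val = seq_along(dts))

# group = 1 puts every row in a single group, so the line can span factor levels
p <- ggplot(df, aes(dt, val, group = 1)) +
  geom_point() +
  geom_line()
print(p)
```

Without `group = 1`, the points still appear but the line layer is dropped with a message about each group having only one observation.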


I made a package that does this. It is called bdscale, and it is on CRAN and GitHub. Shameless plug.

Replicating your example:

    > library(bdscale)
    > library(ggplot2)
    > library(scales)
    > dts <- as.Date(c('2011-01-10', '2011-01-11', '2011-01-15', '2011-01-16'))
    > df <- data.frame(dt = dts, val = seq_along(dts))  # same data frame as in the question
    > ggplot(df, aes(x = dt, y = val)) + geom_point() +
        scale_x_bd(business.dates = dts, labels = date_format('%d%b'))


But you probably want to load a set of known valid dates and then plot your data with those valid dates on the x axis:

    > nyse <- bdscale::yahoo('SPY')  # get valid dates from SPY prices
    > dts <- as.Date('2011-01-10') + 1:10
    > df <- data.frame(dt = dts, val = seq_along(dts))
    > ggplot(df, aes(x = dt, y = val)) + geom_point() +
        scale_x_bd(business.dates = nyse, labels = date_format('%d%b'), max.major.breaks = 10)
    Warning message:
    Removed 3 rows containing missing values (geom_point).


The warning tells you that three dates were removed:

  • 15th = Saturday
  • 16th = Sunday
  • 17th = MLK Day
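
To see where the count of three comes from without downloading anything, the same filtering can be sketched in base R. This is only an illustration: `holidays` is a hand-written, hypothetical stand-in for the real exchange calendar that the answer loads through bdscale, and the weekend test assumes a Monday-to-Friday trading week.

```r
# Same ten dates as in the answer: 11 Jan 2011 through 20 Jan 2011
dts <- as.Date("2011-01-10") + 1:10

# Assumption: a hand-made holiday list standing in for a real trading calendar
holidays <- as.Date("2011-01-17")  # MLK Day

# "%u" is the ISO weekday number (1 = Monday, ..., 7 = Sunday)
is_weekend <- format(dts, "%u") %in% c("6", "7")
is_holiday <- dts %in% holidays

dropped <- dts[is_weekend | is_holiday]
kept <- dts[!(is_weekend | is_holiday)]

format(dropped)  # the dates a business-day axis has no slot for
length(kept)     # how many points remain on the plot
```

A real calendar has many more holidays than this, which is why loading the valid dates from actual price data is the more reliable route.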

First question: why do you want to do this? It makes little sense to draw a coordinate-based graph if your axes are not really coordinates. If you do want to do it, you can convert the dates to a factor. Be careful with the ordering, though:

    dts <- as.Date(c('31-10-2011', '01-11-2011', '02-11-2011', '05-11-2011'), format = '%d-%m-%Y')
    dtsf <- format(dts, format = '%d%b')
    # Pass the levels explicitly so they keep the chronological order of dtsf
    df <- data.frame(dt = ordered(dtsf, levels = dtsf), val = seq_along(dts))
    ggplot(df, aes(dt, val)) +
      geom_point()


You must be careful with factors, because the order of the levels does not follow your data unless you specify it, for example by making it an ordered factor with explicit levels. Factor levels are sorted alphabetically by default, so some date formats will end up out of chronological order. If you do not take care of the order, you get:

    df <- data.frame(dt = factor(dtsf), val = seq_along(dts))
    ggplot(df, aes(dt, val)) +
      geom_point()

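
The ordering problem can be checked without plotting anything, and `levels = dtsf` only works when the dates are already sorted and unique. Here is a rough sketch of a helper that handles the general case; `date_factor` is a hypothetical name of my own, not a function from base R or ggplot2:

```r
# Hypothetical helper: turn a Date vector into a factor whose levels follow
# calendar order instead of alphabetical order.
date_factor <- function(dates, fmt = "%d%b") {
  labels <- format(dates, fmt)
  # Sort the distinct dates first and format afterwards, so the levels are
  # chronological; unique() guards against formats that map two dates to one label
  lvls <- unique(format(sort(unique(dates)), fmt))
  factor(labels, levels = lvls)
}

# Unsorted input with one repeated date
dts <- as.Date(c("2011-11-05", "2011-10-31", "2011-11-01", "2011-11-02", "2011-11-01"))

f_alpha <- factor(format(dts, "%d%b"))  # default: levels sorted alphabetically
f_chrono <- date_factor(dts)            # levels sorted by date

levels(f_alpha)   # starts with the 01 label, so 31 October ends up last
levels(f_chrono)  # starts with the 31 October label
```

Note that a label format without the year, such as `'%d%b'`, collapses the same day in different years into one level, so include `%Y` in `fmt` if the data spans more than a year.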
