How can I get the year and month when the day is not valid without fixing the day itself?

I have some data that look something like this:

    require(zoo)
    X <- rbind(c(date='20111001', fmt='%Y%m%d'),
               c('20111031', '%Y%m%d'),
               c('201110', '%Y%m'),
               c('102011', '%m%Y'),
               c('31/10/2011', '%d/%m/%Y'),
               c('20111000', '%Y%m%d'))
    print(X)
    #      date         fmt
    # [1,] "20111001"   "%Y%m%d"
    # [2,] "20111031"   "%Y%m%d"
    # [3,] "201110"     "%Y%m"
    # [4,] "102011"     "%m%Y"
    # [5,] "31/10/2011" "%d/%m/%Y"
    # [6,] "20111000"   "%Y%m%d"

I only want the year and the month. I don't need the day, so I don't care that the day in the last row is invalid. R, unfortunately, does:

    mapply(as.yearmon, X[, 'date'], X[, 'fmt'], SIMPLIFY=FALSE)
    # $`20111001`
    # [1] "Oct 2011"
    # $`20111031`
    # [1] "Oct 2011"
    # $`201110`
    # [1] "Oct 2011"
    # $`102011`
    # [1] "Oct 2011"
    # $`31/10/2011`
    # [1] "Oct 2011"
    # $`20111000`
    # Error in charToDate(x) :
    #   character string is not in a standard unambiguous format

I know that the usual answer is to fix the day part of the date, for example by using paste(x, '01', sep=''). I don't think that will work here, because I don't know in advance what the date format will be, so I cannot set the day without first converting to some date object.

+4
3 answers

Assuming that I am always given a date (and never a time), and that any invalid "day" is less than 61, I can guarantee a valid date as follows: treat the supplied day as "seconds" (%S) and prepend a day of 01.

    require(stringr)
    safe_date <- str_c('01', X[, 'date'])
    safe_fmt <- str_c('%d', str_replace(X[, 'fmt'], '%d', '%S'))
    mapply(as.yearmon, safe_date, safe_fmt, SIMPLIFY=FALSE)
    # $`0120111001`
    # [1] "Oct 2011"
    # $`0120111031`
    # [1] "Oct 2011"
    # $`01201110`
    # [1] "Oct 2011"
    # $`01102011`
    # [1] "Oct 2011"
    # $`0131/10/2011`
    # [1] "Oct 2011"
    # $`0120111000`
    # [1] "Oct 2011"
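The same idea can be sketched in base R only, without stringr or zoo. The helper name year_month and the "YYYY-MM" output are assumptions made for this illustration, not part of the answer above; the sketch relies on strptime recycling both the input and the format vectors.

```r
# Hypothetical helper, not part of the original answer.
year_month <- function(date, fmt) {
  # Parse any supplied day as seconds instead; %S accepts 00-61.
  safe_fmt <- paste0("%d", sub("%d", "%S", fmt, fixed = TRUE))
  # Prepend a day that is valid in every month.
  safe_date <- paste0("01", date)
  # Locale-independent "YYYY-MM" string (an assumption of this sketch).
  format(as.Date(safe_date, format = safe_fmt), "%Y-%m")
}

dates <- c("20111001", "20111031", "201110", "102011", "31/10/2011", "20111000")
fmts  <- c("%Y%m%d", "%Y%m%d", "%Y%m", "%m%Y", "%d/%m/%Y", "%Y%m%d")
result <- year_month(dates, fmts)
print(result)
```

Formats that contain no %d are left unchanged apart from the prepended day, so mixed inputs like the ones in the question can go through the same call.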
0

Assuming the month always follows the year and always has two characters in your dates, why not just extract the information using substr? Perhaps something like:

    lapply(X[, 'date'], function(x)
      paste(month.abb[as.numeric(substr(x, 5, 6))], substr(x, 1, 4))
    )
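Since substr and paste are vectorized, the same extraction can be written without lapply. The sketch below uses only the year-first rows from the question, because the positional assumption does not hold for '102011' or '31/10/2011'; the variable names are illustrative.

```r
# Year-first rows only: characters 1-4 are the year, 5-6 the month.
dates <- c("20111001", "20111031", "201110", "20111000")

year  <- substr(dates, 1, 4)
month <- as.integer(substr(dates, 5, 6))

# month.abb is R's built-in vector of English month abbreviations.
ym <- paste(month.abb[month], year)
print(ym)
```

No date parsing happens here, so an invalid day such as 00 never comes into play.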
+5

You do not need to specify the day in your format if you do not need it. Read ?strptime carefully. The second paragraph of the Details section says:

Each input string is processed as far as necessary for the format specified: any trailing characters are ignored.

So adjust your format to stop after the month and everything should work.

    X <- rbind(c(date='20111001', fmt='%Y%m'),
               c('20111031', '%Y%m'),
               c('201110', '%Y%m'),
               c('102011', '%m%Y'),
               c('20111000', '%Y%m'))
    mapply(as.yearmon, X[, 'date'], X[, 'fmt'], SIMPLIFY=FALSE)
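The trailing-character rule can also be checked in base R. One caveat for this sketch: base as.Date returns NA when a month is given without a day, so a fixed day of 01 is prepended here; that step is an assumption of the sketch, not something the answer above needs with as.yearmon.

```r
# Year-first inputs, including the one with the invalid day 00.
x <- c("20111001", "20111031", "20111000")

# The format stops after the month, so the trailing day digits are ignored.
# The leading "01" supplies the day that base as.Date() requires.
d <- as.Date(paste0("01", x), format = "%d%Y%m")

# Locale-independent year-month strings.
ym <- format(d, "%Y-%m")
print(ym)
```

This only helps when the day comes last in the input; a leading day, as in '31/10/2011', is not a trailing character and would still be parsed.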
+3
