Why does dmy() in the lubridate package not work with NA? What is a good workaround?

I came across a peculiar behavior in the lubridate package: dmy(NA) throws an error instead of just returning NA. This causes problems when I want to convert a column in which some elements are NA and the rest are date strings that normally convert without trouble.

Here is a minimal example:

    library(lubridate)
    df <- data.frame(ID=letters[1:5],
                     Datum=c("01.01.1990", NA, "11.01.1990", NA, "01.02.1990"))
    df_copy <- df

    #Question 1: Why does dmy(NA) not return NA, but throw an error?
    df$Datum <- dmy(df$Datum)
    Error in function (..., sep = " ", collapse = NULL)  : invalid separator
    df <- df_copy

    #Question 2: What is a good workaround?
    #1. Idea: Only convert those elements that are not NA
    #RHS works, but assigning that to the LHS doesn't work (most likely problem:
    #column "Datum" is still of class factor, while the RHS is of class POSIXct)
    df[!is.na(df$Datum), "Datum"] <- dmy(df[!is.na(df$Datum), "Datum"])
    Using date format %d.%m.%Y.
    Warning message:
    In `[<-.factor`(`*tmp*`, iseq, value = c(NA_integer_, NA_integer_,  :
      invalid factor level, NAs generated
    df #Only NAs, apparently a problem with the class of column "Datum"
      ID Datum
    1  a  <NA>
    2  b  <NA>
    3  c  <NA>
    4  d  <NA>
    5  e  <NA>
    df <- df_copy

    #2. Idea: Use mapply and apply dmy only to those elements that are not NA
    df[, "Datum"] <- mapply(function(x) {
      if (is.na(x)) {
        return(NA)
      } else {
        return(dmy(x))
      }
    }, df$Datum)
    df #Meaningless numbers returned instead of date objects
      ID     Datum
    1  a 631152000
    2  b        NA
    3  c 632016000
    4  d        NA
    5  e 633830400

To summarize, I have two questions: 1) Why does dmy(NA) not work? Judging by most other functions, I would expect good programming practice to be that any conversion (e.g. dmy()) of NA returns NA again (just as 2 + NA does). 2) If this behavior is intended, how do I run a data.frame column that contains NAs through the dmy() function?

+8
r lubridate
2 answers

The error "Error in function (..., sep = " ", collapse = NULL) : invalid separator" is raised inside lubridate:::guess_format(). NA ends up being passed as sep in a call to paste(), specifically in fmts <- unlist(mlply(with_seps, paste)). You could patch lubridate:::guess_format() to fix this.
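To see the underlying failure without lubridate, here is a minimal base-R sketch. It only reproduces the paste() behaviour described above; the variable names (res, is_error, ok) are made up for illustration, and it does not claim to mirror what guess_format() does internally.

```r
# paste() requires sep to be a single, non-missing character string.
# Passing NA as the separator therefore raises an error.
res <- tryCatch(
  paste("01", "01", "1990", sep = NA),
  error = function(e) e
)
is_error <- inherits(res, "error")
print(is_error)

# With a valid separator the same call works as usual.
ok <- paste("01", "01", "1990", sep = ".")
print(ok)
```

So any code path that derives a separator from an NA input and hands it to paste() will fail in this way.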

Otherwise, could you just change the NA values to character strings ("NA")?

    require(lubridate)
    df <- data.frame(ID=letters[1:5],
                     Datum=c("01.01.1990", "NA", "11.01.1990", "NA", "01.02.1990")) #NAs are quoted
    df_copy <- df
    df$Datum <- dmy(df$Datum)
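If you would rather keep real NA values, the "only convert the non-NA elements" idea from the question can be made to work by converting the factor to character first and filling a pre-allocated date vector. The following is only a sketch: convert_non_na is a hypothetical helper, and it uses base as.Date as a stand-in parser so that it runs without lubridate. Passing parse = dmy instead is an untested assumption.

```r
# Explicitly create a factor column, as in the question
# (stringsAsFactors defaults to FALSE since R 4.0.0).
df <- data.frame(ID = letters[1:5],
                 Datum = c("01.01.1990", NA, "11.01.1990", NA, "01.02.1990"),
                 stringsAsFactors = TRUE)

# Hypothetical helper: parse only the non-NA entries, leave the rest as NA.
convert_non_na <- function(x,
                           parse = function(s) as.Date(s, format = "%d.%m.%Y")) {
  x <- as.character(x)                 # drop the factor class first
  out <- rep(as.Date(NA), length(x))   # Date vector pre-filled with NA
  ok <- !is.na(x)
  out[ok] <- parse(x[ok])              # the parser never sees an NA
  out
}

# Replace the whole column, so its class changes from factor to Date.
df$Datum <- convert_non_na(df$Datum)
print(df)
```

Replacing the whole column at once avoids the "invalid factor level" warning, which comes from assigning dates into a column that is still a factor.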
+5

Since your dates are in a fairly straightforward format, it would be much easier to use as.Date and specify the appropriate format argument:

    df$Date <- as.Date(df$Datum, format="%d.%m.%Y")
    df
      ID      Datum       Date
    1  a 01.01.1990 1990-01-01
    2  b       <NA>       <NA>
    3  c 11.01.1990 1990-01-11
    4  d       <NA>       <NA>
    5  e 01.02.1990 1990-02-01

For a list of the format codes used by as.Date, see ?strptime.
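As a quick illustration of those format codes, here is a small base-R sketch (the variable names are made up for the example): %d is the day of the month, %m the month number, %Y the four-digit year and %y the two-digit year.

```r
# Day.month.year with a four-digit year; NA input simply stays NA.
d <- as.Date(c("01.02.1990", NA), format = "%d.%m.%Y")
print(d)

# The same date written with a two-digit year needs %y instead of %Y.
short <- as.Date("01.02.90", format = "%d.%m.%y")
print(short)

# The same codes work in the other direction with format().
out <- format(d[1], "%Y/%m/%d")
print(out)
```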

+2
