Date-time difference in `as.POSIXct` with Excel

My actual data looks like

 8/8/2013 15:10
 7/26/2013 10:30
 7/11/2013 14:20
 3/28/2013 16:15
 3/18/2013 15:50

When I read it from an Excel file, R reads the values as

 41494.63 41481.44 41466.60 41361.68 41351.66 

So I used `as.POSIXct(as.numeric(x[1:5])*86400, origin="1899-12-30", tz="GMT")` and got

 2013-08-08 15:07:12 GMT
 2013-07-26 10:33:36 GMT
 2013-07-11 14:24:00 GMT
 2013-03-28 16:19:12 GMT
 2013-03-18 15:50:24 GMT

Why is there a time difference, and how can I overcome it?

+5
4 answers

The problem is that either R or Excel rounds the number to two decimal places. When you convert a cell such as 8/8/2013 15:10 to text format (in Excel on Mac OS X), you get the number 41494.63194.

Using:

 as.POSIXct(41494.63194*86400, origin="1899-12-30",tz="GMT") 

gives you:

 [1] "2013-08-08 15:09:59 GMT" 

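This is 1 second off from the original time, which indicates that 41494.63194 is itself rounded to five decimal places. The serial number counts days, so two decimal places only resolve 0.01 day = 864 seconds, which is why the values in the question are off by several minutes.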
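To get a feel for how much each level of rounding costs, here is a minimal sketch. It assumes the original times were entered to the whole minute; the helper name `excel_to_posixct` is made up for illustration. Snapping to the nearest minute only recovers the original time when the serial number keeps enough decimals (five are plenty, two are not).

```r
# Hypothetical helper: convert Excel serial dates (1900 date system, the
# default in Excel on Windows) to POSIXct, snapping to the nearest minute.
excel_to_posixct <- function(serial, tz = "GMT") {
  secs <- round(serial * 86400 / 60) * 60   # seconds since 1899-12-30, whole minutes
  as.POSIXct(secs, origin = "1899-12-30", tz = tz)
}

excel_to_posixct(41494.63194)  # five decimals: snaps back to 15:10:00
excel_to_posixct(41494.63)     # two decimals: the minutes are already lost

# Size of the error introduced by rounding to two decimals, in seconds
abs(41494.63 - 41494.63194) * 86400
```

If the values arrive in R with only two decimals, no amount of post-processing can restore them; the fix has to happen where the data is read.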

Probably the best solution is to export your Excel file to a .csv or tab-delimited .txt file and then read that into R. This at least gives me the correct dates:

 > df
             datum
 1  8/8/2013 15:10
 2 7/26/2013 10:30
 3 7/11/2013 14:20
 4 3/28/2013 16:15
 5 3/18/2013 15:50
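A minimal sketch of that workflow, with a temporary file standing in for the exported CSV. The column name `datum` is taken from the output above; the file itself is made up for the example.

```r
# Stand-in for a CSV exported from Excel; in practice, use the path to your own file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("datum",
             "8/8/2013 15:10",
             "7/26/2013 10:30",
             "7/11/2013 14:20",
             "3/28/2013 16:15",
             "3/18/2013 15:50"),
           csv_path)

# Read the timestamps as plain text so nothing is converted or rounded on the way in
df <- read.csv(csv_path, stringsAsFactors = FALSE)

# Parse the month/day/year strings into POSIXct
df$datum <- as.POSIXct(df$datum, format = "%m/%d/%Y %H:%M", tz = "GMT")
df
```

Because the export carries the formatted text rather than the serial number, the minutes survive intact.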
+5

Given

 x <- c("8/8/2013 15:10","7/26/2013 10:30","7/11/2013 14:20","3/28/2013 16:15","3/18/2013 15:50") 

(read in as a character vector), try

 x <- as.POSIXct(x, format = "%m/%d/%Y %H:%M", tz = "GMT") 

It reads correctly as a POSIXct vector for me.

+3

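Perhaps this is a question of how R reads the data. A simple lubridate example seems to work well. Note that the data is month/day/year, so the parser to use is `mdy_hm`, not `dmy_hm`; the two only agree on ambiguous values such as 8/8.

 library(lubridate)
 x <- "8/8/2013 15:10"
 mdy_hm(x, tz = "GMT")
 [1] "2013-08-08 15:10:00 GMT"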
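Whichever parser is used, the order of day and month has to match the data, which here is month/day/year. A short base R sketch of what goes wrong otherwise (shown with `as.POSIXct` rather than lubridate so that it runs without extra packages):

```r
x <- c("7/26/2013 10:30", "7/11/2013 14:20")

# Correct order: month/day/year
ok <- as.POSIXct(x, format = "%m/%d/%Y %H:%M", tz = "GMT")
ok

# Wrong order: day/month/year. "7/26" has no valid month 26 and becomes NA,
# while "7/11" is silently read as 7 November instead of 11 July.
bad <- as.POSIXct(x, format = "%d/%m/%Y %H:%M", tz = "GMT")
bad
```

The silent case is the dangerous one: a value that parses under both orders gives no warning at all.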
+2

Here's how it works on a Windows system. Here's what the original Excel 2010 file looks like:

 date              num         secs         constant    Rtime
 (mm/dd/yyyy)      (in Excel)  (num*86400)  (Windows)   (secs-constant)
 08/08/2013 15:10  41494.63    3585136200   2209161600  1375974600
 07/26/2013 10:30  41481.44    3583996200   2209161600  1374834600
 11/07/2013 14:20  41585.60    3592995600   2209161600  1383834000
 03/28/2013 16:15  41361.68    3573648900   2209161600  1364487300
 03/18/2013 15:50  41351.66    3572783400   2209161600  1363621800

 Rtime <- c(1375974600,1374834600,1383834000,1364487300,1363621800)
 as.POSIXct(Rtime,origin="1970-01-01",tz="GMT")
 #[1] "2013-08-08 15:10:00 GMT" "2013-07-26 10:30:00 GMT"
 #[3] "2013-11-07 14:20:00 GMT" "2013-03-28 16:15:00 GMT"
 #[5] "2013-03-18 15:50:00 GMT"

Why this constant? First, because Excel and Office are generally messy when dealing with dates. Seriously, look here: Why is 1899-12-30 the zero date in Access / SQL Server instead of 12/31?

2209161600 is the number of seconds between 1899-12-30, the zero point of Excel's date system on Windows, and 1970-01-01, the origin of POSIXct.

 dput(as.POSIXct(2209161600,origin="1899-12-30",tz="GMT"))
 #structure(0, tzone = "GMT", class = c("POSIXct", "POSIXt"))
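The constant does not have to be typed in by hand. A sketch of deriving it in R and applying it to the `secs` column from the table above (the variable names are illustrative):

```r
# Days between Excel's zero date on Windows and the POSIXct origin
offset_days <- as.numeric(as.Date("1970-01-01") - as.Date("1899-12-30"))
offset_secs <- offset_days * 86400   # 25569 days = 2209161600 seconds

# secs column from the table (Excel serial * 86400, at full precision)
secs <- c(3585136200, 3583996200, 3592995600, 3573648900, 3572783400)

rtime <- as.POSIXct(secs - offset_secs, origin = "1970-01-01", tz = "GMT")
rtime
```

This is equivalent to passing `origin = "1899-12-30"` directly; spelling out the offset just makes the arithmetic visible.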
+1