I have a timestamp, a Timelocal column in my data, which is formatted as follows:
2015-08-24T00:02:03.000Z
I usually use the following line to convert this format to convert it to a date format that I can use.
timestamp2 = "2015-08-24T00:02:03.000Z" timestamp2_formatted = strptime(timestamp2,"%Y-%m-%dT%H:%M:%S",tz="UTC")
This works great on my machine. The problem is that I am now working with a much larger amount of data. This is in a Redshift cluster, and I access it using the RPostgreSQL package. I use dplyr to manage data, as online documentation shows that it works great with RPostgreSQL.
It seems to be with the exception of converting the date format. I would like to convert character format to time format. Timelocal it was read in Redshift as "varchar". So R interprets it as a character field.
I tried the following:
library(dplyr) library(RPostgreSQL) library(lubridate)
try 1 - using simple dplyr syntax
mutate(elevate, timelocalnew = fast_strptime(timelocal, "%Y-%m-%dT%H:%M:%S",tz="UTC"))
try 2 - using dplyr syntax from another online link code
elevate %>% mutate(timelocalnew = timelocal %>% fast_strptime("%Y-%m-%dT%H:%M:%S",tz="UTC") %>% as.character()) %>% filter(!is.na(timelocalnew))
try 3 - using strptime instead of fast_strptime
elevate %>% mutate(timelocalnew = timelocal %>% strptime("%Y-%m-%dT%H:%M:%S",tz="UTC") %>% as.character()) %>% filter(!is.na(timelocalnew))
I am trying to adapt the code from here: http://www.markhneedham.com/blog/2014/12/08/r-dplyr-mutate-with-strptime-incompatible-sizewrong-result-size/
My attempts are wrong because:
Error in postgresqlExecStatement(conn, statement, ...) : RS-DBI driver: (could not Retrieve the result : ERROR: syntax error at or near "AS" LINE 1: ...CAST(STRPTIME("timelocal", '%YSuccess2048568264T%H%M ', 'UTC' AS "tz") A... ^ ) In addition: Warning messages: 1: In postgresqlQuickSQL(conn, statement, ...) : Could not create executeSELECT count(*) FROM (SELECT "timelocal", "timeutc", "zipcode", "otherdata", "country", CAST(STRPTIME("timelocal", '%Y%m%dT%H%M%S', 'UTC' AS "tz") AS TEXT) AS "timelocalnew" FROM "data") AS "master" 2: Named arguments ignored for SQL STRPTIME
It would seem strptime is incompatible with RPostgreSQL. Is this the correct interpretation? If so, does this mean that there is no means of processing date formats inside R if the data is on Redshift? I checked the RPostgreSQL package documentation and did not see anything related to specifying temporary formats.
Any advice on getting date columns formatted correctly using dplyr and RpostgreSQL would please.