How to convert character date time to use with dplyr and RPostgreSQL?

I have a timestamp, a Timelocal column in my data, which is formatted as follows:

 2015-08-24T00:02:03.000Z 

I usually use the following line to convert this format to convert it to a date format that I can use.

 timestamp2 = "2015-08-24T00:02:03.000Z" timestamp2_formatted = strptime(timestamp2,"%Y-%m-%dT%H:%M:%S",tz="UTC") # also works for dataframes (my main use of it) df$TimeNew = strptime(df$TimeLocal,"%Y-%m-%dT%H:%M:%S",tz="UTC") 

This works great on my machine. The problem is that I am now working with a much larger amount of data. This is in a Redshift cluster, and I access it using the RPostgreSQL package. I use dplyr to manage data, as online documentation shows that it works great with RPostgreSQL.

It seems to be with the exception of converting the date format. I would like to convert character format to time format. Timelocal it was read in Redshift as "varchar". So R interprets it as a character field.

I tried the following:

 library(dplyr) library(RPostgreSQL) library(lubridate) 

try 1 - using simple dplyr syntax

 mutate(elevate, timelocalnew = fast_strptime(timelocal, "%Y-%m-%dT%H:%M:%S",tz="UTC")) 

try 2 - using dplyr syntax from another online link code

 elevate %>% mutate(timelocalnew = timelocal %>% fast_strptime("%Y-%m-%dT%H:%M:%S",tz="UTC") %>% as.character()) %>% filter(!is.na(timelocalnew)) 

try 3 - using strptime instead of fast_strptime

 elevate %>% mutate(timelocalnew = timelocal %>% strptime("%Y-%m-%dT%H:%M:%S",tz="UTC") %>% as.character()) %>% filter(!is.na(timelocalnew)) 

I am trying to adapt the code from here: http://www.markhneedham.com/blog/2014/12/08/r-dplyr-mutate-with-strptime-incompatible-sizewrong-result-size/

My attempts are wrong because:

 Error in postgresqlExecStatement(conn, statement, ...) : RS-DBI driver: (could not Retrieve the result : ERROR: syntax error at or near "AS" LINE 1: ...CAST(STRPTIME("timelocal", '%YSuccess2048568264T%H%M     ', 'UTC' AS "tz") A... ^ ) In addition: Warning messages: 1: In postgresqlQuickSQL(conn, statement, ...) : Could not create executeSELECT count(*) FROM (SELECT "timelocal", "timeutc", "zipcode", "otherdata", "country", CAST(STRPTIME("timelocal", '%Y%m%dT%H%M%S', 'UTC' AS "tz") AS TEXT) AS "timelocalnew" FROM "data") AS "master" 2: Named arguments ignored for SQL STRPTIME 

It would seem strptime is incompatible with RPostgreSQL. Is this the correct interpretation? If so, does this mean that there is no means of processing date formats inside R if the data is on Redshift? I checked the RPostgreSQL package documentation and did not see anything related to specifying temporary formats.

Any advice on getting date columns formatted correctly using dplyr and RpostgreSQL would please.

+7
datetime r postgresql amazon-redshift dplyr
source share
2 answers

Is the following being done?

 as.Date(strptime(timelocal,format = "%YYYY/%MM/%DD %H:%M:%OS"),tz="UTC") 
0
source share

Traditional R functions will not work here.
You should go with the SQL translation that is being developed in the latest versions of dplyr and dbplyr .
The following worked for me:

 library(dbplyr) mutate(date = to_date(timestamp2, 'YYYY-MM-DD')) 

Notice I'm using AWS Redshift.

0
source share

All Articles