All the dates that I've manipulated in the Execute R module in Azure Machine Learning are written out empty: the date columns exist in the output, but they contain no values.
The source variables containing the date information that I read into the data frame come in two different date formats. They look like this:
usage$Date1=c('8/6/2015', '8/20/2015', '7/9/2015')
usage$Date2=c('4/16/2015 0:00', '7/1/2015 0:00', '7/1/2015 0:00')
I checked the log file in AML, and AML could not determine the local time zone. Warnings in the log file:
[ModuleOutput] 1: In strptime(x, format, tz = tz):
[ModuleOutput] cannot determine the current time zone "C":
[ModuleOutput] set the TZ environment variable
[ModuleOutput]
[ModuleOutput] 2: In strptime(x, format, tz = tz): unknown timezone "localtime"
I consulted another answer about setting the default time zone for strptime: "unknown timezone name in R strptime / as.POSIXct".
I modified my code to set the TZ environment variable explicitly:
Sys.setenv(TZ='GMT')
#### Data frame usage cleanup, format and labeling
usage<-as.data.frame(usage)
# Date1: character -> POSIXct -> formatted string -> Date
usage$Date1<-as.character(usage$Date1)
usage$Date1<-as.POSIXct(usage$Date1, format="%m/%d/%Y", tz="GMT")
usage$Date1<-format(usage$Date1, "%m/%d/%Y")
usage$Date1<-as.Date(usage$Date1, "%m/%d/%Y")
usage<-as.data.frame(usage)
# Date2: same steps; the trailing " 0:00" is ignored by the format
usage$Date2<-as.POSIXct(usage$Date2, format="%m/%d/%Y", tz="GMT")
usage$Date2<-format(usage$Date2, "%m/%d/%Y")
usage$Date2<-as.Date(usage$Date2, "%m/%d/%Y")
usage<-as.data.frame(usage)
The problem persists: the Azure ML output does not contain the values of these variables, and the columns are written as blanks.
(This code works in RStudio, where I assume the local time zone is taken from the system.)
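For reference, the parsing step on its own can be checked outside Azure ML with a small self-contained sketch. The data frame is rebuilt from the sample values above; the name usage_sample is only illustrative, and passing the format by name is my assumption about what the original calls intend.

```r
# Illustrative sample rebuilt from the values shown above
# (usage_sample is a hypothetical name, not the real data frame).
usage_sample <- data.frame(
  Date1 = c("8/6/2015", "8/20/2015", "7/9/2015"),
  Date2 = c("4/16/2015 0:00", "7/1/2015 0:00", "7/1/2015 0:00"),
  stringsAsFactors = FALSE
)

# Set the time zone explicitly so strptime does not rely on the host's local setting.
Sys.setenv(TZ = "GMT")

# Parse both columns; the trailing " 0:00" in Date2 is ignored by the "%m/%d/%Y" format.
usage_sample$Date1 <- as.Date(as.POSIXct(usage_sample$Date1, format = "%m/%d/%Y", tz = "GMT"))
usage_sample$Date2 <- as.Date(as.POSIXct(usage_sample$Date2, format = "%m/%d/%Y", tz = "GMT"))

# Inspect the resulting column classes and values.
str(usage_sample)
```

If both columns show up as Date with no NA values, the parsing itself is not what empties the columns.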
After reading two blog posts on this issue, it seems that Azure ML does not support some date formats:
http://blogs.msdn.com/b/andreasderuiter/archive/2015/02/03/troubleshooting-error-1000-rpackage-library-exception-failed-to-convert-robject-to-dataset-when-running-r-scripts-in-azure-ml.aspx
http://www.mikelanzetta.com/2015/01/data-cleaning-with-azureml-and-r-dates/
So I tried converting to POSIXct before sending the data to the output stream, as follows:
tenantusage$Date1 = as.POSIXct(tenantusage$Date1, format = "%m/%d/%Y", tz = "EST5EDT");
tenantusage$Date2 = as.POSIXct(tenantusage$Date2, format = "%m/%d/%Y", tz = "EST5EDT");
But I ran into the same problem. The values of these variables are still not written to the output: the Date1 and Date2 columns are empty.
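One thing I have not verified is whether handing the dates back as plain text avoids the problem. The sketch below is untested in Azure ML and rests on the assumption that character columns pass through the output conversion more reliably than date classes; the data frame out is a hypothetical stand-in, and the maml.mapOutputPort call is shown only as a comment.

```r
# Hypothetical stand-in for the data frame returned by the script.
out <- data.frame(Date1 = as.Date(c("2015-08-06", "2015-08-20", "2015-07-09")))

# Find columns holding Date or POSIXct values and store them as ISO 8601 text instead.
is_date_col <- vapply(out, function(col) inherits(col, c("Date", "POSIXct")), logical(1))
out[is_date_col] <- lapply(out[is_date_col], function(col) format(col, "%Y-%m-%d"))

# Inspect the result; Date1 should now be a character column.
str(out)

# Assumption: inside Azure ML the data frame would then be sent to the output port,
# e.g. maml.mapOutputPort("out")
```

The dates would then have to be parsed again downstream, so this is a workaround rather than a fix.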
Please advise!
Thanks.