Oct 2014 update: now in v1.9.5
fread now accepts dec=',' (and other non-period separators), #917. A new paragraph has been added to ?fread. If you are in a country that uses dec=',' then it should just work. If not, you will need to read that paragraph for an extra step. If it somehow breaks dec='.', this new feature can be turned off with options(datatable.fread.dec.experiment=FALSE).
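As a minimal sketch of the update above, assuming an installed data.table version whose fread has the dec argument, a semicolon-separated input with comma decimals can be read like this (the sample data is made up for illustration):

```r
library(data.table)

# Semicolon-separated fields, comma as the decimal mark.
# The input string contains "\n", so fread treats it as data, not a file name.
DT <- fread("A;B\n3,14;123\n4,22;456\n", sep = ";", dec = ",")

# Column A should come back as numeric and column B as integer.
str(DT)
```

If A comes back as character instead, the extra step described in ?fread for your locale probably applies.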
Previous answer ...
Matt Dowle found a good solution using locales. First, my sessionInfo:
    sessionInfo()
    R version 2.15.2 (2012-10-26)
    Platform: i386-w64-mingw32/i386 (32-bit)

    locale:
    [1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252  LC_MONETARY=French_France.1252  LC_NUMERIC=C
    [5] LC_TIME=C
    ...
Here is the culprit:
    Sys.localeconv()["decimal_point"]
    decimal_point
              "."
Setting LC_NUMERIC worked on both Ubuntu (Matthew) and WinXP (me):
    Sys.setlocale("LC_NUMERIC", "French_France.1252")
    [1] "French_France.1252"
    Message d'avis :
    In Sys.setlocale("LC_NUMERIC", "French_France.1252") :
      changer 'LC_NUMERIC' peut résulter en un fonctionnement étrange de R
With the locale set, fread behaves as wanted:
    DT = fread("A,B\n3,14;123\n4,22;456\n",sep=";")
    str(DT)
    Classes 'data.table' and 'data.frame':  2 obs. of  2 variables:
     $ V1: num  3.14 4.22
     $ V2: int  123 456
Values that use "." as the decimal separator are now read as character strings (as they should be); before, it was the other way around.
    DT = fread("A,B\n3.14;123\n4.22;456\n",sep=";")
    str(DT)
    Classes 'data.table' and 'data.frame':  2 obs. of  2 variables:
     $ V1: chr  "3.14" "4.22"
     $ V2: int  123 456
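Since R itself warns that changing LC_NUMERIC may make it behave strangely, one option is to change it only for the duration of the read and restore it afterwards. The helper below is a sketch of that idea. with_numeric_locale is a hypothetical name, not part of data.table or base R, and the locale string you pass depends on your platform.

```r
# Hypothetical helper: evaluate `expr` with LC_NUMERIC temporarily set to
# `locale`, then restore whatever value was active before the call.
with_numeric_locale <- function(locale, expr) {
  old <- Sys.getlocale("LC_NUMERIC")
  # Restore the previous setting on exit, even if `expr` fails.
  on.exit(suppressWarnings(Sys.setlocale("LC_NUMERIC", old)), add = TRUE)
  # Sys.setlocale() returns "" when the requested locale is not available.
  res <- suppressWarnings(Sys.setlocale("LC_NUMERIC", locale))
  if (identical(res, "")) {
    stop("locale not available on this system: ", locale)
  }
  expr  # lazily evaluated argument is forced here, under the new locale
}

# Trivial demonstration with the always-available "C" locale.
before <- Sys.getlocale("LC_NUMERIC")
res <- with_numeric_locale("C", 1 + 1)
```

On the Windows session shown above, the call would presumably look like with_numeric_locale("French_France.1252", fread("A,B\n3,14;123\n4,22;456\n", sep=";")), leaving LC_NUMERIC at its original value once the table has been read.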
statquant