Can fread from "data.table" be forced to accept "." as the value for sep?
I am trying to use fread to speed up the concat.split functions in "splitstackshape". See this Gist for the general approach I am taking, and this question for why I want to make the switch.
The problem I am facing is handling a dot (".") as the value for sep. Whenever I do this, I get an "unexpected character" error.
The following simplified example demonstrates the problem.
    library(data.table)
    y <- paste("192.168.1.", 1:10, sep = "")
    x1 <- tempfile()
    writeLines(y, x1)
    fread(x1, sep = ".", header = FALSE)
    # Error in fread(x1, sep = ".", header = FALSE) :
    #   Unexpected character (
    # 192) ending field 2 of line 1
The workaround in my current function is to replace the "." with another character that is hopefully not present in the source data, say "|". This seems risky to me because I cannot predict what is in someone else's dataset. Here is the workaround in action.
    x2 <- tempfile()
    z <- gsub(".", "|", y, fixed = TRUE)
    writeLines(z, x2)
    fread(x2, sep = "|", header = FALSE)
    #      V1  V2 V3 V4
    #  1: 192 168  1  1
    #  2: 192 168  1  2
    #  3: 192 168  1  3
    #  4: 192 168  1  4
    #  5: 192 168  1  5
    #  6: 192 168  1  6
    #  7: 192 168  1  7
    #  8: 192 168  1  8
    #  9: 192 168  1  9
    # 10: 192 168  1 10
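One way to make this workaround a little less risky might be to check the data first and only substitute a character that does not occur in it. The sketch below uses base R only; `pick_safe_sep` and its list of candidate delimiters are hypothetical names made up for illustration, not part of "data.table" or "splitstackshape".

```r
# Hypothetical helper (not from data.table or splitstackshape):
# return the first candidate delimiter that never occurs in x.
pick_safe_sep <- function(x, candidates = c("|", "\t", ";", "~", "^")) {
  for (cand in candidates) {
    # fixed = TRUE treats the candidate as a literal string, not a regex
    if (!any(grepl(cand, x, fixed = TRUE))) return(cand)
  }
  stop("None of the candidate delimiters is absent from the data")
}

y <- paste("192.168.1.", 1:10, sep = "")

# Choose a replacement that is guaranteed not to collide with the data
safe <- pick_safe_sep(y)

# Swap the literal "." for the chosen delimiter and write the result out
z <- gsub(".", safe, y, fixed = TRUE)
x2 <- tempfile()
writeLines(z, x2)
```

The file at `x2` could then be read with `fread(x2, sep = safe, header = FALSE)`. This still costs an extra pass over the data, and it fails outright if every candidate is already present, so it narrows the risk rather than removing it.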
For the purposes of this question, assume the data is balanced (each line has the same number of "sep" characters). I know that using "." as a separator is not a good idea, but I am trying to account for what other users may have in their datasets, based on other questions I have answered here on SO.
r data.table fread splitstackshape
A5C1D2H2I1M1N2O1R2T1