Fread (data.table in R) with encoding

I could not find an answer to my problem in previous questions and answers. I have a 2.3 GB CSV file containing 2.4 million lines of Hebrew text that is currently encoded in ASCII. Because the file is so large, fread would be preferable, but how should the encoding be handled? Any idea how to read the ASCII-encoded CSV file with fread while avoiding the well-known "embedded zero in line" error?

thanks

1 answer

As of August 25, the issue linked by David Arenburg has been closed, and the functionality is included in the currently available version of data.table. The encoding parameter can now be used when calling fread:

library(data.table)
text <- fread(file, encoding = 'UTF-8')  # file: path to the CSV

fread has no explicit ASCII option for encoding (the accepted values are "unknown", "UTF-8" and "Latin-1"), but every ASCII file is also valid UTF-8, so you can specify UTF-8 when you want to read the Hebrew text.
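To check the behaviour on something small before trying the 2.3 GB file, here is a minimal sketch. The sample data, the temporary file and the column names (id, word) are made up for illustration and are not from the question. It writes a tiny UTF-8 CSV with one Hebrew word and reads it back with fread:

```r
library(data.table)

# Hypothetical sample data: one Hebrew word, written with \u escapes so the
# script itself stays ASCII-only.
hebrew <- "\u05e9\u05dc\u05d5\u05dd"

# Write a tiny UTF-8 encoded CSV to a temporary file, byte for byte.
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, open = "wb")
writeLines(enc2utf8(c("id,word", paste0("1,", hebrew))), con, useBytes = TRUE)
close(con)

# Read it back, asking fread to mark the strings as UTF-8.
dt <- fread(tmp, encoding = "UTF-8")

print(Encoding(dt$word))   # the encoding mark on the column that was read
print(dt$word == hebrew)   # whether the text survived the round trip
```

As far as I know, the encoding argument only marks the strings that are read as UTF-8 (or Latin-1); it does not re-encode the input. So a file that is actually stored in some other encoding would probably need to be converted to UTF-8 first.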
