Handling special characters, e.g. accents, in R

I am scraping some names from the web into a data frame.

For a name such as "Tomáš Rosický", I get the result "TomÃ¡Å¡ RosickÃ½".

I tried

Encoding("Tomáš Rosický") # returns "latin1"

but was not sure where to go from there to get back the original name with its accents. I played with iconv without success.

I would be satisfied with (and might even prefer) an output of "Tomas Rosicky".

+7
4 answers

The page you read is encoded in UTF-8. If x is your name column, use Encoding(x) <- "UTF-8".
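A minimal sketch of what this does, assuming the scraped bytes are valid UTF-8 that R has merely labelled as latin1. The string is built from raw bytes purely for illustration, so the example does not depend on the script's own encoding:

```r
# UTF-8 bytes for "Tomáš Rosický" (16 bytes, 13 characters).
bytes <- as.raw(c(0x54, 0x6f, 0x6d, 0xc3, 0xa1, 0xc5, 0xa1, 0x20,
                  0x52, 0x6f, 0x73, 0x69, 0x63, 0x6b, 0xc3, 0xbd))
x <- rawToChar(bytes)

# Simulate the situation in the question: the bytes carry a latin1 label,
# so every byte is shown as its own character.
Encoding(x) <- "latin1"

# Re-declare the encoding. The bytes stay the same; only the label changes.
Encoding(x) <- "UTF-8"

n_bytes <- nchar(x, type = "bytes")  # 16
n_chars <- nchar(x, type = "chars")  # 13
```

Note that Encoding<- never converts anything. If the bytes really were Latin-1, relabelling them as UTF-8 would produce an invalid string instead of fixing it.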

+6

To read the file correctly, use the scan function:

 namb <- scan(file='g:/testcodering.txt', fileEncoding='UTF-8',
              what=character(), sep='\n', allowEscapes=TRUE)
 cat(namb)

This also works:

 con <- file('g:/testcodering.txt', "r", encoding='UTF-8')
 namc <- readLines(con)
 close(con)
 cat(namc)

This will read the file with the correct accents.
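For anyone who wants to try this without the file above, here is a self-contained sketch that writes a temporary UTF-8 file and reads it back. The sample name and the temp file are my own assumptions. It also uses the encoding= argument, which only marks the strings as UTF-8, rather than fileEncoding= or file(encoding=), which re-encode the input into the native encoding:

```r
# Hypothetical sample data: one accented name, written out as UTF-8 bytes.
name <- "Tom\u00e1\u0161 Rosick\u00fd"
path <- tempfile(fileext = ".txt")
writeLines(name, path, useBytes = TRUE)

# Variant 1: readLines() with encoding= marks each line as UTF-8.
via_readlines <- readLines(path, encoding = "UTF-8")

# Variant 2: scan() with encoding= does the same, one element per line.
via_scan <- scan(path, what = character(), sep = "\n",
                 encoding = "UTF-8", quiet = TRUE)

unlink(path)  # remove the temporary file
```

Marking is probably the safer choice when the session's native encoding cannot represent every character in the file.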

+2

A way to export accents properly is to convert the column to UTF-8 first:

 enc2utf8(as(dataframe$columnname, "character")) 
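A small sketch of the effect, assuming a hypothetical data frame whose name column holds a Latin-1 string. "façile" is used only because all of its characters exist in Latin-1:

```r
# Build a Latin-1 string from raw bytes: f a ç i l e, with ç as byte e7.
x <- rawToChar(as.raw(c(0x66, 0x61, 0xe7, 0x69, 0x6c, 0x65)))
Encoding(x) <- "latin1"
df <- data.frame(name = x, stringsAsFactors = FALSE)

# Convert the column to UTF-8; as.character() also covers factor columns.
df$name <- enc2utf8(as.character(df$name))

enc_after <- Encoding(df$name)     # "UTF-8"
bytes_after <- charToRaw(df$name)  # ç is now the two bytes c3 a7
```

Unlike Encoding<-, enc2utf8 actually rewrites the bytes, so it relies on the input carrying the correct encoding label to begin with.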
+2

To convert a UTF-8 column to Latin-1, use iconv:

 df$colname <- iconv(df$colname, from="UTF-8", to="LATIN1") 
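A hedged sketch of what the conversion does, using made-up strings rather than the asker's data. One caveat: š (U+0161) is not part of Latin-1, so a name like the one in the question cannot be fully converted this way, and what iconv returns for it depends on the platform:

```r
# A UTF-8 string whose characters all exist in Latin-1.
utf8_word <- "fa\u00e7ile"
latin1_word <- iconv(utf8_word, from = "UTF-8", to = "latin1")
latin1_bytes <- charToRaw(latin1_word)  # ç is now the single byte e7

# š has no Latin-1 code point; the result here is platform-dependent
# (often NA), so check for missing values after converting.
no_equivalent <- iconv("Tom\u00e1\u0161", from = "UTF-8", to = "latin1")
lost <- is.na(no_equivalent)
```

If an accent-free result is acceptable, to = "ASCII//TRANSLIT" may be worth trying, though its output also varies between iconv implementations.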
+1
