R: how to remove VERY special characters in strings?

I am trying to remove some VERY special characters from my strings. I read another post like:

but that is not what I am looking for.

Let's say my string is as follows:

s = "who are í ½í¸€ bringing?"

I tried the following:

test = tm_map(s, function(x) iconv(enc2utf8(x), sub = "byte"))
test = iconv(s, 'UTF-8', 'ASCII')

Neither of these worked.

EDIT: I'm looking for a GENERAL solution! I cannot (and would prefer not to) manually identify all the special characters.

Also, these VERY special characters MAY (not 100% sure) be the result of emoticons.

Please help or point me to the relevant posts. Thanks!

1 answer

One option is to remove the characters with gsub and a regular expression:

> s = "who are í ½í¸€ bringing?"
> rmSpec <- "í|½|€" # The "|" designates a logical OR in regular expressions.
> s.rem <- gsub(rmSpec, "", s) # gsub replaces every match of rmSpec with "".
> s.rem
[1] "who are  ¸ bringing?"

Note that this requires you to list the special characters manually in rmSpec.
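
To avoid listing characters by hand in rmSpec, a negated character class can match everything outside the ASCII range instead. This is only a minimal sketch, assuming that plain ASCII is all you want to keep; the variable name s.ascii is made up for illustration, and the string is written with \u escapes so the example does not depend on the file encoding.

```r
# Same example string as above, written with \u escapes (encoding-safe).
s <- "who are \u00ed \u00bd\u00ed\u00b8\u20ac bringing?"

# Remove every character that is NOT in the ASCII range 0x01-0x7F.
# Assumption: anything outside ASCII counts as "special" and can be dropped.
s.ascii <- gsub("[^\\x01-\\x7F]", "", s, perl = TRUE)
print(s.ascii)
```

Accented letters such as "é" would be dropped too, so this only fits text where losing them is acceptable.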

EDIT:

It looks like you were close with iconv; you only needed the sub argument:

> s
[1] "who are í ½í¸€ bringing?"
> s2 <- iconv(s, "UTF-8", "ASCII", sub = "")
> s2
[1] "who are   bringing?"
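
The iconv call leaves extra spaces where the characters used to be. A small wrapper can tidy those up as well. The sketch below is an assumption-laden illustration, not part of the original answer: strip_non_ascii is a hypothetical helper name, and it assumes the input is UTF-8 and that only ASCII should survive.

```r
# Hypothetical helper: keep ASCII only, then tidy the leftover whitespace.
strip_non_ascii <- function(x) {
  # Drop every character that cannot be represented in ASCII.
  x <- iconv(x, from = "UTF-8", to = "ASCII", sub = "")
  # Collapse runs of whitespace left behind by the removed characters.
  x <- gsub("\\s+", " ", x)
  # Remove leading and trailing whitespace.
  trimws(x)
}

s <- "who are \u00ed \u00bd\u00ed\u00b8\u20ac bringing?"
cleaned <- strip_non_ascii(s)
print(cleaned)
```

Because iconv is vectorised, the same helper can be applied to a whole character vector, e.g. strip_non_ascii(c(s, "plain text")).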