Extracting the characters after the euro symbol in R

I have a euro symbol stored in the "euro" variable:

 euro <- "\u20AC"
 euro
 #[1] "€"

And the "eurosearch" variable contains the following string:

 eurosearch
 #[1] "services as defined in this SOW at a price of € 15,896.80 (if executed fro"

I want everything after the euro symbol, i.e. "15,896.80 (if executed fro". I use this code:

 gsub("^.*[euro]","",eurosearch) 

But I get an empty result. How can I get the expected result?

2 answers

You can use a variable in the pattern by simply concatenating strings with paste0:

 euro <- "€"
 eurosearch <- "services as defined in this SOW at a price of € 15,896.80 (if executed fro"
 sub(paste0("^.*", gsub("([^A-Za-z_0-9])", "\\\\\\1", euro), "\\s*(\\S+).*"), "\\1", eurosearch)

 euro <- "$"
 eurosearch <- "services as defined in this SOW at a price of $ 25,196.4 (if executed fro"
 sub(paste0("^.*", gsub("([^A-Za-z_0-9])", "\\\\\\1", euro), "\\s*(\\S+).*"), "\\1", eurosearch)


Please note that gsub("([^A-Za-z_0-9])", "\\\\\\1", euro) escapes every non-word character with a backslash, so $ is treated as a literal character rather than as a special regular expression metacharacter (taken from this SO post).
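As a minimal sketch of why the attempt in the question returns an empty string, and of how the concatenation approach can return everything after the symbol (which is what the question asks for): inside a pattern, [euro] is a character class matching any one of e, u, r or o, and it never refers to the variable euro. The names escape_regex and after_symbol are hypothetical, introduced only for illustration.

```r
euro <- "\u20AC"
eurosearch <- "services as defined in this SOW at a price of \u20AC 15,896.80 (if executed fro"

# "[euro]" matches one of the letters e, u, r, o. The greedy "^.*" runs up to
# the last such letter (the final "o"), so the whole string is removed.
failed <- gsub("^.*[euro]", "", eurosearch)

# Hypothetical helper: put a backslash before every non-word character so that
# symbols such as "$" are matched literally.
escape_regex <- function(s) gsub("([^A-Za-z_0-9])", "\\\\\\1", s)

# Drop everything up to and including the symbol and the whitespace after it.
after_symbol <- sub(paste0("^.*", escape_regex(euro), "\\s*"), "", eurosearch)
after_symbol
```

Unlike the sub() calls above, this variant keeps the rest of the string instead of only the first token after the symbol.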


Use regmatches from base R, or str_extract from stringr.

 > x <- "services as defined in this SOW at a price of € 15,896.80 (if executed fro"
 > regmatches(x, regexpr("(?<=€ )\\S+", x, perl=TRUE))
 [1] "15,896.80"

or

 > gsub("€ (\\S+)|.", "\\1", x)
 [1] "15,896.80"
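The gsub call may look cryptic, so here is a commented sketch of how the alternation works. The variable name amount is an assumption made for illustration.

```r
x <- "services as defined in this SOW at a price of \u20AC 15,896.80 (if executed fro"

# The pattern has two alternatives:
#   1. the euro sign, a space, then a run of non-space characters captured
#      in group 1;
#   2. any single character, in which case group 1 stays empty.
# Every match is replaced by "\\1", so the captured amount survives and
# every other character is replaced by nothing.
amount <- gsub("\u20AC (\\S+)|.", "\\1", x)
amount
```

Note that this keeps only the first whitespace-delimited token after the symbol, not the remainder of the string.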

or, with the symbol stored in a variable:

 euro <- "\u20AC"
 gsub(paste(euro , "(\\S+)|."), "\\1", x)

If the version using a variable does not work for you, you need to set the encoding:

 gsub(paste(euro , "(\\S+)|."), "\\1", `Encoding<-`(x, "UTF-8"))
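A hedged sketch of checking the encoding before matching: Encoding() reports how a string is declared, and `Encoding<-` only changes that declaration without converting any bytes. Where an actual conversion is wanted, base R's enc2utf8() is one option. It is swapped in here as an alternative and is not part of the answer above. The names x_utf8 and amount are illustrative.

```r
euro <- "\u20AC"
x <- "services as defined in this SOW at a price of \u20AC 15,896.80 (if executed fro"

# Inspect the declared encoding of the input string.
declared <- Encoding(x)

# Convert to UTF-8 (a no-op if the string is already UTF-8), then match.
x_utf8 <- enc2utf8(x)

# paste() joins with a single space by default, giving the same pattern as
# the literal version: euro sign, space, captured token, or any character.
amount <- gsub(paste(euro, "(\\S+)|."), "\\1", x_utf8)
amount
```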
