How to use the sub() function in R

I am reading a CSV file into a data frame called dopers in R.

dopers <- read.csv(file = "generalDoping_alldata2.csv", header = TRUE, sep = ",")

After reading the file, I have to do some data cleaning. For example, if a value in the Country column says

" United States" or "United States "

I would like to replace it with "USA".

I want the code to work even if the value is " United States " or "United State ". That is, even if there are extra characters before or after "United States", it should be replaced with "USA". I understand that the sub() function can be used for this. I looked online and found the line below, but I don't understand what the symbols "^", "$", "*" and "." are doing. Can someone explain?

dopers$Country = sub("^UNITED STATES.*$", "USA", dopers$Country)
1 answer

Suppose your data looks like this:

s <- c(" United States", " United States ", "United States ")

Then you can use the pattern:

pat <- "^.*United State.*$"

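Here ^ matches the start of the string and $ the end, . matches any single character, and * repeats the preceding element zero or more times (so .* matches any run of characters). You can also experiment with other patterns, for example: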

pat <- "^[ ]*United States?[ ]*$" # ignores only surrounding spaces
pat <- "^.*(United State|USA).*$" # also matches "  USA" etc.

gsub(pat, "USA", s)
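
To see the pieces working together, here is a minimal sketch. The `country` vector is made-up sample data standing in for `dopers$Country`, and `ignore.case = TRUE` is an addition (not part of the pattern above) so that the upper-case spelling from the question is caught as well.

```r
# Hypothetical sample values standing in for dopers$Country
country <- c(" United States", "United States ", "*United State*",
             "UNITED STATES", "Canada")

# ^ = start of string, .* = any run of characters, $ = end of string.
# ignore.case = TRUE (an assumption added here) also matches "UNITED STATES".
pat <- "^.*United State.*$"
cleaned <- sub(pat, "USA", country, ignore.case = TRUE)
print(cleaned)

# Each metacharacter on its own
grepl("^United", c("United States", " United States"))  # TRUE FALSE
grepl("States$", c("United States", "United States "))  # TRUE FALSE
grepl("Unit.d", "United")                               # TRUE
grepl("ab*c", c("ac", "abc", "abbbc"))                  # TRUE TRUE TRUE
```

Because the pattern is anchored with ^ and $ and padded with .*, the whole string is replaced, so sub() and gsub() give the same result here. Note that the leading and trailing .* will also swallow longer values that merely contain "United State", so the stricter space-only pattern may be the safer choice for real data.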