Extract matching words from a string

I have a data frame; a short version is below:

 structure(list(sex1 = c("totalmaleglobal", "totalfemaleglobal", "totalglobal",
                         "totalfemaleGSK", "totalfemaleglobal", "totalfemaleUN")),
           .Names = "sex1", row.names = c(NA, 6L), class = "data.frame")

I want to extract the words "total", "totalmale", and "totalfemale".

How can I do this?

I tried regex with the following code:

 pattern1 = c("total")
 pattern2 = c("totalmale")
 pattern3 = c("totalfemale")
 daly$sex <- str_extract(daly$sex1, pattern1)
 daly$sex <- str_extract(daly$sex1, pattern2)
 daly$sex <- str_extract(daly$sex1, pattern3)

But that gives me NA.

4 answers

You can try:

 library(stringr)
 daly$sex <- str_extract(daly$sex1,
                         paste(rev(mget(ls(pattern = "pattern\\d+"))), collapse = "|"))
 daly
 #                sex1         sex
 # 1   totalmaleglobal   totalmale
 # 2 totalfemaleglobal totalfemale
 # 3       totalglobal       total
 # 4    totalfemaleGSK totalfemale
 # 5 totalfemaleglobal totalfemale
 # 6     totalfemaleUN totalfemale
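
The answer above depends on the order of the alternatives: rev() puts "totalfemale" and "totalmale" ahead of "total", so the longer words are tried first. Here is a minimal base R sketch of the same idea, with the alternation written out by hand instead of collected with mget(). The names sex1, pattern and sex are chosen for illustration only, and the sketch assumes every value contains "total", as in the sample data.

```r
# Sample data from the question
sex1 <- c("totalmaleglobal", "totalfemaleglobal", "totalglobal",
          "totalfemaleGSK", "totalfemaleglobal", "totalfemaleUN")

# Longer alternatives first; with perl = TRUE they are tried left to right,
# so "total" is only used when neither longer word matches
pattern <- "totalfemale|totalmale|total"

# regexpr() locates the first match in each string and regmatches() extracts it.
# Strings without a match would be dropped from the result, which is why this
# sketch assumes every value contains "total".
sex <- regmatches(sex1, regexpr(pattern, sex1, perl = TRUE))
print(sex)
```

This should give the same values as the sex column shown above. If some values might not match at all, str_extract() is the safer choice, since it returns NA for them instead of shortening the result.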

Two steps with gsub:

 v2 <- gsub(paste(v1, collapse='|'), '', d1$sex1)
 gsub(paste(v2, collapse='|'), '', d1$sex1)
 #[1] "totalmale"   "totalfemale" "total"       "totalfemale" "totalfemale" "totalfemale"

Where

 v1 <- c('total', 'totalmale', 'totalfemale') 

Try the following:

 test = structure(list(sex1 = c("totalmaleglobal", "totalfemaleglobal", "totalglobal",
                                "totalfemaleGSK", "totalfemaleglobal", "totalfemaleUN")),
                  .Names = "sex1", row.names = c(NA, 6L), class = "data.frame")
 total = grep("total", test[[1]], perl=TRUE, value=TRUE)
 totalmale = grep("totalmale", test[[1]], perl=TRUE, value=TRUE)
 totalfemale = grep("totalfemale", test[[1]], perl=TRUE, value=TRUE)
 print(total)
 print(totalmale)
 print(totalfemale)

We could also use sapply and grepl (both in base R) with the desired patterns (the s1 vector) as follows:

 x <- sapply(s1, function(x) grepl(x, d1$sex1))
 colnames(x)[max.col(x, ties.method = "first")]
 # [1] "totalmale"   "totalfemale" "total"       "totalfemale" "totalfemale" "totalfemale"

Where

 s1 <- c("totalmale", "totalfemale", "total") 