How to search for the value of one column in other columns of a data frame, row by row

I have a table, call it df, with 3 columns: the first is the product name, the second is the product description, and the third is a one-word string. I need to create 2 new columns across the entire table (call them exists_in_title and exists_in_description) that hold either 1 or 0, indicating whether the 3rd column's value occurs in the 1st or 2nd column. The comparison is strictly 1:1 within each row. For example, calling row 1 'A', I need to check whether cell A3 occurs in A1 and use the result to fill the exists_in_title column, then check whether A3 occurs in A2 and use the result to fill the exists_in_description column. Then move on to row B and do the same. I have thousands of rows, so writing a separate call for each row is unrealistic; I need a function or method that runs over every row of the table in one shot.

I have played with grepl, pmatch, and str_count, but none of them seems to do what I need. I think grepl is probably the closest; here are two lines of code that I expected to do what I want, but that didn't work:

df$exists_in_title <- grepl(df$A3, df$A1)

df$exists_in_description <- grepl(df$A3, df$A2)

However, when I run them, I get the following warning, which makes me think it is not working properly: "argument 'pattern' has length > 1 and only the first element will be used"

Any help on how to do this would be greatly appreciated. Thanks!

1 answer

grepl is not vectorized over its pattern argument, but it will work row by row with mapply:

Sample data frame:

title <- c('eggs and bacon','sausage biscuit','pancakes')
description <- c('scrambled eggs and thickcut bacon','homemade biscuit with breakfast pattie', 'stack of sourdough pancakes')
keyword <- c('bacon','sausage','sourdough')
df <- data.frame(title, description, keyword, stringsAsFactors=FALSE)

Search for matches using grepl:

df$exists_in_title <- mapply(grepl, pattern=df$keyword, x=df$title)
df$exists_in_description <- mapply(grepl, pattern=df$keyword, x=df$description)

And the results:

            title                            description   keyword exists_in_title exists_in_description
1  eggs and bacon      scrambled eggs and thickcut bacon     bacon            TRUE                  TRUE
2 sausage biscuit homemade biscuit with breakfast pattie   sausage            TRUE                 FALSE
3        pancakes            stack of sourdough pancakes sourdough           FALSE                  TRUE
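
The question asks for 1 or 0 rather than TRUE or FALSE, and describes the third column as a plain word rather than a regular expression. A minimal sketch under those assumptions, reusing the sample data: `exists_in` is a hypothetical helper name (not part of the original answer), `fixed = TRUE` is passed through `MoreArgs` so the keyword is matched literally, and `as.integer` turns the logical result into 1/0.

```r
title <- c('eggs and bacon', 'sausage biscuit', 'pancakes')
description <- c('scrambled eggs and thickcut bacon',
                 'homemade biscuit with breakfast pattie',
                 'stack of sourdough pancakes')
keyword <- c('bacon', 'sausage', 'sourdough')
df <- data.frame(title, description, keyword, stringsAsFactors = FALSE)

# Hypothetical helper: match pattern[i] against x[i] literally, return 1/0
exists_in <- function(pattern, x) {
  as.integer(mapply(grepl, pattern = pattern, x = x,
                    MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE))
}

df$exists_in_title <- exists_in(df$keyword, df$title)
df$exists_in_description <- exists_in(df$keyword, df$description)
```

If the keywords are meant to be regular expressions after all, dropping the `MoreArgs` argument restores the default regex matching.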

Update I

With dplyr, using either grepl or stringr's str_detect:

library(dplyr)
df %>% 
  rowwise() %>% 
  mutate(exists_in_title = grepl(keyword, title),
         exists_in_description = grepl(keyword, description))

library(stringr)
df %>% 
  rowwise() %>% 
  mutate(exists_in_title = str_detect(title, keyword),
         exists_in_description = str_detect(description, keyword))   
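
As a side note, str_detect is vectorized over both the string and the pattern, so element i of one is compared with element i of the other. That suggests the rowwise() step may be unnecessary for the stringr variant. A minimal sketch on the same sample data, using plain column assignment instead of a dplyr pipeline:

```r
library(stringr)

title <- c('eggs and bacon', 'sausage biscuit', 'pancakes')
description <- c('scrambled eggs and thickcut bacon',
                 'homemade biscuit with breakfast pattie',
                 'stack of sourdough pancakes')
keyword <- c('bacon', 'sausage', 'sourdough')
df <- data.frame(title, description, keyword, stringsAsFactors = FALSE)

# str_detect pairs string[i] with pattern[i], so no explicit row loop is needed
df$exists_in_title <- str_detect(df$title, df$keyword)
df$exists_in_description <- str_detect(df$description, df$keyword)
```

This does not carry over to grepl, which uses only the first pattern it is given.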

Update II

With Map, or, in a more tidyverse style, with purrr and stringr:

library(tidyverse)
df %>%
  mutate(exists_in_title = unlist(Map(function(x, y) grepl(x, y), keyword, title))) %>% 
  mutate(exists_in_description = map2_lgl(description, keyword,  str_detect))