The next problem: I have two data frames, and I want to map a vector in data1 onto a vector in data2.
data1 <- data.frame(v1 = c("horse", "duck", "bird"), v2 = c(1,2,3))
data2 <- data.frame(v1 = c("car, horse, mouse", "duck, bird", "bird"))
If a character string in data2 matches a value in data1$v1, it should be replaced with the corresponding v2 value from data1. My current solution uses a for loop and gives the desired result:
for (i in seq_len(nrow(data1))) data2[, 1] <- gsub(data1[i, 1], data1[i, 2], data2[, 1], fixed = TRUE)
data2
However, is there a vectorized solution that avoids the for loop and performs better on huge data sets?
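One direction I have sketched so far, in case it helps the discussion: split each string into tokens, look all tokens up with a single match() call, and paste the rows back together. This is only a sketch under two assumptions that are mine: the separator is always ", ", and only whole tokens should be replaced (gsub with fixed = TRUE would also replace substrings, so "birdcage" would behave differently). The helper names parts, flat, idx, hit and row_id are just illustrative.

```r
# Example data from the question (stringsAsFactors = FALSE keeps v1 as character)
data1 <- data.frame(v1 = c("horse", "duck", "bird"), v2 = c(1, 2, 3),
                    stringsAsFactors = FALSE)
data2 <- data.frame(v1 = c("car, horse, mouse", "duck, bird", "bird"),
                    stringsAsFactors = FALSE)

# Split every row into its tokens (assumes ", " is always the separator)
parts <- strsplit(data2$v1, ", ", fixed = TRUE)
flat  <- unlist(parts)

# Look up all tokens in data1$v1 at once; tokens without a match give NA
idx <- match(flat, data1$v1)
hit <- !is.na(idx)
flat[hit] <- as.character(data1$v2[idx[hit]])

# Reassemble the tokens row by row
row_id <- factor(rep(seq_along(parts), lengths(parts)),
                 levels = seq_along(parts))
data2$v1 <- unname(vapply(split(flat, row_id), paste, character(1),
                          collapse = ", "))
data2
```

The lookup itself is a single vectorized match() over all tokens, so the work no longer grows with one gsub pass per row of data1. I have not benchmarked it on large data.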
Thanks in advance!
- Updated:
What happens when the two data frames do not have the same number of rows?
data2 <- data.frame(v1 = c("car, horse, mouse", "duck, bird","bird", "bird"))
When I use this solution:
data2$v1 <- mapply(sub, data1$v1, data1$v2, data2$v1)
Then I get the following warning message:
1: mapply(sub, data1$v1, data1$v2, data2$v1):
2: mapply(sub, data1$v1, data1$v2, data2$v1):
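As far as I understand, mapply walks over its arguments in parallel and recycles the shorter ones, so it pairs row i of data1 with row i of data2 instead of applying every pattern to every row. With 3 rows against 4 rows the recycling does not come out evenly, which would explain the warning. A minimal sketch that does not depend on the row counts, assuming substring replacement as in my loop, is to fold over the rows of data1 with Reduce (replace_one is a hypothetical helper name):

```r
# Example data with different row counts (3 patterns, 4 strings)
data1 <- data.frame(v1 = c("horse", "duck", "bird"), v2 = c(1, 2, 3),
                    stringsAsFactors = FALSE)
data2 <- data.frame(v1 = c("car, horse, mouse", "duck, bird", "bird", "bird"),
                    stringsAsFactors = FALSE)

# Apply the i-th pattern/replacement pair to the whole character vector
replace_one <- function(txt, i) {
  gsub(data1$v1[i], as.character(data1$v2[i]), txt, fixed = TRUE)
}

# Fold over the rows of data1, starting from the original strings
data2$v1 <- Reduce(replace_one, seq_len(nrow(data1)), data2$v1)
data2
```

This is still one gsub call per row of data1, so it is the same amount of work as the for loop, only without the pairing problem of mapply.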