I have two vectors in R. I want to find partial matches between them.
My data
The first data set is called muc and contains 6,400 street names. muc$name looks like this:
muc$name = c("Berberichweg", "Otto-Klemperer-Weg", "Feldmeierbogen" , "Altostraße",...)
The other vector is d_vector. It contains about 1,400 names.
d_vector = c("Abel", "Abendroth", "von Abercron", "Abetz", "Abicht", "Abromeit", ...)
I want to find all street names that contain a name from d_vector anywhere in the street name.
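To make the goal concrete, here is a minimal sketch of the result I am after. The toy vectors `streets` and `surnames` are made up for illustration and are not taken from my real data; matching here is literal and case-sensitive.

```r
# Toy data (hypothetical, not from the real muc or d data sets)
streets  <- c("Abelweg", "Feldmeierbogen", "Von-Abercron-Weg", "Berberichweg")
surnames <- c("Abel", "Abercron", "Abetz")

# One column per surname: does it occur literally anywhere in each street name?
hits <- vapply(surnames,
               function(n) grepl(n, streets, fixed = TRUE),
               logical(length(streets)))

# Keep the streets that contain at least one surname
wanted <- streets[rowSums(hits) > 0]
print(wanted)
```

Keeping the full `hits` matrix would also tell me which name matched which street, not only whether there was a match.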
First, I made some general adjustments after importing the CSV data (into a variable d):
d_vector <- unlist(d$name)
d_vector <- as.vector(as.matrix(d_vector))
What I tried so far
- I tried to find a solution with grep by collapsing d_vector into one long string, separated by |, for a regex search:
result <- unique(grep(paste(d_vector, collapse="|"), muc$name, value=TRUE, ignore.case = TRUE))
result
But the result returns all street names.
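One possible cause that I have not ruled out (this is an assumption, since I have not checked the CSV for it) is an empty string somewhere in d_vector. As far as I can tell, an empty alternative in the combined pattern matches every string. A small sketch with made-up data:

```r
# Toy data (hypothetical); note the empty entry in surnames
streets  <- c("Berberichweg", "Feldmeierbogen", "Abelweg")
surnames <- c("Abel", "", "Abetz")

# The collapsed pattern is "Abel||Abetz"; the empty branch matches anything
pattern_bad <- paste(surnames, collapse = "|")
all_hit <- grep(pattern_bad, streets, value = TRUE, ignore.case = TRUE)

# Drop NA and empty/whitespace-only entries before building the pattern
clean <- surnames[!is.na(surnames) & nzchar(trimws(surnames))]
pattern_ok <- paste(clean, collapse = "|")
some_hit <- grep(pattern_ok, streets, value = TRUE, ignore.case = TRUE)

print(all_hit)
print(some_hit)
```

If the real names contain regex metacharacters such as "." or "(", they would also need escaping before being pasted into a single pattern.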
I also tried using agrep, which resulted in an out-of-memory error.
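For reference, this is roughly the kind of fuzzy call I mean, sketched for a single name on made-up data rather than with the whole collapsed pattern. The street names and the max.distance value of 2 are assumptions chosen for illustration.

```r
# Toy data (hypothetical); the second street has two letters swapped
streets <- c("Abercronweg", "Abercornstrasse", "Feldmeierbogen")

# Approximate match of one name against all streets,
# allowing at most 2 edits (insertions, deletions, substitutions)
fuzzy <- agrepl("Abercron", streets, max.distance = 2L, ignore.case = TRUE)

print(streets[fuzzy])
```

Looping over the names one at a time like this keeps each pattern short, although I do not know whether it would be fast enough for about 1,400 names against 6,400 streets.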
I also tried d_vector %in% muc$name, but that only returns TRUE or FALSE.
- Is there something like Python's "fuzzywuzzy" for R?