Find matching strings between two vectors in R

I have two vectors in R. I want to find partial matches between them.

My data

The first data set is called muc and contains 6,400 street names. muc$name looks like this:

muc$name = c("Berberichweg", "Otto-Klemperer-Weg", "Feldmeierbogen" , "Altostraße",...)

The other vector is d_vector. It contains about 1,400 names.

d_vector = c("Abel", "Abendroth", "von Abercron", "Abetz", "Abicht", "Abromeit", ...)

I want to find all street names that contain a name from d_vector somewhere in the street name.

First, I did some general cleanup after importing the csv data (into a variable d):

d_vector <- unlist(d$name)
d_vector <- as.vector(as.matrix(d_vector))

What I tried so far

  • I tried grep, collapsing d_vector into one long string separated by | to use as a regex pattern:

result <- unique(grep(paste(d_vector, collapse="|"), muc$Name, value=TRUE, ignore.case = TRUE))
result

But the result returns all street names.

  • I also tried agrep, which resulted in an out-of-memory error.

  • I also tried d_vector %in% muc$name, which only returns TRUE or FALSE.

Is there a better way to do this? Is there something like Python's "fuzzywuzzy" for R?


You could do something like this:

streets = c("Berberichweg", "Otto-Klemperer-Weg", "Feldmeierbogen" , "Altostraße")
streets = tolower(streets) #Lowercase all
names = c("Berber", "Weg")
names = tolower(names)

sapply(names, function (y) sapply(streets, function (x) grepl(y, x)))

#                   berber   weg
#berberichweg        TRUE  TRUE
#otto-klemperer-weg  FALSE TRUE
#feldmeierbogen      FALSE FALSE
#altostraße          FALSE FALSE
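
To get the street names themselves rather than the TRUE/FALSE matrix, you can keep every row that has at least one TRUE. A minimal sketch, assuming the same toy data as above; the variable names name_parts, hits and matched_streets are mine, and fixed = TRUE is my addition so that each name is treated as a literal string rather than a regex:

```r
streets <- tolower(c("Berberichweg", "Otto-Klemperer-Weg", "Feldmeierbogen", "Altostraße"))
name_parts <- tolower(c("Berber", "Weg"))

# One row per street, one column per name.
# fixed = TRUE matches each name literally instead of as a regex.
hits <- sapply(name_parts, function(p) grepl(p, streets, fixed = TRUE))
rownames(hits) <- streets

# Keep the streets that matched at least one name.
matched_streets <- streets[rowSums(hits) > 0]
matched_streets
```

With 6,400 streets and 1,400 names the matrix has roughly 9 million logical cells, which should still fit in memory on most machines.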

Your grep approach works for me on a small example:

streets = c("Berberichweg", "Otto-Klemperer-Weg", "Feldmeierbogen", 
            "Konrad-Adenauer-Platz", "anotherThing")
patterns = c("weg", "platz")

unique(grep(paste(patterns, collapse="|"), streets, value=TRUE, ignore.case = TRUE))
[1] "Berberichweg"          "Otto-Klemperer-Weg"    "Konrad-Adenauer-Platz"

So the problem is probably in d_vector itself. Check the output of class(d_vector) and dput(d_vector).
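
One thing worth checking, as an assumption since the question does not show the full d_vector: an empty string in d_vector leaves an empty alternative in the pasted pattern, and that can match every street name. A minimal sketch of cleaning the vector before building the pattern (the sample values below are made up for illustration):

```r
# Hypothetical d_vector with entries that can break the combined pattern
d_vector <- c("Abel", "", NA, " Abetz ", "Abel")

# Trim whitespace, then drop NA, empty strings and duplicates.
clean <- trimws(d_vector)
clean <- unique(clean[!is.na(clean) & nzchar(clean)])

# Hypothetical street names for the demonstration
streets <- c("Abelweg", "Feldmeierbogen", "Abetzplatz")

pattern <- paste(clean, collapse = "|")
result <- grep(pattern, streets, value = TRUE, ignore.case = TRUE)
result
```

If the names can contain regex metacharacters such as "." or "(", they would also need escaping before being pasted into one pattern.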

With sapply you can also see which pattern matched which streets:

matches = sapply(patterns, function(p) grep(p, streets, value=TRUE, ignore.case = TRUE))
# $weg
# [1] "Berberichweg"       "Otto-Klemperer-Weg"
# 
# $platz
# [1] "Konrad-Adenauer-Platz"

unique(unlist(matches))
# [1] "Berberichweg"          "Otto-Klemperer-Weg"    "Konrad-Adenauer-Platz"
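
One caveat with sapply is that it simplifies its result when every pattern happens to match the same number of streets, so you do not always get a list back. A sketch of the same idea with lapply, which always returns a list; the extra pattern "allee" is made up here to show how patterns without any match can be dropped:

```r
streets <- c("Berberichweg", "Otto-Klemperer-Weg", "Feldmeierbogen",
             "Konrad-Adenauer-Platz", "anotherThing")
patterns <- c("weg", "platz", "allee")

# lapply always returns a list, one element per pattern.
matches <- lapply(patterns, function(p) {
  grep(p, streets, value = TRUE, ignore.case = TRUE)
})
names(matches) <- patterns

# Drop patterns that matched nothing.
matches <- matches[lengths(matches) > 0]

all_matched <- unique(unlist(matches, use.names = FALSE))
all_matched
```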