My goal is to identify US state names in a character vector that also contains other text, and to convert those names to their abbreviated form. For example, North Carolina becomes NC. This is simple if the vector contains only full state names. However, my vector has other text in arbitrary positions, as in the "states" example below.
states <- c("Plano New Jersey", "NC", "xyz", "Alabama 02138", "Texas", "Town Iowa 99999")
From another post, I found this:
state.abb[match(states, state.name)]
but it only converts the standalone "Texas":
> state.abb[match(states, state.name)]
[1] NA   NA   NA   NA   "TX" NA
and not the entries that contain New Jersey, Alabama, or Iowa alongside other text.
From the post "Quick grep with a vector pattern or match to return a list of all matches", I tried:
sapply(states, grep(pattern = state.name, x = states, value = TRUE))
but I get:
Error in get(as.character(FUN), mode = "function", envir = envir) :
object 'Alabama 02138' of mode 'function' was not found
In addition: Warning message:
In grep(pattern = state.name, x = states, value = TRUE) :
argument 'pattern' has length > 1 and only the first element will be used
And this does not work:
sapply(states, function(x) state.abb[grep(state.name, states)])
How can I replace each full state name inside these strings with its abbreviation?
EDIT: Ideally the rest of the text is kept, so "Plano New Jersey" becomes "Plano NJ".
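To make the goal concrete, here is a minimal sketch of the kind of replacement I am after. The helper name abbreviate_states is hypothetical, and looping over state.name with gsub and word boundaries is only one possible approach under the assumption that state names appear with their usual capitalization; it is not meant as the definitive solution.

```r
states <- c("Plano New Jersey", "NC", "xyz", "Alabama 02138", "Texas", "Town Iowa 99999")

# Hypothetical helper: replace every full state name found inside a string
# with its abbreviation, leaving the rest of the text untouched.
abbreviate_states <- function(x) {
  # Process longer names first so "West Virginia" is handled before "Virginia".
  ord <- order(nchar(state.name), decreasing = TRUE)
  for (i in ord) {
    # \\b restricts the match to whole words; matching is case-sensitive.
    pattern <- paste0("\\b", state.name[i], "\\b")
    x <- gsub(pattern, state.abb[i], x, perl = TRUE)
  }
  x
}

abbreviate_states(states)
# expected: "Plano NJ" "NC" "xyz" "AL 02138" "TX" "Town IA 99999"
```

Entries that contain no full state name, such as "NC" and "xyz", pass through unchanged.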