One option is the stringi package. Since the parts you want sit at the beginning of each string, stri_extract_first() works just fine. The pattern [:alpha:]{1,} matches a run of one or more letters, so stri_extract_first() returns the first such run in each string. Similarly, stri_extract_first(x, regex = "\\d{1,}") returns the first run of digits.
x <- c("HV5822.H4 C47 Circulating Collection, 3rd Floor",
       "QE511.4 .G53 1982 Circulating Collection, 3rd Floor",
       "TL515 .M63 Circulating Collection, 3rd Floor",
       "D753 .F4 Circulating Collection, 3rd Floor",
       "DB89.F7 D4 Circulating Collection, 3rd Floor")

library(stringi)

data.frame(alpha = stri_extract_first(x, regex = "[:alpha:]{1,}"),
           number = stri_extract_first(x, regex = "\\d{1,}"))
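
If you want to be sure the letters and digits really come from the very start of the string, a variant along these lines might help. This is only a sketch, not part of the approach above: it swaps in stri_match_first_regex() with capture groups and a ^ anchor, and the third element of y is a made-up entry added purely to show what happens when a string does not start with letters.

```r
library(stringi)

# Two call numbers from the example above, plus a hypothetical entry
# (an assumption for illustration) that does not start with letters
y <- c("HV5822.H4 C47 Circulating Collection, 3rd Floor",
       "D753 .F4 Circulating Collection, 3rd Floor",
       "1982 .G53 (hypothetical entry)")

# ^ anchors the match at the start of the string;
# group 1 captures the leading letters, group 2 the digits right after them
m <- stri_match_first_regex(y, "^(\\p{L}+)(\\d+)")

# Column 1 of m is the full match; columns 2 and 3 are the capture groups
data.frame(alpha = m[, 2], number = m[, 3])
```

Strings that do not begin with letters followed by digits come back as NA in both columns, which makes malformed entries easy to spot.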