The regex from polishchuk needs two modifications to work in R.
First, the backslash needs to be escaped. Second, the call to strsplit needs the argument perl = TRUE to enable lookbehind.
strsplit(names, split = "\\.,|(?<=de)", perl = TRUE)
gives the answer Sasha asked for.
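As a rough illustration of why both changes are needed, here is a small sketch on a toy string. The input x and the variable names are made up for this example and are not taken from the question; the comment about the default engine describes what I would expect rather than something guaranteed across R versions.

```r
# Toy input, made up for illustration (not from the question)
x <- "Smith, J., Brown, K."

# 1. Escaping: the regex \. has to be written "\\." inside an R string,
#    because "\." is not a valid escape sequence in an R string literal.
parts <- strsplit(x, split = "\\.,")[[1]]
print(parts)

# 2. Lookbehind: (?<=...) is PCRE syntax, so it needs perl = TRUE.
with_perl <- grepl("(?<=de)$", "Jong, A. de", perl = TRUE)
print(with_perl)

# Assumption: without perl = TRUE the default regex engine is expected to
# reject the lookbehind, so the call is wrapped in tryCatch to keep going.
without_perl <- tryCatch(
  suppressWarnings(grepl("(?<=de)$", "Jong, A. de")),
  error = function(e) "error: pattern not accepted"
)
print(without_perl)
```

The first call splits only on the literal ".," sequence, which is why the last piece keeps its trailing dot.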
Note that this still leaves a dot in the name de Jong, and it does not extend to other prefixes such as van, der, etc. I suggest the following alternative.
names <- "Jansen, A., Karel, A., Jong, A. de, Pietersen, K., Helsing, A. van"

# split on every comma
first_last <- strsplit(names, split = ",")[[1]]

# rearrange into a matrix with the first column representing last names,
# and the second column representing initials
first_last <- matrix(first_last, byrow = TRUE, ncol = 2)

# clean up: remove leading spaces and dots
first_last <- gsub("^ ", "", first_last)
first_last <- gsub("\\.", "", first_last)

# combine columns again
apply(first_last, 1, paste, collapse = ", ")
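If this is needed more than once, the same steps could be wrapped in a function. The sketch below is one way to do that; clean_names is a hypothetical name, and the check for an even number of pieces is my own added assumption (every entry must have exactly one comma between last name and initials).

```r
# clean_names() is a hypothetical helper, not part of base R
clean_names <- function(names) {
  # split on every comma
  parts <- strsplit(names, split = ",")[[1]]

  # assumption: pieces come in (last name, initials) pairs
  stopifnot(length(parts) %% 2 == 0)

  # one row per person: column 1 = last name, column 2 = initials
  m <- matrix(parts, byrow = TRUE, ncol = 2)

  # remove surrounding whitespace and all dots
  m <- gsub("^\\s+|\\s+$", "", m)
  m <- gsub(".", "", m, fixed = TRUE)

  # combine columns again
  apply(m, 1, paste, collapse = ", ")
}

result <- clean_names(
  "Jansen, A., Karel, A., Jong, A. de, Pietersen, K., Helsing, A. van"
)
print(result)
# → "Jansen, A" "Karel, A" "Jong, A de" "Pietersen, K" "Helsing, A van"
```

Because the split is on commas only, prefixes such as de or van stay attached to the initials without having to be listed in the pattern.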