Here are two solutions that both use strsplit but differ in what they split on:
1) Split on newline. Remove all newlines giving s1, and then insert a newline after every third character giving s2. Split s2 on the newlines and replace each field that consists of three consecutive spaces with an empty string.
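    Split <- function(string) {
      s1 <- gsub("\n", "", string)        # drop the newlines
      s2 <- gsub("(.{3})", "\\1\n", s1)   # newline after every 3 characters
      spl <- strsplit(s2, "\n")           # one element per 3-character field
      # a field of exactly three spaces is an empty field
      lapply(spl, function(s) replace(s, s == "   ", ""))
    }

    # test
    string <- "abc\n   def\nghi   jkl"  # 3 spaces before d, 3 spaces before j
    Split(string)
    ## [[1]]
    ## [1] "abc" ""    "def" "ghi" ""    "jkl"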
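To make the three steps of Split easier to follow, here is a small sketch that prints the intermediate values for the test string. The variable name fields is introduced only for this illustration, and the test string is assumed to have three spaces before def and before jkl.

```r
string <- "abc\n   def\nghi   jkl"  # assumed: 3 spaces before d and before j

# Step 1: drop the newlines so the 3-character fields sit back to back
s1 <- gsub("\n", "", string)
print(s1)

# Step 2: append a newline after every run of three characters
s2 <- gsub("(.{3})", "\\1\n", s1)
print(s2)

# Step 3: split on the newlines; empty fields are still three spaces here
fields <- strsplit(s2, "\n")[[1]]
print(fields)
```

At this point the empty fields still show up as "   ", which is why the final replace() step is needed.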
2) Split on a zero-width regexp preceded by 3 characters. Remove the newlines and split using the zero-width lookbehind (?<=...), which matches the position just after three characters. Finally, replace each field of three consecutive spaces with an empty string.
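    Split2 <- function(string) {
      s1 <- gsub("\n", "", string)                     # drop the newlines
      spl <- strsplit(s1, "(?<=...)", perl = TRUE)     # cut after every 3 characters
      # a field of exactly three spaces is an empty field
      lapply(spl, function(s) replace(s, s == "   ", ""))
    }

    # test
    string <- "abc\n   def\nghi   jkl"  # 3 spaces before d, 3 spaces before j
    Split2(string)
    ## [[1]]
    ## [1] "abc" ""    "def" "ghi" ""    "jkl"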
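The lookbehind split may look surprising, so here is a minimal sketch of it in isolation. The input x and the name parts are made up for illustration. As far as I understand the algorithm described in ?strsplit, the search restarts on the remainder of the string after each piece is removed, so the lookbehind always counts three characters from the start of what is left.

```r
# Hypothetical input: 8 characters, so the last piece is shorter than 3
x <- "abcdefgh"

# (?<=...) is a zero-width lookbehind: it matches the empty position
# that has three characters immediately before it
parts <- strsplit(x, "(?<=...)", perl = TRUE)[[1]]
print(parts)
```

Any leftover characters that do not fill a complete 3-character field end up in a final, shorter element.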
Note 1: The other answers to this question do not work for the following input (which has two empty fields in a row), but both solutions here correctly return two consecutive empty 3-character fields after the abc field:
    string2 <- "abc\n      def\nghi   jkl"  # 6 spaces before d, 3 spaces before j
    Split(string2)
    ## [[1]]
    ## [1] "abc" ""    ""    "def" "ghi" ""    "jkl"

    Split2(string2)
    ## [[1]]
    ## [1] "abc" ""    ""    "def" "ghi" ""    "jkl"
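A quick way to sanity-check a result like this is to compare the number of fields with the string length divided by 3. The sketch below is only an illustration: it assumes the input has six spaces before d and three before j, and the names s1, expected and fields are not part of the solutions above.

```r
string2 <- "abc\n      def\nghi   jkl"  # assumed: 6 spaces before d, 3 before j

# Without newlines the string is a plain sequence of 3-character fields
s1 <- gsub("\n", "", string2)
expected <- nchar(s1) / 3   # number of fields we should get back

# Same split as in solution 2, before blank fields are replaced by ""
fields <- strsplit(s1, "(?<=...)", perl = TRUE)[[1]]

# The six spaces must come back as two separate all-space fields
stopifnot(length(fields) == expected)
print(sum(fields == "   "))
```

If two adjacent empty fields were merged into one, the length check would fail.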
Note 2: Both solutions above can also be expressed compactly as a magrittr pipeline:
    library(magrittr)

    # solution 1 as a pipeline
    string %>%
      gsub(pattern = "\n", replacement = "") %>%
      gsub(pattern = "(.{3})", replacement = "\\1\n") %>%
      strsplit("\n") %>%
      lapply(function(s) replace(s, s == "   ", ""))
    ## [[1]]
    ## [1] "abc" ""    "def" "ghi" ""    "jkl"

    # solution 2 as a pipeline
    string %>%
      gsub(pattern = "\n", replacement = "") %>%
      strsplit("(?<=...)", perl = TRUE) %>%
      lapply(function(s) replace(s, s == "   ", ""))
    ## [[1]]
    ## [1] "abc" ""    "def" "ghi" ""    "jkl"