Remove empty strings from strsplit results in R

    > dc1
                 V1           V2
    1 20140211-0100         |Box
    2 20140211-1782 |Office|Ball
    3 20140211-1783      |Office
    4 20140211-1784      |Office
    5 20140221-0756         |Box
    6 20140203-0418         |Box
    > strsplit(as.character(dc1[,2]),"^\\|")
    [[1]]
    [1] ""    "Box"

    [[2]]
    [1] ""       "Office" "Ball"

    [[3]]
    [1] ""       "Office"

    [[4]]
    [1] ""       "Office"

    [[5]]
    [1] ""    "Box"

    [[6]]
    [1] ""    "Box"

How can I remove the empty strings ("") from the strsplit results? The result should look like this:

    [[1]]
    [1] "Box"

    [[2]]
    [1] "Office" "Ball"
6 answers

You can use lapply on the resulting list to drop the empty strings. I changed the pattern in your strsplit call (to "\\|", without the ^ anchor) so that it matches your intended output.

    dc1 <- read.table(text = 'V1 V2
    1 20140211-0100 |Box
    2 20140211-1782 |Office|Ball
    3 20140211-1783 |Office
    4 20140211-1784 |Office
    5 20140221-0756 |Box
    6 20140203-0418 |Box', header = TRUE)
    out <- strsplit(as.character(dc1[,2]),"\\|")
    > lapply(out, function(x){x[!x ==""]})
    [[1]]
    [1] "Box"

    [[2]]
    [1] "Office" "Ball"

    [[3]]
    [1] "Office"

    [[4]]
    [1] "Office"

    [[5]]
    [1] "Box"

    [[6]]
    [1] "Box"

I don't have a general solution, but for your example you can try:

strsplit(sub("^\\|", "", as.character(dc1[,2])),"\\|")

It removes the leading | (which is what the regex "^\\|" matches) before doing the split. That leading | is what causes the "".
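For cases where the delimiter is not only at the start, here is a minimal sketch of a more general variant. It assumes the goal is simply "every non-empty piece between pipes" and uses base R's regmatches() with gregexpr() to extract runs of non-| characters instead of splitting. The vector `v2` is a hypothetical stand-in for the V2 column, with two extra made-up values to show trailing and doubled delimiters.

```r
# Hypothetical sample values; the first two mirror V2 from the question,
# the last two are made up to show trailing and doubled delimiters.
v2 <- c("|Box", "|Office|Ball", "Box|", "||Box")

# Splitting: a delimiter at the start produces a leading "",
# while a delimiter at the end does not produce a trailing "".
pieces <- strsplit(v2, "|", fixed = TRUE)

# Extracting instead of splitting: take every maximal run of
# characters that are not "|", so no empty strings can appear.
tokens <- regmatches(v2, gregexpr("[^|]+", v2))
print(tokens)
```

Since nothing is split, there is nothing to clean up afterwards; an element with no non-| characters at all would come back as character(0).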


You can use:

    library(stringr)
    str_extract_all(dc1[,2], "[[:alpha:]]+")
    [[1]]
    [1] "Box"

    [[2]]
    [1] "Office" "Ball"

    [[3]]
    [1] "Office"

    [[4]]
    [1] "Office"

    [[5]]
    [1] "Box"

    [[6]]
    [1] "Box"

In this case, you can simply remove the first element of each vector by calling "[" with -1 in sapply:

    > sapply(strsplit(as.character(dc1[,2]), "\\|"), "[", -1)
    # [[1]]
    # [1] "Box"
    # [[2]]
    # [1] "Office" "Ball"
    # [[3]]
    # [1] "Office"
    # [[4]]
    # [1] "Office"
    # [[5]]
    # [1] "Box"
    # [[6]]
    # [1] "Box"
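One caveat with the sapply call above, shown as a small sketch: sapply simplifies its result when it can, so the return type depends on the data. The vectors `ragged` and `uniform` are hypothetical inputs, not the question's data. If a list is always wanted, lapply with the same arguments is the safer choice.

```r
# Hypothetical inputs: rows with differing vs. identical item counts
ragged  <- c("|Box", "|Office|Ball")
uniform <- c("|Box", "|Office")

# Differing lengths cannot be simplified, so sapply returns a list
s_ragged <- sapply(strsplit(ragged, "\\|"), "[", -1)

# One item per row: sapply simplifies to a plain character vector
s_uniform <- sapply(strsplit(uniform, "\\|"), "[", -1)

# lapply never simplifies, so the result is a list in both cases
l_uniform <- lapply(strsplit(uniform, "\\|"), "[", -1)

print(class(s_ragged))
print(class(s_uniform))
print(class(l_uniform))
```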

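Another method uses nzchar() after the result of strsplit() has been unlisted. Note that unlist() flattens everything into a single character vector, so the per-row grouping is lost: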

    out <- unlist(strsplit(as.character(dc1[,2]),"\\|"))
    out[nzchar(x=out)] # removes the extraneous "" marks
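If the one-vector-per-row structure from the question is needed, the same nzchar() test can be applied per element instead of after unlist(). A minimal sketch, using a hypothetical vector `v2` in place of dc1[,2]:

```r
# Hypothetical stand-in for as.character(dc1[,2])
v2 <- c("|Box", "|Office|Ball", "|Office")

out <- strsplit(v2, "\\|")

# Flattened variant: one character vector, row boundaries are lost
flat <- unlist(out)
flat <- flat[nzchar(flat)]

# Per-row variant: keep the list and filter each element separately
per_row <- lapply(out, function(x) x[nzchar(x)])

print(flat)
print(per_row)
```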

Using stringr's str_split() and dropping the first element of each vector:

    library("stringr")
    lapply(str_split(dc1$V2, "\\|"), function(x) x[-1])
    [[1]]
    [1] "Box"

    [[2]]
    [1] "Office" "Ball"

    [[3]]
    [1] "Office"

    [[4]]
    [1] "Office"

    [[5]]
    [1] "Box"

    [[6]]
    [1] "Box"