Split data.frame by value

How can I split the following data.frame

df <- data.frame(var1 = c("a", 1, 2, 3, "a", 1, 2, 3, 4, 5, 6, "a", 1, 2), var2 = 1:14) 

into a list of groups like this:

    a  1
    1  2
    2  3
    3  4

    a  5
    1  6
    2  7
    3  8
    4  9
    5 10
    6 11

    a 12
    1 13
    2 14

Basically, the value "a" in column 1 is the tag / identifier on which I want to split the data frame. I know about the split function, but that means I need to add another column, and, as my example shows, the size of the groups can vary. I don't know how to automatically create such a dummy column to fit my needs.

Any ideas on this?

Greetings

Sven

2 answers

You can find which values in the first column are "a", build a grouping variable from that with cumsum, and then use split.

    df[,1] == "a"
    # [1]  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
    #[13] FALSE FALSE

    cumsum(df[,1] == "a")
    # [1] 1 1 1 1 2 2 2 2 2 2 2 3 3 3

    split(df, cumsum(df[,1] == "a"))
    #$`1`
    #  var1 var2
    #1    a    1
    #2    1    2
    #3    2    3
    #4    3    4
    #
    #$`2`
    #   var1 var2
    #5     a    5
    #6     1    6
    #7     2    7
    #8     3    8
    #9     4    9
    #10    5   10
    #11    6   11
    #
    #$`3`
    #   var1 var2
    #12    a   12
    #13    1   13
    #14    2   14
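
If this is needed more than once, the same cumsum idea could be wrapped in a small helper. This is only a sketch: the function name split_at_marker and its arguments are made up for illustration and are not part of base R.

```r
# Hypothetical helper (name and arguments are assumptions, not a base R function):
# start a new group at every row where `column` equals `marker`.
split_at_marker <- function(data, column, marker) {
  starts <- data[[column]] == marker  # TRUE on the rows that open a group
  split(data, cumsum(starts))         # running count of TRUEs is the group id
}

# Example data from the question
df <- data.frame(var1 = c("a", 1, 2, 3, "a", 1, 2, 3, 4, 5, 6, "a", 1, 2),
                 var2 = 1:14)

groups <- split_at_marker(df, "var1", "a")
sapply(groups, nrow)  # number of rows in each group
```

Note that any rows appearing before the first marker would end up in a group named "0", because cumsum is still 0 there.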

You can write a loop that goes through the first column of the data frame and saves the positions of the non-numeric values in a vector. Something like this:

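    data <- as.character(df$var1)  # the values to scan; var1 is character (or factor), never numeric
    positions <- c()
    for (i in seq_along(data)) {
      # is.numeric(data[i]) is always FALSE for this column, so instead
      # test whether the value fails to convert to a number
      if (is.na(suppressWarnings(as.numeric(data[i])))) {
        positions <- append(positions, i)  # save the position of the non-numeric value
      }
    }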

With these positions, splitting the data frame should not be a problem. Each group runs from one saved position up to the row just before the next one, and the last group runs to the end of the data frame.
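
A minimal sketch of that last step, assuming the group starts are the rows holding "a" (found here with which rather than a loop) and using the question's example data:

```r
# Example data from the question
df <- data.frame(var1 = c("a", 1, 2, 3, "a", 1, 2, 3, 4, 5, 6, "a", 1, 2),
                 var2 = 1:14)

# Rows where a new group starts (assumption: the marker value is "a")
positions <- which(df$var1 == "a")

# Each group ends on the row just before the next start;
# the last group ends on the last row of the data frame
ends <- c(positions[-1] - 1, nrow(df))

# Build one data frame per start/end pair
groups <- Map(function(from, to) df[from:to, ], positions, ends)

sapply(groups, nrow)  # number of rows in each group
```

This assumes the first row is itself a marker; rows before the first position would otherwise be left out of every group.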
