Select groups with more than one distinct value

I have data with a grouping variable ("from") and values ("number"):

    from number
       1      1
       1      1
       2      1
       2      2
       3      2
       3      2

I want to subset the data and select the groups that have two or more unique values. In my data, only group 2 has more than one distinct "number", so this is the desired result:

    from number
       2      1
       2      2
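To make the example reproducible, the data could be built as below. This is only a sketch: the names `df` and `df1` are assumptions chosen to match the names the answers use, and the columns are assumed to be plain numeric vectors.

```r
# Sample data from the question; `df` is an assumed name
df <- data.frame(
  from   = c(1, 1, 2, 2, 3, 3),
  number = c(1, 1, 1, 2, 2, 2)
)

# Some answers refer to the same data as `df1`
df1 <- df

df
```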
3 answers

There are a few possibilities; here are my favorites:

    library(data.table)
    setDT(df)[, if(+var(number)) .SD, by = from]
    #    from number
    # 1:    2      1
    # 2:    2      2

Basically, for each group we check whether there is any variance; if there is, we return the rows of that group.
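The same per-group variance check can be sketched in base R, which also makes one edge case visible: `var()` returns `NA` for a group with a single row, so the sketch wraps the test in `isTRUE()`. The data frame and the names `keep` and `result` are assumptions for illustration, not part of the original answer.

```r
# Sample data from the question (assumed to be numeric columns)
df <- data.frame(
  from   = c(1, 1, 2, 2, 3, 3),
  number = c(1, 1, 1, 2, 2, 2)
)

# One TRUE/FALSE per group: does the group have non-zero variance?
# isTRUE() turns the NA that var() gives for one-row groups into FALSE
keep <- sapply(split(df$number, df$from), function(x) isTRUE(var(x) > 0))

# Keep only the rows whose group passed the check
result <- df[df$from %in% names(keep)[keep], ]
result
```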


With base R, I would go with

    df[as.logical(with(df, ave(number, from, FUN = var))), ]
    #   from number
    # 3    2      1
    # 4    2      2

Edit: for non-numeric data, you can try the new uniqueN function from the devel version of data.table (or use length(unique(number)) > 1 instead):

 setDT(df)[, if(uniqueN(number) > 1) .SD, by = from] 
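For comparison, here is a minimal base R sketch of the `length(unique(number)) > 1` idea on non-numeric data. The character data in `df_chr` is hypothetical, and running `ave()` over row indices rather than over the values is a design choice made here so the counts stay integer; it is not taken from the answer above.

```r
# Hypothetical data with a non-numeric value column
df_chr <- data.frame(
  from   = c(1, 1, 2, 2, 3, 3),
  number = c("a", "a", "a", "b", "b", "b"),
  stringsAsFactors = FALSE
)

# For every row, the number of distinct values in that row's group.
# ave() is applied to row indices, so the result is an integer vector
# even though `number` is character.
n_unique <- ave(
  seq_along(df_chr$number), df_chr$from,
  FUN = function(i) length(unique(df_chr$number[i]))
)

# Keep rows that belong to groups with more than one distinct value
result <- df_chr[n_unique > 1, ]
result
```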

You can try

    library(dplyr)
    df1 %>%
      group_by(from) %>%
      filter(n_distinct(number) > 1)
    #   from number
    # 1    2      1
    # 2    2      2

Or using base R

    indx <- rowSums(!!table(df1)) > 1
    subset(df1, from %in% names(indx)[indx])
    #   from number
    # 3    2      1
    # 4    2      2

or

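    # Note: this keeps groups that contain no repeated value, which gives
    # the desired result for the sample data
    df1[with(df1, !ave(number, from, FUN = anyDuplicated)), ]
    #   from number
    # 3    2      1
    # 4    2      2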

Using the concept of variance shared by David, but in a dplyr way:

    library(dplyr)
    df %>%
      group_by(from) %>%
      mutate(variance = var(number)) %>%
      filter(variance != 0) %>%
      select(from, number)
    # Source: local data frame [2 x 2]
    # Groups: from
    #   from number
    # 1    2      1
    # 2    2      2
