How to extract specific rows in R?

I would like to extract specific rows from a data frame into a new data frame using R. I have two columns: City and Household. To detect moves, I want a new data frame containing only the households that appear with more than one city.

For example, if a household appears 3 times and at least one city differs from the others, I keep it. Otherwise, I delete all 3 rows of that household.

    City      Household
   Paris              A
   Paris              A
    Nice              A
  Limoge              B
  Limoge              B
Toulouse              C
   Paris              C

Here I want to keep only households A and C.

2 answers

A dplyr solution: count the distinct cities for each household and keep only the households where that count is greater than 1.

library(dplyr)
df <- data.frame(city=c("Paris","Paris","Nice","Limoge","Limoge","Toulouse","Paris"),
                 household =c(rep("A",3),rep("B",2),rep("C",2)))

# Keep every row of the households that have more than one distinct city
new_df <- df %>% group_by(household) %>%
  filter(n_distinct(city) > 1)

Source: local data frame [5 x 2]
Groups: household

      city household
1    Paris         A
2    Paris         A
3     Nice         A
4 Toulouse         C
5    Paris         C

Credit: @shadow and @davidarenburg
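To see why the filter keeps A and C, it can help to look at the per-household counts that `n_distinct(city)` produces. The following is a minimal sketch that rebuilds the same example data; the names `city_counts` and `n_cities` are introduced here for illustration only.

```r
library(dplyr)

# Same example data as in the answer above
df <- data.frame(
  city      = c("Paris", "Paris", "Nice", "Limoge", "Limoge", "Toulouse", "Paris"),
  household = c(rep("A", 3), rep("B", 2), rep("C", 2))
)

# One row per household, with the number of distinct cities it appears in
city_counts <- df %>%
  group_by(household) %>%
  summarise(n_cities = n_distinct(city))

city_counts
# Households A and C have n_cities == 2, household B has n_cities == 1
```

Inside `filter()`, the same count is evaluated once per group, so the condition `n_distinct(city) > 1` keeps or drops all rows of a household together.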


Two base R options:

df1[with(df1, ave(as.character(City), Household, FUN=function(x) length(unique(x))) > 1L),]

df1[df1$Household %in% names(which(table(unique(df1)$Household) > 1)),]
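The one-liners above assume a data frame `df1` with columns `City` and `Household`, which this answer does not define. As a hedged, self-contained sketch, here is the same idea spelled out step by step with `tapply()` (a swapped-in variant, not either of the one-liners above); the names `n_cities`, `movers`, and `result` are illustrative.

```r
# Example data from the question, using the question's column names
df1 <- data.frame(
  City      = c("Paris", "Paris", "Nice", "Limoge", "Limoge", "Toulouse", "Paris"),
  Household = c(rep("A", 3), rep("B", 2), rep("C", 2)),
  stringsAsFactors = FALSE
)

# Number of distinct cities per household, named by household
n_cities <- tapply(df1$City, df1$Household, function(x) length(unique(x)))

# Households seen in more than one city
movers <- names(n_cities)[n_cities > 1]

# Keep every row belonging to one of those households
result <- df1[df1$Household %in% movers, ]
result
```

Splitting the steps makes it easy to inspect `n_cities` before subsetting, at the cost of a few extra lines.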

Or with data.table (v >= 1.9.5):

library(data.table) # v >= 1.9.5, otherwise use length(unique(City))
setDT(df1)[, if(uniqueN(City) > 1L) .SD, by = Household]

setDT(df1)[, .SD[uniqueN(City) > 1L], by = Household]
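For data.table versions that lack `uniqueN()`, the comment above suggests `length(unique(City))`. A minimal sketch of that fallback on the question's data, assuming a freshly built `df1`; the name `moved` is illustrative.

```r
library(data.table)

# Example data from the question, built directly as a data.table
df1 <- data.table(
  City      = c("Paris", "Paris", "Nice", "Limoge", "Limoge", "Toulouse", "Paris"),
  Household = c(rep("A", 3), rep("B", 2), rep("C", 2))
)

# For each household, return its rows (.SD) only if it has more than one
# distinct city; groups that fail the test return NULL and are dropped
moved <- df1[, if (length(unique(City)) > 1L) .SD, by = Household]
moved
```

Note that the grouping column `Household` comes first in the result, followed by the columns from `.SD`.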