How to filter in dplyr based on appropriate condition

Question

I have a data frame. I want to filter out certain issues, but only within groups that meet a specific condition.

For a dummy example, suppose I have the following:

> mydf
   Group Issue
1      A     G
2      A     H
3      A     L
4      B     V
5      B     M
6      C     G
7      C     H
8      C     L
9      C     X
10     D     G
11     D     H
12     D     I
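
To make the example reproducible, the data frame can be built as follows. This is a minimal sketch: it assumes both columns are plain character vectors, which the printed output above does not confirm (they could equally be factors).

```r
# Rebuild the example data shown above.
# Assumption: Group and Issue are character columns, not factors.
mydf <- data.frame(
  Group = c("A", "A", "A", "B", "B", "C", "C", "C", "C", "D", "D", "D"),
  Issue = c("G", "H", "L", "V", "M", "G", "H", "L", "X", "G", "H", "I"),
  stringsAsFactors = FALSE
)
```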

Within each group, I want to drop the rows whose Issue is "G", "H" or "L", but only if that group also contains an "L" issue.

So, in this case, I want to drop rows 1, 2, 3, 6, 7 and 8, but keep rows 4, 5, 9, 10, 11 and 12. The result would be:

> mydf
   Group Issue
4      B     V
5      B     M
9      C     X
10     D     G
11     D     H
12     D     I

I think I first need group_by(Group), but I am not sure what the best way is to express the condition after that.

Thanks!

r dplyr

user1357015 Jun 26 '15 at 2:11

1 answer

Frank · Accepted Answer · 2015-06-26T02:24:22+0000

If the rule is

When a group contains L, discard that group's L, G, and H rows.

then

mydf %>% 
  group_by(Group) %>% 
  filter( if (any(Issue=="L")) !(Issue %in% c("G","H","L")) else TRUE )

#   Group Issue
# 1     B     V
# 2     B     M
# 3     C     X
# 4     D     G
# 5     D     H
# 6     D     I
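
An equivalent way to write the same rule, without the if/else inside filter(), is to combine the group-level test and the row-level test into one logical vector. The sketch below is self-contained and assumes the example data uses character columns; `drop_set` and `result` are names introduced here for illustration only.

```r
library(dplyr)

# Example data from the question (assumed to be character columns).
mydf <- data.frame(
  Group = c("A", "A", "A", "B", "B", "C", "C", "C", "C", "D", "D", "D"),
  Issue = c("G", "H", "L", "V", "M", "G", "H", "L", "X", "G", "H", "I"),
  stringsAsFactors = FALSE
)

# Issues to drop, but only in groups that contain an "L".
drop_set <- c("G", "H", "L")

result <- mydf %>%
  group_by(Group) %>%
  # any(Issue == "L") is a single TRUE/FALSE per group and is recycled
  # against the per-row test Issue %in% drop_set.
  filter(!(any(Issue == "L") & Issue %in% drop_set)) %>%
  ungroup()

print(result)
```

Both forms keep every row of a group that has no "L", and keep only the rows outside G, H and L in groups that do. The trailing ungroup() is optional and simply returns an ungrouped result.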
