Data.table sum and subset

I have a data.table that I want to aggregate:

library(data.table)
dt1 <- data.table(year=c("2001","2001","2001","2002","2002","2002","2002"),
                  group=c("a","a","b","a","a","b","b"), 
                  amt=c(20,40,20,35,30,28,19))

I want to sum amt by year and group, and then keep only the rows where the summed amt for a given group (across all years) is more than 100.

I have the data.table sum nailed:

dt1[, sum(amt),by=list(year,group)]

   year group V1
1: 2001     a 60
2: 2001     b 20
3: 2002     a 65
4: 2002     b 47

I'm having problems with the final filtering step.

The end result I'm looking for is:

   year group V1
1: 2001     a 60
2: 2002     a 65

This is because a) 60 + 65 > 100, whereas b) 20 + 47 <= 100.
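
The arithmetic behind that filter can be checked by totaling amt per group across all years. This is only a small sketch on the same dt1; the names `group_totals` and `total` are introduced here for illustration.

```r
library(data.table)

dt1 <- data.table(year  = c("2001","2001","2001","2002","2002","2002","2002"),
                  group = c("a","a","b","a","a","b","b"),
                  amt   = c(20,40,20,35,30,28,19))

# Total amt per group, ignoring year
group_totals <- dt1[, .(total = sum(amt)), by = group]

# Group a totals 125 (> 100), group b totals 67 (<= 100),
# so only the rows for group a should survive the filter.
print(group_totals)
```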

Any thoughts on how to achieve this will be great.

I looked at the question "data.table by group, return the row with the maximum value" and wondered whether there is an equally elegant solution to my problem.

4 answers

A one-liner in data.table:

dt1[, lapply(.SD,sum), by=list(year,group)][, if (sum(amt) > 100) .SD, by=group]

#   group year amt
#1:     a 2001  60
#2:     a 2002  65
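
The chained call above can be read as two separate steps. The following sketch unpacks it; `yearly` and `result` are names introduced only for illustration, and the aggregate is written with an explicit column name rather than `lapply(.SD, sum)`.

```r
library(data.table)

dt1 <- data.table(year  = c("2001","2001","2001","2002","2002","2002","2002"),
                  group = c("a","a","b","a","a","b","b"),
                  amt   = c(20,40,20,35,30,28,19))

# Step 1: sum amt within each year/group pair
yearly <- dt1[, .(amt = sum(amt)), by = .(year, group)]

# Step 2: for each group, return its rows (.SD) only when the group's
# total exceeds 100; when the `if` yields NULL the group is dropped
result <- yearly[, if (sum(amt) > 100) .SD, by = group]

print(result)
```

Because the second step groups by group alone, `sum(amt)` there is the total over all years for that group, which is the quantity the filter needs.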

Another option, using dplyr:

library(dplyr)
dt1 %>% 
  group_by(group, year) %>% 
  summarise(amt = sum(amt)) %>%
  filter(sum(amt) > 100)

Result:

#Source: local data table [2 x 3]
#Groups: group
#
#  year group amt
#1 2001     a  60
#2 2002     a  65

A two-step approach: first find the groups whose total exceeds 100, then aggregate only those groups:

big_groups <- dt1[,sum(amt),by=group][V1>100]$group
dt1[group%in%big_groups,sum(amt),by=list(year,group)]

This may not be the ideal solution, but I would do it in several steps as follows:

dt2=dt1[, sum(amt),by=list(year,group)]
dt3=dt1[, sum(amt)>100,by=list(group)]
dt_result=dt2[group %in% dt3[V1==TRUE]$group,]
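
The same multi-step idea can also be sketched by attaching the group total to each aggregated row and filtering on it. This is a variation rather than the approach above; `agg`, `total`, and `result` are illustrative names.

```r
library(data.table)

dt1 <- data.table(year  = c("2001","2001","2001","2002","2002","2002","2002"),
                  group = c("a","a","b","a","a","b","b"),
                  amt   = c(20,40,20,35,30,28,19))

# Sum amt per year/group pair
agg <- dt1[, .(amt = sum(amt)), by = .(year, group)]

# Add each group's overall total to every row of that group (by reference)
agg[, total := sum(amt), by = group]

# Keep rows whose group total exceeds 100 and drop the helper column
result <- agg[total > 100, .(year, group, amt)]

print(result)
```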
