I am trying to use data.table in R to summarize the following data table:
SiteNo Var1 Var2 Var3 ... Var18 Group 1 0.1 0.3 1 0.3 1 2 0.3 0.1 0.9 0.2 1 etc.
There are 668,944 observations, 43 sites, 3 groups and 19 variables. I would like to get the results of a function (e.g. mean ) that sums each column / variable both by site and by group. Thus, there should be 43 sites x 3 groups x # of total statistics (for example, mean ). I used the following code:
e.dt<-data.table(e) setkey(e.dt, Group) # set key to group number # get mean for each column/variable e.dt.mean<-e.dt[,lapply(.SD,mean), by="SiteNo"]
Using the above, I get 43 sites, but not 3 groups that I was behind. I could split the original data table into three groups, but wondered if there was a way to generalize using two variables (SiteNo and Group) using data.table .
I am still RTM on data.table , but so far I have not found the answer to above.
Prophet60091
source share