RowMeans function in dplyr

Question


I am trying to compute rowMeans inside dplyr's mutate function, but I keep getting errors. Below is an example data set and the desired result.

    DATA = data.frame(SITE = c("A","A","A","A","B","B","B","C","C"),
                      DATE = c("1","1","2","2","3","3","3","4","4"),
                      STUFF = c(1, 2, 30, 40, 100, 200, 300, 5000, 6000),
                      STUFF2 = c(2, 4, 60, 80, 200, 400, 600, 10000, 12000))

    RESULT = data.frame(SITE = c("A","A","A","A","B","B","B","C","C"),
                        DATE = c("1","1","2","2","3","3","3","4","4"),
                        STUFF = c(1, 2, 30, 40, 100, 200, 300, 5000, 6000),
                        STUFF2 = c(2, 4, 60, 80, 200, 400, 600, 10000, 12000),
                        NAYSA = c(1.5, 3, 45, 60, 150, 300, 450, 7500, 9000))

The code I wrote starts by randomly resampling STUFF and STUFF2. I would then like to compute the row means of STUFF and STUFF2 and store the result in a new column. I could accomplish this with tidyr, but I would have to redo a larger number of variables. I could also use base R, but I would prefer a solution that uses the mutate function in dplyr. Thanks in advance.

    library(dplyr)

    RESULT = group_by(DATA, SITE, DATE) %>%
      mutate(STUFF = sample(STUFF, replace = TRUE),
             STUFF2 = sample(STUFF2, replace = TRUE)) %>%
      # Each of these attempts, tried one at a time as the last step, returns an error:
      mutate(NAYSA = rowMeans(DATA[,-1:-2]))
      # mutate(NAYSA = rowMeans(.[,-1:-2]))
      # mutate(NAYSE = rowMeans(.))

+5

r dplyr

Vesuccio Mar 16 '15 at 17:51


3 answers

For this you need the rowwise function in dplyr. Because your data is resampled with sample(), the values differ from run to run, but you will see that it works:

    library(dplyr)

    group_by(DATA, SITE, DATE) %>%
      mutate(STUFF = sample(STUFF, replace = TRUE),
             STUFF2 = sample(STUFF2, replace = TRUE)) %>%
      rowwise() %>%
      mutate(NAYSA = mean(c(STUFF, STUFF2)))

Output:

    Source: local data frame [9 x 5]
    Groups: <by row>

      SITE DATE STUFF STUFF2  NAYSA
    1    A    1     1      2    1.5
    2    A    1     2      2    2.0
    3    A    2    30     80   55.0
    4    A    2    30     60   45.0
    5    B    3   200    600  400.0
    6    B    3   300    200  250.0
    7    B    3   100    600  350.0
    8    C    4  5000  12000 8500.0
    9    C    4  6000  10000 8000.0

As you can see, it calculates the mean of STUFF and STUFF2 for each row.
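Since the question mentions a larger number of variables, here is a hedged variant of the row-wise approach that selects a range of columns instead of naming each one. It assumes dplyr 1.0.0 or later, where c_across() is available; that function postdates this answer and is not part of it. The sampling step is left out so the result is repeatable, and BY_ROW is just an illustrative name.

```r
library(dplyr)

# Same DATA as in the question.
DATA <- data.frame(
  SITE   = c("A","A","A","A","B","B","B","C","C"),
  DATE   = c("1","1","2","2","3","3","3","4","4"),
  STUFF  = c(1, 2, 30, 40, 100, 200, 300, 5000, 6000),
  STUFF2 = c(2, 4, 60, 80, 200, 400, 600, 10000, 12000)
)

# Assumption: dplyr >= 1.0.0, which provides c_across().
# c_across(STUFF:STUFF2) collects every column from STUFF to STUFF2
# for the current row, so adding more numeric columns inside that
# range does not require changing the call.
BY_ROW <- DATA %>%
  rowwise() %>%
  mutate(NAYSA = mean(c_across(STUFF:STUFF2))) %>%
  ungroup()
```

On large data, row-wise evaluation can be noticeably slower than a vectorised rowMeans call, so this is a convenience sketch rather than a recommendation.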

+5

LyzandeR Mar 16 '15 at 18:08


The rowMeans function needs input with at least two dimensions, but DATA[,-1:-3] leaves only a single column, which R simplifies to a plain vector:

    [1]     2     4    60    80   200   400   600 10000 12000
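To see this dimension dropping in isolation, here is a small base R sketch using the question's DATA. The drop = FALSE argument is not part of the original answer; it is included only to illustrate how a one-column selection can keep its two dimensions. The names one_col, err, kept and naysa are illustrative.

```r
# Same DATA as in the question.
DATA <- data.frame(
  SITE   = c("A","A","A","A","B","B","B","C","C"),
  DATE   = c("1","1","2","2","3","3","3","4","4"),
  STUFF  = c(1, 2, 30, 40, 100, 200, 300, 5000, 6000),
  STUFF2 = c(2, 4, 60, 80, 200, 400, 600, 10000, 12000)
)

# Dropping columns 1 to 3 leaves one column, which `[` simplifies to a vector.
one_col <- DATA[, -1:-3]
is.null(dim(one_col))  # a plain vector has no dim attribute

# rowMeans() rejects a plain vector; capture the error message instead of stopping.
err <- tryCatch(rowMeans(one_col), error = function(e) conditionMessage(e))

# With drop = FALSE the selection stays a one-column data frame,
# so rowMeans() accepts it (each "mean" is just the STUFF2 value).
kept <- rowMeans(DATA[, -1:-3, drop = FALSE])

# Selecting both numeric columns gives the row means the question asks for.
naysa <- rowMeans(DATA[, 3:4])
```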

You can get the result with the code below:

    DATA %>%
      group_by(SITE, DATE) %>%
      ungroup() %>%
      mutate(NAYSA = rowMeans(.[,3:4]))

      SITE DATE STUFF STUFF2  NAYSA
    1    A    1     1      2    1.5
    2    A    1     2      4    3.0
    3    A    2    30     60   45.0
    4    A    2    40     80   60.0
    5    B    3   100    200  150.0
    6    B    3   200    400  300.0
    7    B    3   300    600  450.0
    8    C    4  5000  10000 7500.0
    9    C    4  6000  12000 9000.0

0

nathanlim45 Mar 16 '15 at 19:23


Vesuccio · Accepted Answer · 2015-03-16T18:51:10+0000

@GregF Yep, ungroup() was the key. Thank you.

Working code

    RESULT = group_by(DATA, SITE, DATE) %>%
      mutate(STUFF = sample(STUFF, replace = TRUE),
             STUFF2 = sample(STUFF2, replace = TRUE)) %>%
      ungroup() %>%
      mutate(NAYSA = rowMeans(.[,-1:-2]))
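For anyone who wants to run the working code end to end, here is a self-contained sketch with a sanity check. The set.seed() call and the check variable are additions for illustration and are not part of the original answer; the pipeline itself is the one shown above, assuming the magrittr pipe that dplyr re-exports.

```r
library(dplyr)

# Same DATA as in the question.
DATA <- data.frame(
  SITE   = c("A","A","A","A","B","B","B","C","C"),
  DATE   = c("1","1","2","2","3","3","3","4","4"),
  STUFF  = c(1, 2, 30, 40, 100, 200, 300, 5000, 6000),
  STUFF2 = c(2, 4, 60, 80, 200, 400, 600, 10000, 12000)
)

set.seed(1)  # assumption: added only so the resampling is repeatable

RESULT <- DATA %>%
  group_by(SITE, DATE) %>%
  # Resample each column with replacement within every SITE/DATE group.
  mutate(STUFF  = sample(STUFF,  replace = TRUE),
         STUFF2 = sample(STUFF2, replace = TRUE)) %>%
  # Remove the grouping so `.` is a plain table in the next step.
  ungroup() %>%
  # Average every column except the first two (SITE and DATE).
  mutate(NAYSA = rowMeans(.[, -1:-2]))

# NAYSA should match the plain average of the two resampled columns.
check <- all(abs(RESULT$NAYSA - (RESULT$STUFF + RESULT$STUFF2) / 2) < 1e-9)
```

Because the selection is written as "everything except the first two columns", it keeps working if more numeric columns are added after DATE, which is the situation the question describes.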
