Calculate column values based on values in another column

Possible duplicate:
R Grouping functions: sapply vs. lapply vs. apply. vs. tapply vs. by vs. aggregate

I am using R and would love some help with a problem I am facing:

I have a data frame (df) with an ID column and an Emotion column. Each value in ID corresponds to 40-300 values in Emotion (so the number of values per ID is not fixed). I need to calculate the mean of all Emotion values for each value of ID. So the data looks like

 df$ID = c(1, 1, 1, 1, 2, 2, 3)
 df$Emotion = c(2, 4, 6, 4, 1, 1, 8)

therefore, the vector of means should look like this: (4, 1, 8)

Any help would be greatly appreciated!

2 answers

You can use aggregate

 ID = c(1, 1, 1, 1, 2, 2, 3)
 Emotion = c(2, 4, 6, 4, 1, 1, 8)
 df <- data.frame(ID, Emotion)
 aggregate(. ~ ID, data = df, mean)
   ID Emotion
 1  1       4
 2  2       1
 3  3       8

sapply can also be useful (this alternative gives you a vector rather than a data frame)

 sapply(split(df$Emotion, df$ID), mean)
 1 2 3
 4 1 8

There are many ways to do this, including ddply from the plyr package, the data.table package, other combinations of split and lapply, and dcast from the reshape2 package. See this question for further solutions.
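As a rough sketch of the split and lapply combination mentioned above, using only base R and the data from the question (the variable names groups, means_list and means are just illustrative):

```r
# Example data from the question
df <- data.frame(ID      = c(1, 1, 1, 1, 2, 2, 3),
                 Emotion = c(2, 4, 6, 4, 1, 1, 8))

# split() breaks Emotion into one numeric vector per ID value
groups <- split(df$Emotion, df$ID)

# lapply() applies mean() to each group and returns a named list
means_list <- lapply(groups, mean)

# unlist() flattens the list into a named numeric vector
means <- unlist(means_list)
print(means)
```

This does the same job as the sapply call, only with the simplification step written out explicitly; lapply keeps the result as a list, which can be handy if the per-group function returns more than one value.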


This is definitely a job for tapply.

 tapply(df$Emotion, df$ID, mean)
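
A small sketch of how this plays out on the question's data, assuming the goal is the plain vector (4, 1, 8). Note that tapply takes the values to summarise as its first argument and the grouping variable as its second; group_means and mean_vector are illustrative names.

```r
# Example data from the question
df <- data.frame(ID      = c(1, 1, 1, 1, 2, 2, 3),
                 Emotion = c(2, 4, 6, 4, 1, 1, 8))

# First argument: values to average; second argument: grouping variable
group_means <- tapply(df$Emotion, df$ID, mean)
print(group_means)   # a 1-d array labelled by ID

# Drop the ID labels to get a bare numeric vector
mean_vector <- as.vector(group_means)
print(mean_vector)
```

Keeping the labelled result is usually safer when the means need to be matched back to their IDs later; as.vector is only needed if an unnamed vector is really required.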
