Multiply different subsets of the data frame by different vectors

Question


I would like to multiply several columns in my data frame by a vector of values. The specific vector of values varies depending on the value in another column.

- EDIT -

What if I make the dataset more complex, i.e. more than 2 conditions, and conditions are randomly shuffled around the data set?

Here is an example of my dataset:

 df <- data.frame(
   Treatment = rep(LETTERS[1:4], each = 2),
   Species   = rep(1:4, each = 2),
   Value1    = c(0, 0, 1, 3, 4, 2, 0, 0),
   Value2    = c(0, 0, 3, 4, 2, 1, 4, 5),
   Value3    = c(0, 2, 4, 5, 2, 1, 4, 5),
   Condition = c("A", "B", "A", "C", "B", "A", "B", "C")
 )

What does it look like:

  Treatment Species Value1 Value2 Value3 Condition
  A         1       0      0      0      A
  A         1       0      0      2      B
  B         2       1      3      4      A
  B         2       3      4      5      C
  C         3       4      2      2      B
  C         3       2      1      1      A
  D         4       0      4      4      B
  D         4       0      5      5      C

If Condition=="A" , I would like to multiply columns 3-5 by vector c(1,2,3) . If Condition=="B" , I would like to multiply columns 3-5 by vector c(4,5,6) . If Condition=="C" , I would like to multiply columns 3-5 by vector c(0,1,0) . Thus, the resulting data frame will look like this:

  Treatment Species Value1 Value2 Value3 Condition
  A         1       0      0      0      A
  A         1       0      0      12     B
  B         2       1      6      12     A
  B         2       0      4      0      C
  C         3       16     10     12     B
  C         3       2      2      3      A
  D         4       0      20     24     B
  D         4       0      5      0      C

I tried subsetting the data frame and multiplying by the vector:

 t(t(subset(df[,3:5],df[,6]=="A")) * c(1,2,3))

But I cannot put the subsetted result back into the original data frame. Is there a way to perform this operation without subsetting the data frame, so that the other columns (e.g. Treatment, Species) are preserved?

+4

vector r subset multiplying

jslefche Jul 29 '11 at 20:50


4 answers

Edited to reflect some notes from comments

Assuming Condition is a factor, you can do this:

 # Modified to reflect OP edit - the same solution works just fine
 m <- matrix(c(1:6, 0, 1, 0), 3, 3, byrow = TRUE)
 df[, 3:5] <- with(df, df[, 3:5] * m[Condition, ])

This uses fairly fast vectorized multiplication. Wrapping it in with() is not strictly necessary; that is just how it came out of my head. Also note Backlin's comment below about subsetting.
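If Condition is stored as a character vector rather than a factor (the default for data.frame() since R 4.0.0), indexing m by it only works when the matrix has row names. A minimal sketch of the same lookup under that assumption; the row names and the as.character() call are additions for illustration, not part of the original answer:

```r
df <- data.frame(
  Treatment = rep(LETTERS[1:4], each = 2),
  Species   = rep(1:4, each = 2),
  Value1    = c(0, 0, 1, 3, 4, 2, 0, 0),
  Value2    = c(0, 0, 3, 4, 2, 1, 4, 5),
  Value3    = c(0, 2, 4, 5, 2, 1, 4, 5),
  Condition = c("A", "B", "A", "C", "B", "A", "B", "C")
)

# One row of multipliers per condition, labelled by condition name
m <- matrix(c(1:6, 0, 1, 0), 3, 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), NULL))

# as.character() makes the lookup go by name whether Condition
# is a character vector or a factor
df[, 3:5] <- df[, 3:5] * m[as.character(df$Condition), ]
```

Looking rows up by name also removes the dependence on the order of the factor levels.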

More generally, remember that any subsetting you can do with subset() can also be done with [ and, most importantly, that [ supports assignment via [<- . Therefore, if you want to change part of a data frame or matrix, you can always use this kind of idiom:

 df[rowCondition,colCondition] <- <replacement values>

assuming, of course, that <replacement values> has the same dimensions as the subset of df. It may work otherwise, but you will run afoul of R's recycling rules, and R may issue a warning.
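As a concrete instance of this idiom, the sketch below updates only the rows where Condition is "A". It assumes the df from the question and uses sweep() to multiply each column by its own factor; the names rows and cols are illustrative.

```r
df <- data.frame(
  Treatment = rep(LETTERS[1:4], each = 2),
  Species   = rep(1:4, each = 2),
  Value1    = c(0, 0, 1, 3, 4, 2, 0, 0),
  Value2    = c(0, 0, 3, 4, 2, 1, 4, 5),
  Value3    = c(0, 2, 4, 5, 2, 1, 4, 5),
  Condition = c("A", "B", "A", "C", "B", "A", "B", "C")
)

rows <- df$Condition == "A"  # logical row selector
cols <- 3:5                  # Value1, Value2, Value3

# sweep() multiplies column j of the subset by the j-th multiplier,
# so the replacement has the same dimensions as df[rows, cols]
df[rows, cols] <- sweep(df[rows, cols], 2, c(1, 2, 3), "*")
```

Rows with any other condition are left untouched, and Treatment and Species are never modified.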

+2

joran Jul 29 '11 at 21:07


Here's a non-vectorized but easy to understand solution:

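 replaceFunction <- function(v) {
   # apply() passes each row as a character vector, so convert the values back
   m <- as.numeric(v[3:5])
   if (v[6] == "A") {
     out <- m * c(1, 2, 3)
   } else if (v[6] == "B") {
     out <- m * c(4, 5, 6)
   } else {
     out <- m  # any other condition is left unchanged
   }
   return(out)
 }
 g <- apply(df, 1, replaceFunction)
 df[3:5] <- t(g)
 df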

+2

JD Long Jul 29 '11 at 10:41


 df[3:5] <- df[3:5] * t(sapply(df$Condition, function(x) if(x=="B") 4:6 else 1:3))

Or by vector multiplication

 df[3:5] <- df[3:5] * (3 * (df$Condition == "B") %*% matrix(1, 1, 3) + matrix(1:3, nrow(df), 3, byrow = TRUE))
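Both one-liners above only distinguish "B" from everything else. Here is a sketch of how the first variant might be extended to the three conditions in the edited question, using switch() and swapping sapply() for vapply() so the result shape is checked. The helper name multiplier_for is hypothetical, and df is assumed to be the data frame from the question:

```r
df <- data.frame(
  Treatment = rep(LETTERS[1:4], each = 2),
  Species   = rep(1:4, each = 2),
  Value1    = c(0, 0, 1, 3, 4, 2, 0, 0),
  Value2    = c(0, 0, 3, 4, 2, 1, 4, 5),
  Value3    = c(0, 2, 4, 5, 2, 1, 4, 5),
  Condition = c("A", "B", "A", "C", "B", "A", "B", "C")
)

# Hypothetical helper: map one condition label to its multiplier vector
multiplier_for <- function(x) {
  switch(as.character(x),
         A = c(1, 2, 3),
         B = c(4, 5, 6),
         C = c(0, 1, 0))
}

# vapply() returns a 3 x nrow(df) matrix; transpose it to line up with df[3:5]
mult <- t(vapply(df$Condition, multiplier_for, numeric(3)))
df[3:5] <- df[3:5] * mult
```

An unknown condition makes switch() return NULL, so vapply() stops with an error instead of silently applying the wrong multipliers.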

+1

Backlin Jul 29 '11 at 21:10


Joshua Ulrich · Accepted Answer · 2011-07-29T23:07:33+0000

Here is a fairly general solution that you should be able to adapt to your needs.

Note that the first argument in each outer call is a logical vector and the second is a numeric one, so TRUE and FALSE are converted to 1 and 0 respectively before multiplying. We can add the outer results because the conditions do not overlap and the FALSE elements will be zero.

 multiples <- outer(df$Condition == "A", c(1, 2, 3)) +
              outer(df$Condition == "B", c(4, 5, 6)) +
              outer(df$Condition == "C", c(0, 1, 0))
 df[, 3:5] <- df[, 3:5] * multiples
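To see why the sum works, here is a small sketch of what the individual outer() terms produce. The three-element vector cond is made up for illustration:

```r
cond <- c("A", "B", "A")

# The logical vector is coerced to 0/1, so rows where the test is FALSE
# become all zeros and rows where it is TRUE hold the multiplier vector
a_part <- outer(cond == "A", c(1, 2, 3))
b_part <- outer(cond == "B", c(4, 5, 6))

# Each row matches exactly one condition, so the sum picks
# that condition's vector for the row
multiples <- a_part + b_part
```

Here multiples has one row per element of cond and one column per multiplier, which is the shape needed to multiply it elementwise with the value columns.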
