Apply a function to each row of a matrix or a data frame

Suppose I have an n by 2 matrix and a function that takes a 2-vector as one of its arguments. I would like to apply the function to each row of the matrix and get an n-vector back. How can I do this in R?

For example, I would like to calculate the density of a two-dimensional standard normal distribution at three points:

    bivariate.density <- function(x = c(0, 0), mu = c(0, 0), sigma = c(1, 1), rho = 0) {
      exp(-1/(2*(1-rho^2)) * (x[1]^2/sigma[1]^2 + x[2]^2/sigma[2]^2 -
                              2*rho*x[1]*x[2]/(sigma[1]*sigma[2]))) *
        1/(2*pi*sigma[1]*sigma[2]*sqrt(1-rho^2))
    }

    out <- rbind(c(1, 2), c(3, 4), c(5, 6))

How do I apply the function to each row of out?

How do I pass values for the arguments other than the points to the function?

+80
function matrix r apply sapply
Nov 21 '10 at 3:59
6 answers

You just use the apply() function:

    R> M <- matrix(1:6, nrow=3, byrow=TRUE)
    R> M
         [,1] [,2]
    [1,]    1    2
    [2,]    3    4
    [3,]    5    6
    R> apply(M, 1, function(x) 2*x[1]+x[2])
    [1]  4 10 16
    R>

This takes a matrix and applies a (dumb) function to each row. You pass additional arguments to the function as the fourth, fifth, ... arguments of apply().
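As a minimal sketch of passing extra arguments, the example below uses a hypothetical helper, weighted_sum (not part of the original answer), whose w and offset parameters are supplied after FUN in the apply() call:

```r
# Hypothetical helper: weighted sum of a row, with an optional offset.
weighted_sum <- function(x, w, offset = 0) sum(w * x) + offset

M <- matrix(1:6, nrow = 3, byrow = TRUE)

# Everything after FUN is forwarded to weighted_sum on every call.
res_default <- apply(M, 1, weighted_sum, w = c(2, 1))
res_offset  <- apply(M, 1, weighted_sum, w = c(2, 1), offset = 10)

print(res_default)  # 4 10 16, same as the inline function above
print(res_offset)   # 14 20 26
```

Naming the extra arguments (w = ..., offset = ...) rather than relying on position keeps the call readable and avoids mixing them up.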

+129
Nov 21 '10 at 4:05

If you want to apply common functions such as sum or mean, use rowSums or rowMeans, since they are faster than apply(data, 1, sum). Otherwise, stick to apply(data, 1, fun). You can pass additional arguments after the FUN argument (as Dirk already pointed out):

    set.seed(1)
    m <- matrix(round(runif(20, 1, 5)), ncol=4)
    diag(m) <- NA
    m
         [,1] [,2] [,3] [,4]
    [1,]   NA    5    2    3
    [2,]    2   NA    2    4
    [3,]    3    4   NA    5
    [4,]    5    4    3   NA
    [5,]    2    1    4    4

Then you can do something like this:

    apply(m, 1, quantile, probs=c(.25,.5, .75), na.rm=TRUE)
        [,1] [,2] [,3] [,4] [,5]
    25%  2.5    2  3.5  3.5 1.75
    50%  3.0    2  4.0  4.0 3.00
    75%  4.0    3  4.5  4.5 4.00
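To illustrate the first point, here is a small sketch that checks that rowSums and rowMeans give the same results as the corresponding apply() calls on the same matrix. It only compares values; any speed difference would have to be measured separately.

```r
set.seed(1)
m <- matrix(round(runif(20, 1, 5)), ncol = 4)
diag(m) <- NA

# Row sums and means via apply(), forwarding na.rm to sum/mean.
sums_apply  <- apply(m, 1, sum, na.rm = TRUE)
means_apply <- apply(m, 1, mean, na.rm = TRUE)

# The dedicated helpers accept na.rm directly.
sums_fast  <- rowSums(m, na.rm = TRUE)
means_fast <- rowMeans(m, na.rm = TRUE)

print(all.equal(sums_apply, sums_fast))
print(all.equal(means_apply, means_fast))
```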
+14
Nov 21 '10 at 18:05

Here is a brief example of applying a function to each row of a matrix. (The function used here normalizes each row so that it sums to 1.)

Note: the result of apply() must be transposed with t() to get the same layout as the input matrix A, because apply() returns each row's result as a column.

    A <- matrix(c(
      0, 1, 1, 2,
      0, 0, 1, 3,
      0, 0, 1, 3
    ), nrow = 3, byrow = TRUE)

    t(apply(A, 1, function(x) x / sum(x) ))

Result:

         [,1] [,2] [,3] [,4]
    [1,]    0 0.25 0.25 0.50
    [2,]    0 0.00 0.25 0.75
    [3,]    0 0.00 0.25 0.75
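The need for t() is easiest to see from the dimensions. The sketch below reuses the matrix A from this answer and compares the shape of the raw apply() result with the transposed one:

```r
A <- matrix(c(0, 1, 1, 2,
              0, 0, 1, 3,
              0, 0, 1, 3), nrow = 3, byrow = TRUE)

# Each call returns a vector of length 4; apply() stores it as a column.
raw <- apply(A, 1, function(x) x / sum(x))
normalized <- t(raw)

print(dim(A))           # 3 4
print(dim(raw))         # 4 3: one column per input row
print(dim(normalized))  # 3 4: same layout as A
print(rowSums(normalized))
```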
+10
Nov 04 '14 at 12:37

The first step is to create a function object and then use it. If you want the result to be a matrix with the same number of rows, you can preallocate it and assign into it with the object[] form, as shown (otherwise the return value is simplified to a vector):

    bvnormdens <- function(x=c(0,0), mu=c(0,0), sigma=c(1,1), rho=0){
      exp(-1/(2*(1-rho^2))*(x[1]^2/sigma[1]^2+
                            x[2]^2/sigma[2]^2-
                            2*rho*x[1]*x[2]/(sigma[1]*sigma[2]))) *
        1/(2*pi*sigma[1]*sigma[2]*sqrt(1-rho^2))
    }

    out=rbind(c(1,2),c(3,4),c(5,6));

    bvout<-matrix(NA, ncol=1, nrow=3)
    bvout[] <-apply(out, 1, bvnormdens)
    bvout
                 [,1]
    [1,] 1.306423e-02
    [2,] 5.931153e-07
    [3,] 9.033134e-15

If you want to use parameters other than the default parameters, then the call must include named arguments after the function:

 bvout[] <-apply(out, 1, FUN=bvnormdens, mu=c(-1,1), rho=0.6) 
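One caveat: as written, bvnormdens accepts mu but never uses it in the formula, so passing mu has no effect on the result. The sketch below uses a hypothetical variant, bvnormdens_centered (an assumption about the intended density, not the answer's own code), that subtracts mu before evaluating. It also checks that with rho = 0 the result equals the product of two univariate normal densities.

```r
# Hypothetical variant: centers x on mu and scales by sigma before
# evaluating the bivariate normal density (assumed intended formula).
bvnormdens_centered <- function(x = c(0, 0), mu = c(0, 0),
                                sigma = c(1, 1), rho = 0) {
  z <- (x - mu) / sigma  # standardized coordinates
  exp(-(z[1]^2 + z[2]^2 - 2 * rho * z[1] * z[2]) / (2 * (1 - rho^2))) /
    (2 * pi * sigma[1] * sigma[2] * sqrt(1 - rho^2))
}

out <- rbind(c(1, 2), c(3, 4), c(5, 6))

# Named arguments after FUN are forwarded to every call.
dens <- apply(out, 1, FUN = bvnormdens_centered, mu = c(-1, 1), rho = 0.6)

# Sanity check for rho = 0: the density factorizes into two normals.
indep <- apply(out, 1, bvnormdens_centered, mu = c(-1, 1), sigma = c(2, 3))
check <- dnorm(out[, 1], mean = -1, sd = 2) * dnorm(out[, 2], mean = 1, sd = 3)
print(all.equal(indep, check))
```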

apply() can also be used on higher-dimensional arrays, and the MARGIN argument can be either a single integer or a vector.
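A short sketch of both forms of MARGIN on a made-up 2 x 3 x 4 array (the array itself is only for illustration):

```r
arr <- array(1:24, dim = c(2, 3, 4))

# MARGIN = 3: one sum per 2 x 3 slice along the third dimension.
slice_sums <- apply(arr, 3, sum)

# MARGIN = c(1, 2): one sum per (row, column) cell, taken over dimension 3.
cell_sums <- apply(arr, c(1, 2), sum)

print(slice_sums)      # 21 57 93 129
print(dim(cell_sums))  # 2 3
```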

+6
Nov 21 '10 at 15:01

Another approach, if you want to apply the function over a varying portion of the data set rather than a single row, is rollapply(data, width, FUN, ...) from the zoo package. Passing a vector as width lets you apply the function over a window whose size changes along the data. I used this to build an adaptive filtering routine, although it is not very efficient.

+2
Sep 21 '11 at 16:29

apply works well, but it is rather slow. sapply and vapply may help, and dplyr's rowwise can also be useful. Let's look at an example that computes the row-wise product of a data frame.

    a = data.frame(t(iris[1:10,1:3]))
    vapply(a, prod, 0)
    sapply(a, prod)
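The trick here is that transposing turns each original row into a column, so a column-wise vapply or sapply computes row-wise results. A minimal check that this matches apply() over rows (names are dropped before comparing, since the two approaches label the results differently):

```r
b <- iris[1:10, 1:3]
a <- data.frame(t(b))  # each original row becomes one column

by_apply  <- apply(b, 1, prod)
by_vapply <- vapply(a, prod, numeric(1))

print(all.equal(unname(by_apply), unname(by_vapply)))
```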

Note that assigning the data to a variable before calling vapply/sapply/apply is good practice, as it greatly reduces the run time. Compare the timings with microbenchmark:

    library(dplyr)  # provides %>%, rowwise() and summarise()

    a = data.frame(t(iris[1:10,1:3]))
    b = iris[1:10,1:3]

    microbenchmark::microbenchmark(
        apply(b, 1 , prod),
        vapply(a, prod, 0),
        sapply(a, prod),
        apply(iris[1:10,1:3], 1 , prod),
        vapply(data.frame(t(iris[1:10,1:3])), prod, 0),
        sapply(data.frame(t(iris[1:10,1:3])), prod),
        b %>% rowwise() %>% summarise(p = prod(Sepal.Length,Sepal.Width,Petal.Length))
    )

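Be careful when using t(): on a data frame it converts to a matrix first, so columns of mixed types are coerced to a single common type.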
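A small sketch of that pitfall, using a made-up two-column data frame: once a character column is present, every cell of the transposed result is a character string, so numeric functions such as prod would no longer work on it.

```r
df <- data.frame(id = c("a", "b"), value = c(1.5, 2.5))

# t() goes through as.matrix(); a matrix holds a single type,
# so the numeric column is converted to character as well.
tm <- t(df)
print(is.character(tm))

# With numeric columns only, the transposed result stays numeric.
tn <- t(iris[1:10, 1:3])
print(is.numeric(tn))
print(dim(tn))
```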

+2
May 29 '17 at 15:32


