My C++ functions with Rcpp::List inputs are very slow

While C++, and especially the Rcpp package, has been extremely useful for speeding up my code, I noticed that my C++ functions that take a list or data frame as an input argument (arguments of type Rcpp::DataFrame and Rcpp::List) are very slow compared to my other C++ functions. I wrote some sample code, and I would like to ask for tricks that could make it faster:

First, let me simulate a list in R that contains two lists inside it. Consider myList, a list that includes two lists, measure1 and measure2 (meas1 and meas2 in the code). measure1 and measure2 are lists themselves, each holding one vector of measurements per subject. Here is the R code:

    lappend <- function(lst, ...){
      lst <- c(lst, list(...))
      return(lst)
    }

    nSub <- 30
    meas1 <- list()
    meas2 <- list()
    for (i in 1:nSub){
      meas1 <- lappend(meas1, rnorm(10))
      meas2 <- lappend(meas2, rnorm(10))
    }
    myList <- list(meas1 = meas1, meas2 = meas2)

Now suppose I want a C++ function that, for each subject, finds the sum of measure1 and the sum of measure2, and then creates two new measurements based on these two sums (their product and their ratio). Finally, the function should return these new measurements as a list.

    // [[Rcpp::depends(RcppArmadillo)]]
    #include <RcppArmadillo.h>
    #include <Rcpp.h>

    // [[Rcpp::export]]
    Rcpp::List mySlowListFn(Rcpp::List myList, int nSub){
        arma::vec myMult(nSub);
        arma::vec myDiv(nSub);

        for (int i = 0; i < nSub; i++){
            arma::vec meas1_i = Rcpp::as<arma::vec>(Rcpp::as<Rcpp::List>(myList["meas1"])[i]);
            arma::vec meas2_i = Rcpp::as<arma::vec>(Rcpp::as<Rcpp::List>(myList["meas2"])[i]);

            myMult[i] = arma::sum(meas1_i)*arma::sum(meas2_i);
            myDiv[i] = arma::sum(meas1_i)/arma::sum(meas2_i);
        }

        return Rcpp::List::create(Rcpp::Named("myMult") = myMult,
                                  Rcpp::Named("myDiv") = myDiv);
    }

How can I make the function above faster? I am especially looking for ideas that keep the input and output lists in the code (since dealing with lists is inevitable in my own program), but with some tricks to reduce the overhead. One thing I was thinking about was:

    Rcpp::List mySlowListFn(const Rcpp::List& myList, int nSub)

Many thanks for your help.

1 answer

First, note that the copying semantics for lists have changed in recent versions of R (definitely in the latest R-devel; I'm not sure whether the change made it into R 3.1.0): lists are now copied shallowly, and the elements inside them are copied later only if they are modified. If you are running an older version of R, there is a good chance that its more expensive deep-copy semantics are getting in the way.
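To make the difference between the two copying strategies concrete, here is a loose analogy in plain C++ (no R or Rcpp involved). It is only a sketch and not how R is implemented: `SharedList`, `shallowCopy`, `deepCopy` and `setValue` are hypothetical names, and a vector of shared pointers merely stands in for an R list.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

using Element = std::vector<double>;
// Hypothetical stand-in for an R list: a vector of shared, read-only elements.
using SharedList = std::vector<std::shared_ptr<const Element>>;

// Shallow copy: only the pointers are duplicated, not the numeric data.
SharedList shallowCopy(const SharedList& src) {
    return src;
}

// Deep copy: the data of every element is duplicated up front.
SharedList deepCopy(const SharedList& src) {
    SharedList out;
    out.reserve(src.size());
    for (const auto& el : src) {
        out.push_back(std::make_shared<const Element>(*el));
    }
    return out;
}

// Copy-on-modify: duplicate one element only at the moment it is written to.
void setValue(SharedList& lst, std::size_t i, std::size_t j, double value) {
    Element fresh(*lst[i]);  // copy just this element
    fresh[j] = value;
    lst[i] = std::make_shared<const Element>(std::move(fresh));
}
```

With the shallow strategy, the cost of a copy grows with the number of elements rather than with the total amount of data, and untouched elements stay shared between the original and the copy.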

That said, here is how I would rewrite your function for some extra speed, with a benchmark for reference. sourceCpp it to compare the two versions on your own machine.

    // [[Rcpp::depends(RcppArmadillo)]]
    #include <RcppArmadillo.h>
    #include <Rcpp.h>

    // [[Rcpp::export]]
    Rcpp::List mySlowListFn(Rcpp::List myList, int nSub){
        arma::vec myMult(nSub);
        arma::vec myDiv(nSub);

        for (int i = 0; i < nSub; i++){
            // Looks up the sub-list by name and copies the element on every iteration
            arma::vec meas1_i = Rcpp::as<arma::vec>(Rcpp::as<Rcpp::List>(myList["meas1"])[i]);
            arma::vec meas2_i = Rcpp::as<arma::vec>(Rcpp::as<Rcpp::List>(myList["meas2"])[i]);

            myMult[i] = arma::sum(meas1_i)*arma::sum(meas2_i);
            myDiv[i] = arma::sum(meas1_i)/arma::sum(meas2_i);
        }

        return Rcpp::List::create(Rcpp::Named("myMult") = myMult,
                                  Rcpp::Named("myDiv") = myDiv);
    }

    // [[Rcpp::export]]
    Rcpp::List myFasterListFn(Rcpp::List myList, int nSub) {
        // Allocate the outputs without zero-filling them first
        Rcpp::NumericVector myMult = Rcpp::no_init(nSub);
        Rcpp::NumericVector myDiv = Rcpp::no_init(nSub);

        // Look up the two sub-lists by name once, outside the loop
        Rcpp::List meas1 = myList["meas1"];
        Rcpp::List meas2 = myList["meas2"];

        for (int i = 0; i < nSub; i++) {
            // Wrap R's memory directly: copy_aux_mem = false, strict = true
            arma::vec meas1_i(
                REAL(VECTOR_ELT(meas1, i)), Rf_length(VECTOR_ELT(meas1, i)), false, true
            );
            arma::vec meas2_i(
                REAL(VECTOR_ELT(meas2, i)), Rf_length(VECTOR_ELT(meas2, i)), false, true
            );

            myMult[i] = arma::sum(meas1_i) * arma::sum(meas2_i);
            myDiv[i] = arma::sum(meas1_i) / arma::sum(meas2_i);
        }

        return Rcpp::List::create(
            Rcpp::Named("myMult") = myMult,
            Rcpp::Named("myDiv") = myDiv
        );
    }

    /*** R
    library(microbenchmark)

    lappend <- function(lst, ...){
      lst <- c(lst, list(...))
      return(lst)
    }

    nSub <- 30
    n <- 10
    meas1 <- list()
    meas2 <- list()
    for (i in 1:nSub){
      meas1 <- lappend(meas1, rnorm(n))
      meas2 <- lappend(meas2, rnorm(n))
    }
    myList <- list(meas1 = meas1, meas2 = meas2)

    x1 <- mySlowListFn(myList, nSub)
    x2 <- myFasterListFn(myList, nSub)

    microbenchmark(
      mySlowListFn(myList, nSub),
      myFasterListFn(myList, nSub)
    )
    */

This gives me:

    > library(microbenchmark)
    > lappend <- function(lst, ...){
    +   lst <- c(lst, list(...))
    +   return(lst)
    + }
    > nSub <- 30
    > n <- 10
    > meas1 <- list()
    > meas2 <- list()
    > for (i in 1:nSub){
    +   meas1 <- lappend(meas1, rnorm(n))
    +   meas2 <- lappend(meas2, rnorm(n))
    + }
    > myList <- list(meas1 = meas1, meas2 = meas2)
    > x1 <- mySlowListFn(myList, nSub)
    > x2 <- myFasterListFn(myList, nSub)
    > microbenchmark(
    +   mySlowListFn(myList, nSub),
    +   myFasterListFn(myList, nSub)
    + )
    Unit: microseconds
                             expr    min      lq  median      uq    max neval
       mySlowListFn(myList, nSub) 14.772 15.4570 16.0715 16.7520 42.628   100
     myFasterListFn(myList, nSub)  4.502  5.0675  5.2470  5.8515 18.561   100
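Judging from the code, the faster version differs from the slow one mainly in two ways: the by-name lookups of meas1 and meas2 happen once, before the loop, and each element is read in place instead of being copied into a fresh vector. The same pattern can be sketched in plain C++ without Rcpp. This is only an analogy under stated assumptions: `NestedList`, `Result`, `slowVersion` and `fasterVersion` are hypothetical names, and a `std::map` of vectors stands in for the nested R list.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <numeric>
#include <string>
#include <vector>

using Measure = std::vector<std::vector<double>>;   // one vector per subject
using NestedList = std::map<std::string, Measure>;  // stand-in for the R list

struct Result {
    std::vector<double> myMult;
    std::vector<double> myDiv;
};

// Mirrors mySlowListFn: a name lookup and a full copy on every iteration.
Result slowVersion(const NestedList& myList, std::size_t nSub) {
    Result out{std::vector<double>(nSub), std::vector<double>(nSub)};
    for (std::size_t i = 0; i < nSub; ++i) {
        std::vector<double> m1 = myList.at("meas1")[i];  // lookup + copy
        std::vector<double> m2 = myList.at("meas2")[i];  // lookup + copy
        double s1 = std::accumulate(m1.begin(), m1.end(), 0.0);
        double s2 = std::accumulate(m2.begin(), m2.end(), 0.0);
        out.myMult[i] = s1 * s2;
        out.myDiv[i] = s1 / s2;
    }
    return out;
}

// Mirrors myFasterListFn: lookups hoisted out of the loop, elements read in place.
Result fasterVersion(const NestedList& myList, std::size_t nSub) {
    Result out{std::vector<double>(nSub), std::vector<double>(nSub)};
    const Measure& meas1 = myList.at("meas1");  // looked up once
    const Measure& meas2 = myList.at("meas2");  // looked up once
    for (std::size_t i = 0; i < nSub; ++i) {
        const std::vector<double>& m1 = meas1[i];  // reference, no copy
        const std::vector<double>& m2 = meas2[i];  // reference, no copy
        double s1 = std::accumulate(m1.begin(), m1.end(), 0.0);
        double s2 = std::accumulate(m2.begin(), m2.end(), 0.0);
        out.myMult[i] = s1 * s2;
        out.myDiv[i] = s1 / s2;
    }
    return out;
}
```

Both functions return the same values; only the amount of lookup and copying work per iteration differs. How much that matters in practice depends on the number of subjects and the length of each vector, so it is worth benchmarking on your own data.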

Future versions of Rcpp and Rcpp11 will have a ListOf<T> class, which will make it much easier to interact with lists whose inner type is known beforehand, once the proper semantics have been ironed out.
