Introduction
I have to say that much of this problem is hidden in how the arrays are set up. The rest of the problem is trivial. As a result, there are really two ways to go about it:
- Brute force, as proposed by @Alex (written in C++)
- Observing the replication pattern
Brute force with OpenMP
If we want to go the brute-force route, we can follow @Alex's suggestion and use OpenMP with Armadillo:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]

// Add a flag to enable OpenMP at compile time
// [[Rcpp::plugins(openmp)]]

// Protect against compilers without OpenMP
#ifdef _OPENMP
#include <omp.h>
#endif
Replication patterns
However, a more reasonable approach is to understand how array(0:1, dims) is constructed: R recycles 0:1 to fill the array in column-major order.
Most noticeably:
- Case 1: If xdim is even, then only the rows of a matrix alternate; every column and every matrix is identical.
- Case 2: If xdim is odd and ydim is odd, then the rows alternate and the matrices alternate as well.
- Case 3: If xdim is odd and ydim is even, then only the rows alternate; every matrix is identical.
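The three cases follow from the fill order. Because 0:1 is recycled in column-major order, the element at zero-based position (i, j, k) is the parity of its linear index i + xdim*j + xdim*ydim*k. Here is a minimal sketch in plain C++; the names fill_value, Pattern and classify are hypothetical and only serve the illustration.

```cpp
#include <cstddef>

// Value that array(0:1, dim = c(xdim, ydim, tdim)) holds at the zero-based
// position (i, j, k). R recycles 0:1 in column-major order, so the value is
// the parity of the linear index.
inline int fill_value(std::size_t i, std::size_t j, std::size_t k,
                      std::size_t xdim, std::size_t ydim) {
  return static_cast<int>((i + xdim * j + xdim * ydim * k) % 2);
}

// Hypothetical labels for the three cases listed above.
enum class Pattern { RowsOnly, CheckerboardFlipping, CheckerboardFixed };

inline Pattern classify(std::size_t xdim, std::size_t ydim) {
  // xdim even: xdim*j and xdim*ydim*k are even, so the value is i % 2.
  if (xdim % 2 == 0) return Pattern::RowsOnly;              // Case 1
  // Both odd: the value is (i + j + k) % 2, so each slice is inverted.
  if (ydim % 2 == 1) return Pattern::CheckerboardFlipping;  // Case 2
  // xdim odd, ydim even: the value is (i + j) % 2 in every slice.
  return Pattern::CheckerboardFixed;                        // Case 3
}
```

The parity of xdim and ydim alone decides the pattern; tdim only determines how many slices there are.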
Examples
Let's see the cases in action to observe the patterns.
Case 1:
xdim <- 2
ydim <- 3
tdim <- 2
a <- array(0:1, dim = c(xdim, ydim, tdim))
Output:
, , 1

     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    1    1    1

, , 2

     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    1    1    1
Case 2:
xdim <- 3
ydim <- 3
tdim <- 3
a <- array(0:1, dim = c(xdim, ydim, tdim))
Output:
, , 1

     [,1] [,2] [,3]
[1,]    0    1    0
[2,]    1    0    1
[3,]    0    1    0

, , 2

     [,1] [,2] [,3]
[1,]    1    0    1
[2,]    0    1    0
[3,]    1    0    1

, , 3

     [,1] [,2] [,3]
[1,]    0    1    0
[2,]    1    0    1
[3,]    0    1    0
Case 3:
xdim <- 3
ydim <- 4
tdim <- 2
a <- array(0:1, dim = c(xdim, ydim, tdim))
Output:
, , 1

     [,1] [,2] [,3] [,4]
[1,]    0    1    0    1
[2,]    1    0    1    0
[3,]    0    1    0    1

, , 2

     [,1] [,2] [,3] [,4]
[1,]    0    1    0    1
[2,]    1    0    1    0
[3,]    0    1    0    1
Pattern hacking
Alrighty, based on the discussion above, let's write some code that exploits this pattern.
Create alternating vectors
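An alternating vector, in this case, switches between two different values, here 0 and 1. The matrix builders in the next section rely on two helpers, even_vec and odd_vec, which return such vectors.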
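The helpers even_vec and odd_vec are called by the matrix builders but are not defined in this section. Below is a minimal stand-in, assuming (judging by the outputs above) that even_vec starts at 0 and odd_vec starts at 1. It uses std::vector<double> in place of arma::vec so that the sketch compiles without Armadillo, and alternating_vec is a hypothetical helper introduced only here.

```cpp
#include <cstddef>
#include <vector>

// Alternating 0/1 vector of length n whose first element is `first`
// (expected to be 0 or 1). Hypothetical helper, not from the original code.
inline std::vector<double> alternating_vec(std::size_t n, std::size_t first) {
  std::vector<double> v(n);
  for (std::size_t i = 0; i < n; ++i) {
    v[i] = static_cast<double>((first + i) % 2);
  }
  return v;
}

// Assumed behaviour: 0, 1, 0, 1, ...
inline std::vector<double> even_vec(std::size_t n) {
  return alternating_vec(n, 0);
}

// Assumed behaviour: 1, 0, 1, 0, ...
inline std::vector<double> odd_vec(std::size_t n) {
  return alternating_vec(n, 1);
}
```

With Armadillo, the same loop can fill an arma::vec of length xdim instead.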
Creating Three Matrix Cases
As mentioned above, there are three matrix cases: the even case, the first odd case, and the second odd case.
// --- Handle the different cases

// [[Rcpp::export]]
arma::mat make_even_matrix(unsigned int xdim, unsigned int ydim){
  arma::mat temp_mat(xdim,ydim);
  temp_mat.each_col() = even_vec(xdim);
  return temp_mat;
}

// xdim is odd and ydim is even
// [[Rcpp::export]]
arma::mat make_odd_matrix_case1(unsigned int xdim, unsigned int ydim){
  arma::mat temp_mat(xdim,ydim);
  arma::vec e_vec = even_vec(xdim);
  arma::vec o_vec = odd_vec(xdim);

  // Alternating column
  for (unsigned int i = 0; i < ydim; i++) {
    temp_mat.col(i) = (i % 2 ? o_vec : e_vec);
  }
  return temp_mat;
}

// xdim is odd and ydim is odd
// [[Rcpp::export]]
arma::mat make_odd_matrix_case2(unsigned int xdim, unsigned int ydim){
  arma::mat temp_mat(xdim,ydim);
  arma::vec e_vec = even_vec(xdim);
  arma::vec o_vec = odd_vec(xdim);

  // Alternating column
  for (unsigned int i = 0; i < ydim; i++) {
    temp_mat.col(i) = (i % 2 ? e_vec : o_vec); // slight change
  }
  return temp_mat;
}
Calculation engine
Same as the previous solution, just without t, since we no longer need to repeat the calculation for every slice.
// --- Calculation engine

// [[Rcpp::export]]
arma::mat calc_matrix(const arma::mat& temp_mat){
  unsigned int xdim = temp_mat.n_rows;
  unsigned int ydim = temp_mat.n_cols;
  arma::mat res = temp_mat;

  // Subset the rows; `x + 2 < xdim` avoids unsigned underflow when xdim < 2
  for (unsigned int x = 2; x + 2 < xdim; x++){
    arma::mat temp_row_sub = temp_mat.rows(x-2, x+2);

    // Iterate over the columns, summing each 5x5 window
    for (unsigned int y = 2; y + 2 < ydim; y++){
      res(x,y) = arma::accu(temp_row_sub.cols(y-2,y+2));
    }
  }
  return res;
}
Main call function
Here is the main function that pieces everything together. It gives us the desired cube.
// --- Main Engine
// Create the desired cube information
// [[Rcpp::export]]
arma::cube dim_to_cube(unsigned int xdim = 4, unsigned int ydim = 4, unsigned int tdim = 3) {
  // Initialize values in A
  arma::cube res(xdim,ydim,tdim);

  if(xdim % 2 == 0){
    res.each_slice() = calc_matrix(make_even_matrix(xdim, ydim));
  }else{
    if(ydim % 2 == 0){
      res.each_slice() = calc_matrix(make_odd_matrix_case1(xdim, ydim));
    }else{
      arma::mat first_odd_mat = calc_matrix(make_odd_matrix_case1(xdim, ydim));
      arma::mat sec_odd_mat = calc_matrix(make_odd_matrix_case2(xdim, ydim));

      for(unsigned int t = 0; t < tdim; t++){
        res.slice(t) = (t % 2 ? sec_odd_mat : first_odd_mat);
      }
    }
  }
  return res;
}
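The branching in dim_to_cube gets away with computing at most two matrices because slice t of the input equals slice t mod 2 when both xdim and ydim are odd, and equals slice 0 otherwise. The sketch below checks that claim by brute force in plain C++. The names slice_value and slices_repeat are hypothetical, and the only assumption is that array(0:1, dims) recycles 0:1 in column-major order.

```cpp
#include <cstddef>

// Value of array(0:1, dim = c(xdim, ydim, tdim)) at zero-based (i, j, k),
// assuming R's column-major recycling of 0:1.
inline int slice_value(std::size_t i, std::size_t j, std::size_t k,
                       std::size_t xdim, std::size_t ydim) {
  return static_cast<int>((i + xdim * j + xdim * ydim * k) % 2);
}

// True when every slice k matches the cached slice the branching would pick:
// slice k % 2 if both xdim and ydim are odd, otherwise slice 0.
inline bool slices_repeat(std::size_t xdim, std::size_t ydim,
                          std::size_t tdim) {
  const bool both_odd = (xdim % 2 == 1) && (ydim % 2 == 1);
  for (std::size_t k = 0; k < tdim; ++k) {
    const std::size_t cached = both_odd ? k % 2 : 0;
    for (std::size_t j = 0; j < ydim; ++j) {
      for (std::size_t i = 0; i < xdim; ++i) {
        if (slice_value(i, j, k, xdim, ydim) !=
            slice_value(i, j, cached, xdim, ydim)) {
          return false;
        }
      }
    }
  }
  return true;
}
```

Since the per-slice calculation depends only on the slice contents, identical input slices give identical result slices, which is what makes the cached approach valid.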
Timing
Now for the real question: how well does this perform?
Unit: microseconds
      expr      min        lq       mean    median        uq       max neval
   r_1core 3538.022 3825.8105 4301.84107 3957.3765 4043.0085 16856.865   100
alex_1core 2790.515 2984.7180 3461.11021 3076.9265 3189.7890 15371.406   100
 cpp_1core  174.508  180.7190  197.29728  194.1480  204.8875   338.510   100
 cpp_2core  111.960  116.0040  126.34508  122.7375  136.2285   162.279   100
 cpp_3core   81.619   88.4485  104.54602   94.8735  108.5515   204.979   100
 cpp_cache   40.637   44.3440   55.08915   52.1030   60.2290   302.306   100
Script used for timing:
# Set up the inputs first so the script runs from a clean session
xdim <- 20
ydim <- 20
tdim <- 5
a <- array(0:1, dim = c(xdim, ydim, tdim))
res <- array(0:1, dim = c(xdim, ydim, tdim))

# Check that every implementation agrees with the original R version
cpp_parallel <- cube_parallel(a, res, 1)
alex_1core <- alex(a, res, xdim, ydim, tdim)
cpp_cache <- dim_to_cube(xdim, ydim, tdim)
op_answer <- cube_r(a, res, xdim, ydim, tdim)

all.equal(cpp_parallel, op_answer)
all.equal(cpp_cache, op_answer)
all.equal(alex_1core, op_answer)

# Benchmark
ga <- microbenchmark::microbenchmark(
  r_1core = cube_r(a, res, xdim, ydim, tdim),
  alex_1core = alex(a, res, xdim, ydim, tdim),
  cpp_1core = cube_parallel(a, res, 1),
  cpp_2core = cube_parallel(a, res, 2),
  cpp_3core = cube_parallel(a, res, 3),
  cpp_cache = dim_to_cube(xdim, ydim, tdim)
)