In your specific case, you can calculate:
out <- rep(group.prevalence, times=last(dim(population))) * rep(population, each=first(dim(group.prevalence)))
and then you can set the dimensions of this array :
array(out, dim=c(2,2,2,3), dimnames=list(group=c("A","B"), age=c("young","old"), gender=c("male","female"), year=c("year1","year2","year3")))
The key is to align the sizes of the two arrays using transpose sizes and expansion / replication to fill in the missing dimensions that are in the other array. In general, the procedure is as follows:
- Define overlapping dimensions. Here
(age,gender) . - For the argument on the left side of the
group.prevalence multiplication group.prevalence move the sizes (using aperm ) so that all disjoint sizes (i.e. group ) are the first. Then repeat this array N times (using times ), where N is the size of the disjoint dimensions (i.e. year ) of the right side argument, population . - For the argument on the right side of the
population multiplication, move the sizes so that all disjoint sizes (i.e. year ) are the last. Then replicate each element of the array M times (using each ), where M is the size of the disjoint dimensions (i.e. group ) of the left side argument, group.prevalence . - Then just (an array) is multiplied, which is vectorized and fast.
- Compatible result sizes are simply the disjoint sizes of the left side argument, followed by the intersecting sizes, followed by the disjoint sizes of the right side (i.e.
(group, age, gender, year) ). You can then transfer these measurements as needed at the output to get what you want.
How to check:
# bad solution grouped.population <- array(NA, dim=c(2,2,2,3), dimnames=list(group=c("A","B"), age=c("young","old"), gender=c("male","female"), year=c("year1","year2","year3"))) for (group in c("A","B")) for(gender in c("male","female")) for (age in c("young","old")) grouped.population[group,age,gender,] <- group.prevalence[group,age,gender] * population[age,gender,] # another approach grouped.population2 <- array(rep(group.prevalence, times=last(dim(population))) * rep(population, each=first(dim(group.prevalence))), dim=c(2,2,2,3), dimnames=list(group=c("A","B"), age=c("young","old"), gender=c("male","female"), year=c("year1","year2","year3"))) # check all.equal(grouped.population,grouped.population2) ##[1] TRUE
Updated with a test:
library(microbenchmark) f1 <- function(group.prevalence, population) { grouped.population <- array(NA, dim=c(2,2,2,3), dimnames=list(group=c("A","B"), age=c("young","old"), gender=c("male","female"), year=c("year1","year2","year3"))) for (group in c("A","B")) { for(gender in c("male","female")) { for (age in c("young","old")) { grouped.population[group,age,gender,] <- group.prevalence[group,age,gender] * population[age,gender,]}}} } f2 <- function(group.prevalence, population) { grouped.population2 <- array(rep(group.prevalence, times=last(dim(population))) * rep(population, each=first(dim(group.prevalence))), dim=c(2,2,2,3), dimnames=list(group=c("A","B"), age=c("young","old"), gender=c("male","female"), year=c("year1","year2","year3"))) } print(microbenchmark(f1(group.prevalence, population))) ##Unit: microseconds ## expr min lq mean median uq max neval ## f1(group.prevalence, population) 101.473 103.998 149.2562 106.8865 115.372 1185.32 100 print(microbenchmark(f2(group.prevalence, population))) ##Unit: microseconds ## expr min lq mean median uq max neval ## f2(group.prevalence, population) 66.392 67.672 70.19873 68.454 69.4205 173.284 100
I believe that performance will diverge even more as the number of dimensions and size in each dimension increase.