I am trying to parallelize my code because I am currently using a double for loop to write the results. I have been looking into how to use the snow and doParallel packages in R to do this.
If you want a reproducible example, just use
residual_anomalies <- matrix(sample(c('ANOMALY','NO SIGNAL'),300,replace=T),nrow=100)
instead of these three lines
inputfile <- paste0("simulation_",i,"_",metrics[k],"_US.csv")
data <- residuals(inputfile)
residual_anomalies <- conceptdrift(data,length=10,threshold=.05)
in a nested loop. All code is below.
source("GetMetrics.R")
source("slowdrift_resampling_vectorized.R")
metrics <- unique(metrics)
num_metrics <- length(metrics)
f1_scores_table_raw = data.frame(matrix(ncol=10,nrow=46))
f1_scores_table_pred = data.frame(matrix(ncol=10,nrow=46))
rownames(f1_scores_table_raw) <- metrics
colnames(f1_scores_table_raw) <- paste0("Sim",1:10)
rownames(f1_scores_table_pred) <- metrics
colnames(f1_scores_table_pred) <- paste0("Sim",1:10)
for(k in 1:num_metrics){
  for(i in 1:10){
I tried using foreach with %dopar% on the outer loop, but I kept getting an error saying that "%dopar%" was not found. Should I parallelize both loops or only one?
I also know that foreach returns a list that I can store in a variable, but can I still store other data from inside my foreach loop? For example, I still want to write results to my f1_scores_table_raw and f1_scores_table_pred data frames.
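To make the question concrete, here is a minimal sketch of the pattern I think I need. It assumes the foreach and doParallel packages are installed and attached (as far as I can tell, the "%dopar%" error appears when foreach is not attached). The metric names, the two-worker cluster, and the "share of ANOMALY cells" score are placeholders of mine, not my real metrics or F1 calculation. My understanding is that assignments made inside a %dopar% body run on the workers and do not reach the tables in the main session, so the sketch returns one row per metric and builds the table afterwards.

```r
library(foreach)      # provides foreach() and %dopar%
library(doParallel)   # provides registerDoParallel(); also loads parallel

# Hypothetical stand-ins for the real metric names and simulation count
metrics <- c("metricA", "metricB", "metricC")
n_sims  <- 10

cl <- makeCluster(2)        # two workers, chosen arbitrarily for the sketch
registerDoParallel(cl)

# Parallelize only the outer loop; the inner loop runs serially on each worker.
# Each iteration returns one numeric vector (one row), combined with rbind.
res <- foreach(k = seq_along(metrics), .combine = rbind) %dopar% {
  sapply(seq_len(n_sims), function(i) {
    # Placeholder data from the reproducible example above
    residual_anomalies <- matrix(
      sample(c("ANOMALY", "NO SIGNAL"), 300, replace = TRUE), nrow = 100)
    # Placeholder score (NOT the real F1 score): share of ANOMALY cells
    mean(residual_anomalies == "ANOMALY")
  })
}

stopCluster(cl)

# Build the results table in the main session from the returned rows
f1_scores_table_raw <- as.data.frame(res)
rownames(f1_scores_table_raw) <- metrics
colnames(f1_scores_table_raw) <- paste0("Sim", seq_len(n_sims))
```

If both tables are needed, I assume each iteration could return a list with one row for each table, which would then be split after the loop.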
Thanks!