MPI matrix-vector multiplication sometimes returns correct values, sometimes garbage

I have the following code:

    //Start MPI...
    MPI_Init(&argc, &argv);
    int size = atoi(argv[1]);
    int delta = 10;
    int rnk;
    int p;
    int root = 0;
    MPI_Status mystatus;
    MPI_Comm_rank(MPI_COMM_WORLD, &rnk);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    //Checking compatibility of size and number of processors
    assert(size % p == 0);

    //Initialize vector...
    double *vector = NULL;
    vector = malloc(size*sizeof(double));
    double *matrix = NULL;

    //Rank 0 -----------------------------------
    if (rnk == 0) {
        //Initialize vector...
        srand(1);
        for (int i = 0; i < size; i++) {
            vector[i] = rand() % delta + 1;
        }
        printf("Initial vector:");
        print_vector(vector, size);

        //Initialize matrix...
        matrix = malloc(size*size*sizeof(double));
        srand(2);
        for (int i = 0; i < (size*size); i++) {
            matrix[i] = rand() % delta + 1;
        }
        //Print matrix...
        printf("Initial matrix:");
        print_flat_matrix(matrix, size);
    }

    //Calculating chunk_size...
    int chunk_size = size/p;

    //Initialize submatrix..
    double *submatrix = malloc(size*chunk_size*sizeof(double));

    //Initialize result vector...
    double *result = malloc(chunk_size*sizeof(double));

    //Broadcasting vector...
    MPI_Bcast(vector, size, MPI_DOUBLE, root, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);

    //Scattering matrix...
    MPI_Scatter(matrix, (size*chunk_size), MPI_DOUBLE, submatrix, (size*chunk_size), MPI_DOUBLE, root, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);

    printf("I am rank %d and first element of my vector is: %f and of my matrix1: %f/matrix2: %f/matrix3: %f/matrix4: %f\n", rnk, vector[0], submatrix[0], submatrix[1], submatrix[2], submatrix[3]);

    //Calculating...
    for (int i = 0; i < chunk_size; i++) {
        for (int j = 0; j < size; j++) {
            result[i] += (submatrix[(i*size)+j] * vector[j]);
            printf("Rank %d; current result: %f, ", rnk, result[i]);
        }
        printf("\n");
        printf("Rank %d; result: %f...\n", rnk, result[i]);
    }
    printf("Rank: %d; first result: %f\n", rnk, result[0]);

    double *final_result = NULL;
    //Rank 0 -----------------------------------
    if (rnk == 0) {
        final_result = malloc(size*sizeof(double));
    }

    //Gather...
    MPI_Gather(result, chunk_size, MPI_DOUBLE, final_result, chunk_size, MPI_DOUBLE, root, MPI_COMM_WORLD);

    //Rank 0 -----------------------------------
    if (rnk == 0) {
        printf("Final result:\n");
        print_vector(final_result, size);
        free(matrix);
        free(final_result);
    }

    free(submatrix);
    free(result);
    free(vector);
    MPI_Finalize();

When I run the program, it finishes without errors, but the values I print at the end are not always correct. Sometimes I get a vector with the correct output, sometimes it is only partially correct, and sometimes it is completely wrong. The incorrect values are either simply wrong, or exactly 2, or a very long meaningless string of digits (which makes me suspect an invalid memory access, but I can't find one, and it's also strange that it sometimes works anyway).

I also always choose my size to match the number of processes MPI creates. MPI creates 4 processes on my machine (a checked and verified value), so for testing my algorithm I always pick 4 as the size. The same problem occurs with larger sizes.

Looking forward to your help, thanks in advance!

PS: I'm working in C.

1 answer

Do you know valgrind? It will point you straight at the problem line.
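With MPI you can run each rank under its own valgrind instance. A minimal invocation, assuming 4 ranks and that your binary is called ./a.out with the size passed as its argument, would be something like:

    mpirun -np 4 valgrind --track-origins=yes ./a.out 4

The --track-origins=yes option makes valgrind report where each uninitialized value originally came from, which leads you directly to the allocation below.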

Your problem looks like this:

 result[i] += (submatrix[(i*size)+j] * vector[j]); 

What was result[] initialized to? It came straight off the heap via malloc, so its contents are indeterminate. Sometimes, if you're lucky, it will happen to be zero. Don't count on luck in C.

There are many ways to initialize an array. Here are a few, listed roughly from most to least likely to be optimized:

Allocate result[] with calloc, which zeroes the memory for you:

    double *result = calloc(chunk_size, sizeof(double));

Or initialize the array with memset:

    double *result = malloc(chunk_size * sizeof(double));
    memset(result, 0, chunk_size * sizeof(double));

Or iterate over the array yourself:

    for (int i = 0; i < chunk_size; i++) result[i] = 0.0;
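Put together, a minimal sketch of the corrected compute section, using the calloc variant and the variable names from your code, would look like this:

    //Allocate the per-rank result zero-initialized, so += accumulates from 0.0
    double *result = calloc(chunk_size, sizeof(double));

    //Each rank multiplies its chunk_size rows of the matrix by the full vector
    for (int i = 0; i < chunk_size; i++) {
        for (int j = 0; j < size; j++) {
            result[i] += submatrix[i*size + j] * vector[j];
        }
    }

With the accumulator guaranteed to start at zero, the MPI_Gather at the end collects deterministic partial results, which explains why your output was only sometimes correct before: it depended on whatever garbage happened to be in the malloc'd block.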
