Parallel output using MPI IO in a single file

I have a very simple task, but for some reason I am still stuck.

I have one BIG data file ("File_initial.dat") that must be read by all nodes in the cluster (using MPI). Each node will perform some manipulations on its part of this BIG file (File_size / number_of_nodes) and, finally, each node will write its result to one common BIG file ("File_final.dat"). The number of elements in the file remains unchanged.

  • From googling, I gathered that it is much better to write the data file as a binary file (it contains only decimal numbers) rather than as a .txt file, because no person will ever read this file, only computers.

  • I tried to implement this myself (using formatted I/O and NOT a binary file), but I get the wrong behavior.

My code so far follows:

    #include <mpi.h>
    #include <fstream>
    #include <iostream>
    using namespace std;

    #define NNN 30

    int main(int argc, char **argv)
    {
        ifstream fin;
        double res[NNN];   // contents of the initial file

        // setting MPI environment
        int rank, nprocs;
        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        // reading the initial file
        fin.open("initial.txt");
        for (int i = 0; i < NNN; i++)
        {
            fin >> res[i];
            cout << res[i] << endl; // to see what I have in the file
        }
        fin.close();

        // starting position in the "res" array as a function of the "rank" of the process
        int Pstart = (NNN / nprocs) * rank;

        // specifying the offset for writing to the file
        MPI_Offset offset = sizeof(double)*rank;
        MPI_File file;
        MPI_Status status;

        // opening one shared file
        MPI_File_open(MPI_COMM_WORLD, "final.txt", MPI_MODE_CREATE|MPI_MODE_WRONLY, MPI_INFO_NULL, &file);

        // setting up the local array for each node
        double * localArray;
        localArray = new double [NNN/nprocs];

        // performing some basic manipulation (squaring each element of the array)
        for (int i = 0; i < (NNN / nprocs); i++)
        {
            localArray[i] = res[Pstart+i]*res[Pstart+i];
        }

        // writing the result of each local array to the shared final file
        MPI_File_seek(file, offset, MPI_SEEK_SET);
        MPI_File_write(file, localArray, sizeof(double), MPI_DOUBLE, &status);
        MPI_File_close(&file);

        MPI_Finalize();
        return 0;
    }

I understand that I am doing something wrong in trying to write doubles to a text file.

How should I change the code so that it saves the output

  • as a .txt file (formatted output)
  • as a .dat file (binary file)
1 answer

Your binary output is almost right, but your calculations for the file offset and for the amount of data to write are incorrect. You want your offset to be

 MPI_Offset offset = sizeof(double)*Pstart; 

not

 MPI_Offset offset = sizeof(double)*rank; 

otherwise, the ranks will overwrite each other's data: with nprocs = 5, for example, rank 3 starts writing at double number 3 in the file rather than at double number (30/5) * 3 = 18.

Also, you want each rank to write NNN/nprocs doubles, not sizeof(double) doubles, which means you want

 MPI_File_write(file, localArray, NNN/nprocs, MPI_DOUBLE, &status); 
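To make the two corrections concrete, here is a minimal sketch of the arithmetic on its own, assuming (as in the question) that NNN divides evenly by nprocs. The names `WritePlan` and `plan_for_rank` are hypothetical and not part of the original code, and a plain `long long` stands in for `MPI_Offset` so that the snippet compiles without MPI.

```cpp
#include <cstddef>

// Hypothetical helper type: where a rank starts writing and how much it writes.
struct WritePlan {
    long long offset_bytes;  // stands in for MPI_Offset
    int count;               // number of doubles this rank writes
};

// Compute the byte offset and element count for one rank.
// Assumption: n is divisible by nprocs (e.g. n = 30, nprocs = 5).
WritePlan plan_for_rank(int n, int nprocs, int rank) {
    const int count = n / nprocs;     // doubles per rank
    const int pstart = count * rank;  // index of this rank's first double
    return {static_cast<long long>(sizeof(double)) * pstart, count};
}
```

For n = 30, nprocs = 5 and rank 3 this gives a count of 6 and an offset of 18 doubles. In the MPI program, these two values would be what you pass to MPI_File_seek and MPI_File_write (or to MPI_File_write_at, which takes the offset directly).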

Writing a text file is a much bigger problem: you have to convert the data to strings internally and then write out those strings, using careful formatting so that you know exactly how many characters each line requires. This is described in another answer on this site.
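As a rough illustration of the fixed-width idea, the sketch below formats each double into a record of exactly 25 characters (24 for the number plus a newline), so that the character offset of any rank's data can be computed from its starting index. The record width, the "%24.16e" format, and the names `kRecordWidth`, `format_chunk` and `text_offset` are assumptions made for this example, not something taken from the question.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Assumed record layout: one value per line, right-aligned in 24 characters,
// followed by a newline. "%24.16e" is wide enough for any finite double
// (sign, 17 significant digits, and an exponent of up to three digits).
constexpr int kRecordWidth = 25;

// Convert a rank's local values into one string of fixed-width records.
std::string format_chunk(const std::vector<double>& values) {
    std::string out;
    out.reserve(values.size() * kRecordWidth);
    char buf[64];
    for (double v : values) {
        std::snprintf(buf, sizeof(buf), "%24.16e\n", v);
        out.append(buf, kRecordWidth);  // exactly one record per value
    }
    return out;
}

// Character offset in the shared text file of the record at first_index
// (long long stands in for MPI_Offset).
long long text_offset(int first_index) {
    return static_cast<long long>(first_index) * kRecordWidth;
}
```

Each rank would then write its string as MPI_CHAR data, with the string's length as the count and `text_offset(Pstart)` as the offset. Because every record has the same length, no rank needs to know what the others are writing.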
