Multithreading for loop while maintaining order

I started experimenting with multithreading for a CPU-intensive batch process that I am running. Essentially, I am trying to combine multiple single-page images (TIFFs) into PDF documents. This works fine with a foreach loop or standard iteration, but it can be very slow for documents of several hundred pages. I tried the following, based on some multithreading examples I found, and it gives a significant performance improvement. However, it destroys the page order: instead of 1,2,3,4 the pages come out as 1,3,4,2,6,5, depending on which thread finishes first.

My question is how to use this technique while maintaining page order, and, if that is possible, whether it will cost me the performance benefit of multithreading. Thank you in advance.

    PdfDocument doc = new PdfDocument();
    string mail = textBox1.Text;
    string[] split = mail.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);
    int counter = split.Count();

    // Source must be array or IList.
    var source = Enumerable.Range(0, 100000).ToArray();

    // Partition the entire source array.
    var rangePartitioner = Partitioner.Create(0, counter);
    double[] results = new double[counter];

    // Loop over the partitions in parallel.
    Parallel.ForEach(rangePartitioner, (range, loopState) =>
    {
        // Loop over each range element without a delegate invocation.
        for (int i = range.Item1; i < range.Item2; i++)
        {
            // Declared locally so that threads do not share the variable.
            string f_prime = split[i].Replace(" ", "");
            PdfPage page = doc.AddPage();
            XGraphics gfx = XGraphics.FromPdfPage(page);
            XImage image = XImage.FromFile(f_prime);
            double x = 0;
            gfx.DrawImage(image, x, 0);
        }
    });
multithreading c# parallel-processing
3 answers

I'm not sure the other solutions will work exactly as the asker wants. The reason is that PdfPage page = doc.AddPage(); creates the new page and appends it to the document in a single step, so the pages will always end up out of order: the order is dictated by whichever thread calls doc.AddPage() first (first come, first served).

If AddPage is a fast operation, you can create all of the pages up front, in order, without doing any processing. Then go back and draw the TIFF images onto the pages in parallel.

    PdfDocument doc = new PdfDocument();
    string mail = textBox1.Text;
    string[] split = mail.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);
    int counter = split.Count();

    // Source must be array or IList.
    var source = Enumerable.Range(0, 100000).ToArray();

    // Partition the entire source array.
    var rangePartitioner = Partitioner.Create(0, counter);
    double[] results = new double[counter];

    // Create every page sequentially first, so the page order is fixed.
    PdfPage[] pages = new PdfPage[counter];
    for (int i = 0; i < counter; ++i)
    {
        pages[i] = doc.AddPage();
    }

    // Loop over the partitions in parallel.
    Parallel.ForEach(rangePartitioner, (range, loopState) =>
    {
        // Loop over each range element without a delegate invocation.
        for (int i = range.Item1; i < range.Item2; i++)
        {
            // Declared locally so that threads do not share the variable.
            string f_prime = split[i].Replace(" ", "");
            PdfPage page = pages[i];
            XGraphics gfx = XGraphics.FromPdfPage(page);
            XImage image = XImage.FromFile(f_prime);
            double x = 0;
            gfx.DrawImage(image, x, 0);
        }
    });
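
The idea of creating the pages first and filling them in afterwards can be tried without any PDF library. The following is only a minimal sketch: FakePage, PreallocateDemo and the "rendered:" string are hypothetical stand-ins invented for illustration, and the sketch assumes (without having verified it for any real PDF library) that drawing on different pages of the same document from several threads is safe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for a PDF page; not part of any PDF library.
public sealed class FakePage
{
    public string Content { get; set; } = "";
}

public static class PreallocateDemo
{
    public static List<FakePage> Build(string[] files)
    {
        var doc = new List<FakePage>();

        // Step 1 (sequential): create the pages in the desired order.
        for (int i = 0; i < files.Length; i++)
        {
            doc.Add(new FakePage());
        }

        // Step 2 (parallel): each iteration only touches its own page,
        // so the order in which threads finish cannot change the page order.
        Parallel.For(0, files.Length, i =>
        {
            doc[i].Content = "rendered:" + files[i].Replace(" ", "");
        });

        return doc;
    }

    public static void Main()
    {
        string[] files = Enumerable.Range(1, 20)
            .Select(n => "page " + n + ".tif")
            .ToArray();

        // Pages are listed in input order, whichever thread finished first.
        foreach (FakePage page in Build(files))
        {
            Console.WriteLine(page.Content);
        }
    }
}
```

The list itself is only modified in the sequential step; the parallel step reads it by index and mutates the individual page objects, which keeps the shared collection free of concurrent writes.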

Edit

I think there is a more elegant solution, but without knowing the properties of PdfPage, I did not want to suggest it earlier. If you can determine which position a PdfPage occupies in the document, you can do it very simply:

    PdfDocument doc = new PdfDocument();
    string mail = textBox1.Text;
    string[] split = mail.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);
    int counter = split.Count();

    // Source must be array or IList.
    var source = Enumerable.Range(0, 100000).ToArray();

    // Partition the entire source array.
    var rangePartitioner = Partitioner.Create(0, counter);
    double[] results = new double[counter];

    // Loop over the partitions in parallel.
    Parallel.ForEach(rangePartitioner, (range, loopState) =>
    {
        // Loop over each range element without a delegate invocation.
        for (int i = range.Item1; i < range.Item2; i++)
        {
            PdfPage page = doc.AddPage();

            // Only use i as a loop counter, not as the index.
            int pageIndex = page.PageIndex; // This is what I don't know
            string f_prime = split[pageIndex].Replace(" ", "");
            XGraphics gfx = XGraphics.FromPdfPage(page);
            XImage image = XImage.FromFile(f_prime);
            double x = 0;
            gfx.DrawImage(image, x, 0);
        }
    });

I would just use the Parallel.ForEach overload that passes the index of the element to the loop body:

    Parallel.ForEach(rangePartitioner, (range, loopState, elementIndex) =>

Then, inside the loop, you can fill an array with the results of your work and process the results in order after all of them have completed.
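
As a rough illustration of this idea, here is a minimal sketch that runs the expensive step in parallel, stores each result in the slot given by the element index, and then assembles the output sequentially. IndexedResultsDemo and Load are hypothetical names made up for the example; Load merely stands in for the costly per-page work (for instance decoding an image), and the sketch iterates over the file names directly instead of over a range partitioner.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class IndexedResultsDemo
{
    // Hypothetical stand-in for expensive per-page work such as decoding an image.
    private static string Load(string path)
    {
        return "image:" + path.Replace(" ", "");
    }

    public static List<string> Build(string[] files)
    {
        var results = new string[files.Length];

        // Parallel phase: the third lambda parameter is the element's index
        // in the source, so every result lands in its own slot.
        Parallel.ForEach(files, (file, loopState, index) =>
        {
            results[index] = Load(file);
        });

        // Sequential phase: assemble the output strictly in index order.
        var doc = new List<string>();
        foreach (string result in results)
        {
            doc.Add(result);
        }
        return doc;
    }

    public static void Main()
    {
        string[] files = Enumerable.Range(1, 20)
            .Select(n => "page " + n + ".tif")
            .ToArray();

        foreach (string entry in Build(files))
        {
            Console.WriteLine(entry);
        }
    }
}
```

How much of the speed-up survives depends on how the work is split: only the part moved into the parallel phase benefits, while the final assembly still runs on one thread.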


Use .AsParallel().AsOrdered(), as described in this document: http://msdn.microsoft.com/en-us/library/dd460677.aspx

I think it will look something like this:

    rangePartitioner.AsParallel().AsOrdered().ForAll(range =>
    {
        // Loop over each range element without a delegate invocation.
        ...
    });
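
One caveat: ForAll invokes its delegate in parallel and does not keep the source order, even after AsOrdered(). A minimal sketch that does keep the order puts the expensive step into an ordered Select and consumes the results with an ordinary foreach. OrderedPlinqDemo and Load are hypothetical names invented for this example, and Load is only a stand-in for the real per-page work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderedPlinqDemo
{
    // Hypothetical stand-in for expensive per-page work such as decoding an image.
    private static string Load(string path)
    {
        return "image:" + path.Replace(" ", "");
    }

    public static List<string> Build(string[] files)
    {
        var doc = new List<string>();

        // Load runs on multiple threads, but AsOrdered() makes the query
        // yield its results in the order of the source array.
        ParallelQuery<string> loaded = files.AsParallel().AsOrdered().Select(Load);

        // The foreach runs on the calling thread, so the output is built
        // sequentially and in order.
        foreach (string image in loaded)
        {
            doc.Add(image);
        }
        return doc;
    }

    public static void Main()
    {
        string[] files = Enumerable.Range(1, 20)
            .Select(n => "page " + n + ".tif")
            .ToArray();

        foreach (string entry in Build(files))
        {
            Console.WriteLine(entry);
        }
    }
}
```

Order preservation has some buffering overhead, so it is worth measuring whether the ordered query is still noticeably faster than the plain loop for the actual workload.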
