OpenMP parallelization by recursive function

Question

OpenMP parallelization by recursive function

I am trying to use parallelization to increase the refresh rate for drawing a 3D scene with hierarchically ordered objects. The scene drawing algorithm first recursively traverses the tree of objects, and from there it builds an ordered array of important data needed to draw the scene. Then it traverses this array several times to draw objects / overlays, etc. Since from what I read, OpenGL is not a thread safe API, I assume that the traversal / drawing code of the array should be executed in the main thread, but I think I can parallelize the recursive function filling the array. The key is that the array must be filled in the order in which the objects meet in the scene, so all the functionality that associates the given object with the array index must be executed in the correct order, but after the array index has been assigned, I can fill data from this array element (which is not necessarily a trivial operation) using workflows. So here is the pseudo code I'm trying to get to. I hope you understand the syntax of the xml-ish stream.

recursivepopulatearray(theobject) { <main thread> for each child of theobject { assign array index <child thread(s)> populate array element for child object </child thread(s)> recursivepopulatearray(childobject) } </main thread> }

So, can this be done using OpenMP, and if so, how? Are there other parallelization libraries that can do this better?

Addendum: In response to a Davide request for more explanation , let me explain a little more. Say the scene is organized like this:

 -Bicycle Frame - Handle Bars - Front Wheel - Back Wheel -Car Frame - Front Left Wheel - Front Right Wheel - Back Left Wheel - Back Right Wheel

Now each of these objects has a lot of data associated with it, that is, location, rotation, size, various drawing parameters, etc. In addition, I need to make several passes over this scene in order to draw correctly. One passage draws shapes of objects, another passage draws text describing objects, another passage draws connections / associations between objects, if any. In any case, getting all the drawing data from these different objects is pretty slow if I have to access it several times, so I decided to use one pass to cache all this data in a one-dimensional array, and then all the actual drawing passes just look at an array. The trick is that since I need to do OpenGL, it spreads / pops up in the correct order, the array should be in the correct depth search order, which is representative of the tree hierarchy. In the above example, the array should be ordered as follows:

 index 0: Bicycle Frame index 1: Handle Bars index 2: Front Wheel index 3: Back Wheel index 4: Car Frame index 5: Front Left Wheel index 6: Front Right Wheel index 7: Back Left Wheel index 8: Back Right Wheel

So, the ordering of the array must be correctly serialized, but as soon as I correctly assigned this ordering, I can parallelize the filling of the array. For example, once I have assigned a Bicycle Frame for indexing 0 and Handle Bars for index 1, one thread can accept an array element fill for a Bicycle Frame, while another accepts an array element fill for Handle Bars.

Well, I think that, clarifying this, I answered my question, so I thanked David. So I posted my own answer .

+6

c ++ openmp drawing

Anthony johnson May 07, '09 at 17:06

source share

4 answers

I think you should better clarify your question (for example, what exactly needs to be done in series and why)

OpenMP (like many other parallelization libraries) does not guarantee the order in which different parallel sections will be executed, and since they are really parallel (on a multi-core machine), there may be race conditions if different sections write the same data. If this is normal for your problem, you can use it.

+1

Davide May 07, '09 at 17:24

source share

As gbjbaanb mentioned , you can do it easily - it just requires a pragma statement.

However, there are a few things to keep in mind:

Firstly, you mentioned that the order here is crutial. If you need to maintain ordering while aligning the hierarchical structure, parallelization (at this level) will be problematic. You are likely to lose your order completely.

In addition, when parallelizing recursive functions, many problems arise. Take an extreme case - let's say you have a dual-core machine, and you have a tree where each “parent” node has 4 children. If the tree is deep, you very, very quickly “redistribute” the problem, as a rule, do something worse and not better, performance is wise.

If you are going to do this, you should probably set the level parameter and only parallelize the first couple of levels. Take my example with 4 parents per parent, if you parallelize the first 2 levels, you already break it into 16 parallel pieces (called 4 parallel blocks).

From what you mentioned, I would leave this series of the series and concentrate instead of the second where you mention:

"Then it goes through this array several times to draw objects / overlays, etc."

It sounds like the perfect place to parallelize.

+1

Reed copsey May 07, '09 at 17:30

source share

to parallelize the child thread, just put the pragma before the loop:

 #pragma omp parallel for for (i=0; i < elements; i++) { }

The task is completed.

Now you are absolutely right, you cannot get any library of threads to make one bit before the other in a completely parallel way (obviously!), And openMP does not have a “lock” or “wait” function (it does have the keyword “wait for all to finish "- Barrier), which is not intended to emulate a stream library, but it allows you to store the values" outside "of the parallel section and mark certain sections as" only single-threaded "(hardened keyword), so this can help you assign indexes to parallel loop in while other threads assign items.

See the getting started guide .

If you use Visual C ++, you also need to set the / omp flag in the compiler build settings.

0

gbjbaanb May 07, '09 at 17:17

source share

Anthony johnson · Accepted Answer · 2009-05-07T19:49:06+0000

Here is a modified piece of pseudocode that should work.

 populatearray(thescene) { recursivepopulatearray(thescene) #pragma omp parallel for for each element in array populate array element based on associated object } recursivepopulatearray(theobject) { for each childobject in theobject { assign array index and associate element with childobject recursivepopulatearray(childobject) } }

OpenMP parallelization by recursive function

More articles: