View Source

h1. Five Tips, Tricks and Idioms in Using the OpenMP 3.0 Tasking Feature
_by Yuan Lin, Sun Microsystems OpenMP team_
h2. 1. Use {{single}} to start parallel region with a root task

To start a parallel region that has one initial root task, use the following idiom,
{code}
#pragma omp parallel
{
#pragma omp single nowait
{
/* this is the initial root task */

#pragma omp task
{
/* this is first child task */
}

#pragma omp task
{
/* this is second child task */
}
}
}

{code}

h2. 2. Avoiding creating extra tasks by noticing a child task is executed concurrently with its parent

Given the following piece of code and assume that {{foo()}} is called within an OpenMP task. Now you want make {{A()}} and {{B()}} execute concurrently by creating more tasks, how would you do that?
{code}
void foo ()
{
A();
B();
}
{code}
You could do the following,
{code}
void foo ()
{
#pragma omp task
A();

#pragma omp task
B();
}
{code}
But the following is often more efficient,
{code}
void foo ()
{
#pragma omp task
A();

B();
}
{code}
There's no need to put {{B()}} in it's own task because it already is in the task that includes {{foo()}}.

h2. 3. Create your own _taskgroup_ construct

According to the OpenMP 3.0 specifications, "_the {{taskwait}} construct specifies a wait on the completion of child tasks generated since the beginning of the current task_". What to do if you want a point that waits on the completion of a subset of the child tasks generated instead of all child tasks generated so far?

For example, suppose we have the following piece of code.
{code}
/* Compute f2 (A, f1 (B, C))
*/
int foo ()
{
int a, b, c, x, y;

a = A();
b = B();
c = C();
x = f1(b, c);
y = f2(a, x);

return y;
}
{code}
Let's assume that {{A(), B(), C(), f1(), f2()}} can all be executed concurrently. However, because of the data flow, we cannot start executing {{f1()}} until {{B()}} and {{C()}} finish, and cannot start executing {{f2()}} until {{A()}} and {{f1()}} finish, as illustrated below.
{noformat}
A -----------------+
B ------+ |--> f2
|---> f1 --+
C ------*
{noformat}
If we create three tasks (one for {{A(), B()}} and {{C()}} each), then we want to have a point that waits for {{B()}} and {{C()}}, but not for {{A()}}. So we can start {{f1()}} after this point. In otherwords, we want a _taskgroup_ construct that allows us to do the following
{code}
/* Compute f2 (A, f1 (B, C))
*/
void foo ()
{
int a, b, c, x, y;

#pragma omp task shared(a)
a = A();

#pragma omp _taskgroup_
{
#pragma omp task shared(b)
b = B();

#pragma omp task shared(c)
c = C();
}
x = f1 (b, c);

#pragma omp taskwait

y = f2 (a, x);
}
{code}

However, OpenMP 3.0 tasking does not provide such a _taskgroup_ construct. All is not lost, however. We can simulate the _taskgroup_ construct by using a task construct with an {{if (0)}} clause and a {{taskwait}} construct as illustrated below

{code}
#pragma omp _taskgroup_ #pragma omp task if (0)
{ {
... ====> ...
#pragma omp taskwait
} }
{code}

We can now write our code like the following,
{code}
/* Compute f2 (A, f1 (B, C))
*/
void foo ()
{
int a, b, c, x, y;

#pragma omp task shared(a)
a = A();

#pragma omp task if (0) shared (b, c, x)
{
#pragma omp task shared(b)
b = B();

#pragma omp task shared(c)
c = C();

#pragma omp taskwait
}
x = f1 (b, c);

#pragma omp taskwait

y = f2 (a, x);
}
{code}


h2. 4. Do not use tasks where you can use worksharing {{for}} loops

In most programs, each of iteration of an OpenMP worksharing {{for}} loop can be executed concurrently and still gives you the correct result. Therefore, if you convert a worksharing {{for}} loop into a loop with tasks (see the following example), you will probably still get the correct result.
{code}
/* An OpenMP worksharing for loop */
#pragma omp for
for (i=0; i<n; i++) {
foo(i);
}

/* The above loop converted to use tasks */
#pragma omp single nowait
{
for (i=0; i<n; i++) {
#pragma omp task firstprivate(i)
foo(i);
}
}
{code}
But don't do that just because you can. A worksharing {{for}} loop (especially one that uses static scheduling) usually has lower overhead per iteration than a task loop.

Use a task construct for parallel _while_ loops, such as pointer chasing loops, that can not be expressed efficiently using worksharing {{for}} loops. Do not replace worksharing {{for}} loops with task loops.


h2. 5. How to do reductions in tasks?

Say you have a set of items, some of which are 'good' and the others are 'bad'. You want to find out the number of 'good' items. The set is implemented using a linked list. The sequential loop is
{code}
int count_good (item_t *item)
{
int n = 0;
while (item) {
if (is_good(item))
n ++;
item = item->next;
}
return n;
}
{code}
You can use a task loop to parallelize the while loop. How about the increment of {{n}}?

The {{atomic}} construct is probably the most efficient way of doing reductions in tasks.
{code}
int count_good (item_t *item)
{
int n = 0;
#pragma omp parallel
{
#pragma omp single nowait
{
while (item) {
#pragma omp task firstprivate(item)
{
if (is_good(item)) {
#pragma omp atomic
n ++;
}
}
item = item->next;
}
}
}
return n;
}
{code}
Or you can use a thread specific variable to hold the partial sum per-thread and then do a cross-thread reduction to get the total sum.
{code}
int count_good (item_t *item)
{
int n = 0;
int pn[P]; /* P is the number of threads used. */
#pragma omp parallel
{
pn[omp_get_thread_num()] = 0;
#pragma omp single nowait
{
while (item) {
#pragma omp task firstprivate(item)
{
if (is_good(item)) {
pn[omp_get_thread_num()] ++;
}
}
item = item->next;
}
}
#pragma omp atomic
n += pn[omp_get_thread_num()];
}
return n;
}

{code}
----

The individuals who post here are part of the extended Sun Microsystems community and they might not be employed or in any way formally affiliated with Sun Microsystems. The opinions expressed here are their own, are not necessarily reviewed in advance by anyone but the individual authors, and neither Sun nor any other party necessarily agrees with them.

Copyright 1994-2009 Sun Microsystems, Inc.
Powered by Atlassian Confluence
Sun Guidelines on Public Discourse Privacy Policy Terms of Use Trademarks Site Map Employment Investor Relations Contact