{section:border=true}
{column:width=70%}
h2. Loop Collapse in OpenMP 3.0
The COLLAPSE clause can appear on a DO (Fortran) or for (C/C++) loop directive. It parallelizes perfectly nested loops without resorting to nested parallelism. The clause takes a constant positive integer expression as its parameter, which specifies how many loops of the nest the compiler collapses into a single loop. The iterations of that single loop are then divided among the threads.
The COLLAPSE clause is described in Section 2.5.1 (p. 39) of the OpenMP 3.0 specification.
*Example (Fortran):*
{noformat}
      program test
      use omp_lib
      integer, parameter :: N = 10
      integer i, j, x
      call omp_set_dynamic (.false.)
      call omp_set_num_threads (4)
      x = 10
!$OMP PARALLEL SHARED(x) PRIVATE(i, j)
!$OMP DO COLLAPSE(2)
      do i = 1, N
         do j = 1, N
!$OMP CRITICAL
            x = x + 1
!$OMP END CRITICAL
         end do
      end do
!$OMP END PARALLEL
      print *, "x = ", x
      end
% f90 -xopenmp -xO3 test.f
% a.out
x = 110
{noformat}
*Example (C/C++):*
{noformat}
#include <stdio.h>
#include <omp.h>
#define N 10
int main(void)
{
    int i, j, x;
    omp_set_dynamic(0);
    omp_set_num_threads(4);
    x = 10;
    #pragma omp parallel shared(x) private(i, j)
    {
        #pragma omp for collapse(2)
        for (i = 0; i < N; i = i + 2)
        {
            for (j = 0; j < N; j = j + 2)
            {
                #pragma omp atomic
                x++;
            }
        }
    }
    printf ("x = %d\n", x);
}
% cc -xopenmp -xO3 test.c
% a.out
x = 35
{noformat}
{column}
{column:width=30%}
{panel:title=OpenMP 3.0 Features in Express 7.08}
{children:page=Sun Studio OpenMP}
{panel}
{column}
{section}