Loop unrolling consists of replicating the body of a loop in order to reduce the number of iterations required to complete the loop. The UNROLL directive indicates to the compiler that the DO loop that immediately follows the directive can be unrolled.
The UNROLL, UNROLL(unroll_factor), and NOUNROLL directives take precedence over the -qunroll and -qnounroll compiler options.
>>-+-UNROLL--+---------------------+-+-------------------------><
   |         '-(--unroll_factor--)-' |
   '-NOUNROLL------------------------'
The UNROLL directive permits the compiler to perform unrolling on the specified loop, but the compiler is not required to perform this activity.
Unrolling is only allowed on the innermost DO loop, that is, a loop that contains no other DO loops. You cannot unroll loops that are introduced by implied-DO loops or the use of Fortran array language. Also, you cannot specify the UNROLL or UNROLL(unroll_factor) directives for DO WHILE loops or infinite DO loops.
The UNROLL and UNROLL(unroll_factor) directives should immediately precede an innermost DO loop.
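As a minimal sketch of the placement rule, the following hypothetical program (the name NESTEDUNROLL and the array M are illustrative only) puts the directive directly before the inner J loop of a loop nest. The outer I loop contains another DO loop, so it is not a candidate for the directive.

```fortran
PROGRAM NESTEDUNROLL
  ! Hypothetical example: names and sizes are chosen for illustration.
  INTEGER I, J
  INTEGER M(8, 4)
  DO I = 1, 4
    ! The directive line immediately precedes the innermost DO loop.
    !IBM* UNROLL(2)
    DO J = 1, 8
      M(J, I) = I + J
    END DO
  END DO
  PRINT *, M(8, 4)
END PROGRAM NESTEDUNROLL
```

A compiler that does not recognize the !IBM* trigger treats the directive line as an ordinary comment, so the program's results do not depend on whether unrolling happens.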
The UNROLL(unroll_factor) directive requests that the compiler unroll the innermost loop by a factor of unroll_factor, if unroll_factor is greater than 1. If unroll_factor is 1, unrolling is disabled.
Specifying NOUNROLL is equivalent to specifying UNROLL(1).
You cannot specify more than one UNROLL, UNROLL(unroll_factor), or NOUNROLL directive on the same DO loop.
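For illustration, the sketch below assumes a hypothetical program KEEPROLLED (the program name and the arrays S and T are not from the original) in which two loops should stay rolled. The first loop uses NOUNROLL and the second uses the equivalent UNROLL(1) form. Each loop carries exactly one directive.

```fortran
PROGRAM KEEPROLLED
  ! Hypothetical example: both loops ask the compiler not to unroll.
  INTEGER I
  REAL S(100), T(100)
  ! NOUNROLL disables unrolling for the loop that follows.
  !IBM* NOUNROLL
  DO I = 1, 100
    S(I) = REAL(I)
  END DO
  ! UNROLL(1) is the equivalent spelling of the same request.
  !IBM* UNROLL(1)
  DO I = 1, 100
    T(I) = 2.0 * S(I)
  END DO
  PRINT *, T(100)
END PROGRAM KEEPROLLED
```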
Example 1: In this example, the UNROLL(2) directive tells the compiler that the body of the loop can be replicated so that the work of two iterations is performed in a single iteration. If the compiler unrolls the loop, it performs 500 iterations instead of 1000.
   !IBM* UNROLL(2)
   DO I = 1, 1000
      A(I) = I
   END DO
If the compiler chooses to unroll the previous loop, the compiler translates the loop so that it is essentially equivalent to the following:
   DO I = 1, 1000, 2
      A(I) = I
      A(I+1) = I + 1
   END DO
Example 2: In the first DO loop, UNROLL(3) is used. If unrolling is performed, the compiler will unroll the loop so that the work of three iterations is done in a single iteration. In the second DO loop, the compiler determines how to unroll the loop for maximum performance.
   PROGRAM GOODUNROLL
   INTEGER I, X(1000)
   REAL A, B, C, TEMP, Y(1000)

   !IBM* UNROLL(3)
   DO I = 1, 1000
      X(I) = X(I) + 1
   END DO

   !IBM* UNROLL
   DO I = 1, 1000
      A = -I
      B = I + 1
      C = I + 2
      TEMP = SQRT(B*B - 4*A*C)
      Y(I) = (-B + TEMP) / (2*A)
   END DO
   END PROGRAM GOODUNROLL
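Because 1000 is not a multiple of 3, an unroll factor of 3 leaves one iteration over. The following is only a conceptual sketch of how the first loop of Example 2 could be written by hand with a factor of 3. The program name MANUALUNROLL3, the variable LAST, and the initialization of X are assumptions made for illustration, and the code the compiler actually generates may differ.

```fortran
PROGRAM MANUALUNROLL3
  ! Hypothetical hand-unrolled version; not the compiler's actual output.
  INTEGER I, LAST, X(1000)
  X = 0
  ! LAST is the largest multiple of 3 that does not exceed 1000 (999).
  LAST = (1000 / 3) * 3
  ! Main loop: each pass does the work of three original iterations.
  DO I = 1, LAST, 3
    X(I)   = X(I)   + 1
    X(I+1) = X(I+1) + 1
    X(I+2) = X(I+2) + 1
  END DO
  ! Remainder loop: handles the leftover iteration, here only I = 1000.
  DO I = LAST + 1, 1000
    X(I) = X(I) + 1
  END DO
  PRINT *, SUM(X)
END PROGRAM MANUALUNROLL3
```

Every element of X is incremented exactly once, as in the original rolled loop, so the result does not change; only the number of loop passes does.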