#pragma unrollandfuse

Description

This pragma instructs the compiler to attempt an unroll and fuse operation on nested for loops.

Syntax

>>-#--pragma--+-nounrollandfuse------------+-------------------><
              '-unrollandfuse--(--+---+--)-'
                                  '-n-'

where n is a loop unrolling factor. In C programs, the value of n is a positive integral constant expression. In C++ programs, the value of n is a positive scalar integer or compile-time constant initialization expression. If n is not specified and if -qhot, -qsmp, or -O4 or higher is specified, the optimizer determines an appropriate unrolling factor for each nested loop.

Notes

The #pragma unrollandfuse directive applies only to the outer loops of nested for loop structures that meet the following conditions:

For loop unrolling to occur, the #pragma unrollandfuse directive must precede a for loop. You must not specify #pragma unrollandfuse for the innermost for loop.

You must not specify #pragma unrollandfuse more than once, or combine the directive with nounrollandfuse, nounroll, unroll, or stream_unroll directives for the same for loop.

Specifying #pragma nounrollandfuse instructs the compiler to not unroll that loop.
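
The placement rules above can be sketched as follows. This is a minimal illustration, not taken from the compiler documentation: the helper names scale_transposed and add_transposed and the sizes ROWS and COLS are hypothetical. A compiler that does not recognize these directives typically ignores them, possibly with a warning, so the code still compiles and behaves the same.

```c
#define ROWS 6
#define COLS 4

/* Hypothetical helper: the directive precedes the outer loop of the nest. */
void scale_transposed(int dst[COLS][ROWS], int src[ROWS][COLS], int factor)
{
    int i, j;

    /* Applies to the i loop that immediately follows. */
    #pragma unrollandfuse(2)
    for (i = 0; i < ROWS; i++) {
        /* No directive here: j is the innermost loop. */
        for (j = 0; j < COLS; j++) {
            dst[j][i] = src[i][j] * factor;
        }
    }
}

/* Hypothetical helper: the outer loop is explicitly excluded. */
void add_transposed(int dst[COLS][ROWS], int src[ROWS][COLS], int offset)
{
    int i, j;

    /* Asks the compiler not to unroll and fuse the i loop. */
    #pragma nounrollandfuse
    for (i = 0; i < ROWS; i++) {
        for (j = 0; j < COLS; j++) {
            dst[j][i] = src[i][j] + offset;
        }
    }
}
```

Each outer loop carries exactly one directive, and neither directive is attached to the innermost loop.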

Examples

  1. In the following example, a #pragma unrollandfuse directive replicates and fuses the body of the loop. This reduces the number of cache misses for array b.
    int i, j;
    int a[1000][1000];
    int b[1000][1000];
    int c[1000][1000];
    
    
    ....
    
    #pragma unrollandfuse(2)
    for (i=1; i<1000; i++) {
        for (j=1; j<1000; j++) {
            a[j][i] = b[i][j] * c[j][i];
        }
    }
    
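    The following code shows a possible result of applying the #pragma unrollandfuse(2) directive to the loop structure shown above. Because the outer loop runs an odd number of times (999), a residual loop handles the last value of i so that no element outside the arrays is accessed.
    for (i=1; i<999; i=i+2) {
        for (j=1; j<1000; j++) {
            a[j][i] = b[i][j] * c[j][i];
            a[j][i+1] = b[i+1][j] * c[j][i+1];
        }
    }
    for (j=1; j<1000; j++) {
        a[j][999] = b[999][j] * c[j][999];
    }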
    
  2. You can also specify multiple #pragma unrollandfuse directives in a nested loop structure.
    int i, j, k;
    int a[1000][1000];
    int b[1000][1000];
    int c[1000][1000];
    int d[1000][1000];
    int e[1000][1000];
    
    
    ....
    
    #pragma unrollandfuse(4)
    for (i=1; i<1000; i++) {
    #pragma unrollandfuse(2)
        for (j=1; j<1000; j++) {
            for (k=1; k<1000; k++) {
                a[j][i] = b[i][j] * c[j][i] + d[j][k] * e[i][k];
            }
        }
    }
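
The transformation in example 1 can be checked by hand on a smaller array. The sketch below is an assumption-laden illustration, not compiler output: N is a small stand-in for 1000, and the function names multiply_reference, multiply_unrolled_fused, and results_match are hypothetical. It writes the unroll and fuse by 2 manually, adds a residual loop for the case where the outer trip count is odd, and compares the result with the untransformed loop nest.

```c
#include <string.h>

/* Small stand-in for the 1000 used in example 1 (hypothetical value). */
#define N 10

/* Reference: the loop nest from example 1 without any transformation. */
void multiply_reference(int a[N][N], int b[N][N], int c[N][N])
{
    int i, j;
    for (i = 1; i < N; i++) {
        for (j = 1; j < N; j++) {
            a[j][i] = b[i][j] * c[j][i];
        }
    }
}

/* Hand-written unroll and fuse by 2 on the outer i loop. */
void multiply_unrolled_fused(int a[N][N], int b[N][N], int c[N][N])
{
    int i, j;

    /* Main loop: two consecutive i iterations share a single j loop. */
    for (i = 1; i + 1 < N; i += 2) {
        for (j = 1; j < N; j++) {
            a[j][i]     = b[i][j]     * c[j][i];
            a[j][i + 1] = b[i + 1][j] * c[j][i + 1];
        }
    }

    /* Residual loop: covers the last i when the trip count is odd. */
    for (; i < N; i++) {
        for (j = 1; j < N; j++) {
            a[j][i] = b[i][j] * c[j][i];
        }
    }
}

/* Returns 1 if both versions produce identical arrays for a test input. */
int results_match(void)
{
    int a1[N][N], a2[N][N], b[N][N], c[N][N];
    int i, j;

    memset(a1, 0, sizeof a1);
    memset(a2, 0, sizeof a2);
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            b[i][j] = i + 2 * j;
            c[i][j] = 3 * i - j;
        }
    }

    multiply_reference(a1, b, c);
    multiply_unrolled_fused(a2, b, c);
    return memcmp(a1, a2, sizeof a1) == 0;
}
```

With N set to 10 the outer loop runs 9 times, so the main loop handles i = 1 through 8 in pairs and the residual loop handles i = 9. The compiler may generate different code; the sketch only shows that the fused form computes the same values.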
    

Related information