Intel® FPGA SDK for OpenCL™ Pro Edition: Best Practices Guide

ID 683521
Date 12/19/2022
Public

6.1.5. Removing Loop-Carried Dependencies Caused by Accesses to Memory Arrays

Include the ivdep pragma in your single work-item kernel to assert that accesses to memory arrays do not cause loop-carried dependencies.
During compilation, the Intel® FPGA SDK for OpenCL™ Offline Compiler creates hardware that ensures load and store instructions operate within dependency constraints. For example, dependent load and store instructions must execute in order. The ivdep pragma instructs the offline compiler to remove this extra hardware between load and store instructions in the loop that immediately follows the pragma declaration in the kernel code. Removing the extra hardware might reduce logic utilization and lower the initiation interval (II) in single work-item kernels.

You can provide more information about loop dependencies by adding the safelen(N) clause to the ivdep pragma. The safelen(N) clause specifies the maximum number of consecutive loop iterations without loop-carried dependencies. For example, #pragma ivdep safelen(32) indicates to the compiler that there are a maximum of 32 iterations of the loop before loop-carried dependencies might be introduced. That is, while #pragma ivdep promises that there is no implicit memory dependency between any iterations of this loop, #pragma ivdep safelen(32) promises that the iteration that is 32 iterations away is the closest iteration that could be dependent on this iteration.
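
As a minimal sketch of a loop that fits this contract, assume a hypothetical function strided_update in which each iteration reads the element written exactly 32 iterations earlier. The function name, the distance of 32, and the + 1 update are illustrative assumptions, not taken from the SDK. The loop is written as plain C so that it also compiles on a host, where a compiler that does not know the pragma typically ignores it; in kernel code the same loop would sit inside a single work-item kernel.

```c
/* Illustrative assumption: the dependence distance of this loop is 32. */
#define DEP_DISTANCE 32

/* Hypothetical example function (not part of the SDK).
 * Iteration i reads A[i - 32], which was written by iteration i - 32.
 * No iteration closer than 32 iterations away touches the same element,
 * so the loop matches the promise made by safelen(32). */
void strided_update(int *A, int n)
{
    #pragma ivdep safelen(32)
    for (int i = DEP_DISTANCE; i < n; i++) {
        A[i] = A[i - DEP_DISTANCE] + 1;
    }
}
```

Because the closest dependent iteration is exactly 32 iterations away, a plain #pragma ivdep would promise more than this loop can deliver, whereas safelen(32) matches its dependence distance.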

  • If all accesses to memory arrays that are inside a loop do not cause loop-carried dependencies, add the line #pragma ivdep before the loop in your kernel code.
    Example kernel code:
    // No loop-carried dependencies for A and B array accesses
    #pragma ivdep
    for (int i = 0; i < N; i++) {
        A[i] = A[i - X[i]];
        B[i] = B[i - Y[i]];
    }
  • To specify that accesses to a particular memory array inside a loop do not cause loop-carried dependencies, add the line #pragma ivdep array(array_name) before the loop in your kernel code.

    The array specified by the ivdep pragma must be a local or private memory array, or a pointer variable that points to global, local, or private memory storage. If the specified array is a pointer, the ivdep pragma also applies to all arrays that may alias with the specified pointer.

    The array specified by the ivdep pragma can also be an array or a pointer member of a struct.

    Example kernel code:

    // No loop-carried dependencies for A array accesses
    // The offline compiler will insert hardware that enforces dependency constraints for B
    #pragma ivdep array(A)
    for (int i = 0; i < N; i++) {
        A[i] = A[i - X[i]];
        B[i] = B[i - Y[i]];
    }
    
    // No loop-carried dependencies for array A inside struct
    #pragma ivdep array(S.A)
    for (int i = 0; i < N; i++) {
        S.A[i] = S.A[i - X[i]];
    }
    
    // No loop-carried dependencies for array A inside the struct pointed to by S
    #pragma ivdep array(S->X[2][3].A)
    for (int i = 0; i < N; i++) {
        S->X[2][3].A[i] = S->X[2][3].A[i - X[i]];
    }
    
    // No loop-carried dependencies for A and B because ptr aliases
    // with both arrays
    int *ptr = select ? A : B;
    #pragma ivdep array(ptr)
    for (int i = 0; i < N; i++) {
        A[i] = A[i - X[i]];
        B[i] = B[i - Y[i]];
    }
    
    // No loop-carried dependencies for A because ptr only aliases with A
    int *ptr = &A[10];
    #pragma ivdep array(ptr)
    for (int i = 0; i < N; i++) {
        A[i] = A[i - X[i]];
        B[i] = B[i - Y[i]];
    }