Intel® FPGA SDK for OpenCL™ Pro Edition: Programming Guide

ID 683846
Date 12/19/2022
Public
Document Table of Contents

6.1.5. Requirement for Multiple Command Queues to Execute Kernels Concurrently

To execute kernels within the same OpenCL program object concurrently, instantiate a separate in-order command queue for each kernel you want to run concurrently. This can also be achieved by using a single out-of-order command queue and enqueuing independent kernels (no dependencies between each other).

A single in-order command queue can only dispatch a single operation for execution. Subsequent operations are not dispatched until the previous operation is fully complete. Thus, to execute kernels within the same OpenCL program object concurrently, instantiate a separate command queue for each kernel you want to run concurrently.

Similarly, multiple in-order command queues are also required to concurrently execute different transfers (clEnqueueReadBuffer or clEnqueueWriteBuffer), including executing transfers concurrently with kernels. For example, to execute kernel A concurrently with kernel B as well as concurrently with a clEnqueueWriteBuffer transfer, you must create three command queues and enqueue each of the operations in a separate queue. Achieving this in a steady state leads to maximum utilization of the FPGA device.

Out-of-order command queues may also be used to launch buffer writes, reads, and kernel execution concurrently. Enqueueing events that do not have any dependencies on one another into an out-of-order queue schedules them to execute concurrently, with the exception that these events are not blocked by their other dependencies.