oneDNN

latest doc

Programming Model

Basic Concepts

In essence, the oneDNN programming model consists of executing one or several primitives to process data in one or several memory objects. The execution is performed on an engine in the context of a stream.

Primitives

A primitive is a functor object that encapsulates a particular computation. Additionally, with primitive attributes, certain primitives can represent more complex fused computations.

The most important difference between a primitive and a pure function is that a primitive can store state.

One part of the primitive’s state is immutable, which allows oneDNN primitives to pre-generate code specifically tailored for the operation to be performed.
The mutable part of the primitive’s state is referred to as a scratchpad. It is a memory buffer that a primitive may use for temporary storage only during computations. The scratchpad can either be owned by a primitive object (which makes that object not thread-safe) or be an execution-time parameter.
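The split between immutable state and a scratchpad can be illustrated with a plain C++ functor. This is only a sketch under stated assumptions: `SoftmaxSketch` and its methods are hypothetical names and are not part of the oneDNN API. The problem size is fixed at construction, and the temporary buffer is passed in at execution time, so the call operator can be `const`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical functor for illustration; NOT a oneDNN class.
class SoftmaxSketch {
public:
    // The problem size is fixed here and never changes afterwards:
    // it plays the role of the immutable part of the state.
    explicit SoftmaxSketch(std::size_t size) : size_(size) {}

    // Number of floats of temporary storage one execution needs.
    std::size_t scratchpad_size() const { return size_; }

    // The scratchpad is an execution-time parameter, so the object itself
    // is not modified and each caller can bring its own buffer.
    void operator()(const std::vector<float> &src, std::vector<float> &dst,
                    std::vector<float> &scratchpad) const {
        assert(src.size() == size_ && dst.size() == size_);
        assert(scratchpad.size() >= scratchpad_size());
        float sum = 0.0f;
        for (std::size_t i = 0; i < size_; ++i) {
            scratchpad[i] = std::exp(src[i]); // temporary values only
            sum += scratchpad[i];
        }
        for (std::size_t i = 0; i < size_; ++i)
            dst[i] = scratchpad[i] / sum;
    }

private:
    const std::size_t size_; // immutable state
};
```

If the buffer were instead a member of the object, two threads calling the same functor would overwrite each other's temporaries, which is the thread-safety trade-off described above.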

Engines are abstractions of a computational device.

Streams encapsulate execution context tied to a particular engine.

Memory objects encapsulate handles to memory allocated on a specific engine.

Primitive Attributes

Post-ops

Eltwise
Sum
Depthwise
Binary

Different post-ops can be chained together by appending one after another. Note that the appending order matters: the sequence of the post operations is executed in the order of appearance. The maximum number of post operations supported in the library is 32. Moreover, the support might also depend on the actual implementation of a primitive. For instance, the library may not support post-ops for primitive reference implementations.
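Why the appending order matters can be seen in a small stand-alone sketch. It does not use the oneDNN API: `PostOpChain`, `relu`, and `add_one` are hypothetical names chosen for illustration, and the only detail taken from the text is the limit of 32 entries.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Limit quoted in the text above.
constexpr std::size_t kMaxPostOps = 32;

// Hypothetical stand-in for a post-ops chain; NOT the oneDNN API.
class PostOpChain {
public:
    void append(std::function<float(float)> op) {
        if (ops_.size() >= kMaxPostOps)
            throw std::length_error("too many post-ops");
        ops_.push_back(std::move(op));
    }

    // Apply the operations in the order they were appended.
    float apply(float x) const {
        for (const auto &op : ops_) x = op(x);
        return x;
    }

private:
    std::vector<std::function<float(float)>> ops_;
};

// Two simple element-wise operations used to show the ordering effect.
inline float relu(float x) { return x > 0.0f ? x : 0.0f; }
inline float add_one(float x) { return x + 1.0f; }
```

Appending `relu` and then `add_one` maps -2 to 1, while appending `add_one` and then `relu` maps -2 to 0, so the two chains are not interchangeable.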

Interoperability with DPC++ and OpenCL

DPC++ Interoperability

The mapping between oneDNN and SYCL objects is provided in the following table:

| oneDNN object | SYCL object(s) |
| --- | --- |
| Engine | `cl::sycl::device` and `cl::sycl::context` |
| Stream | `cl::sycl::queue` |
| Memory | `cl::sycl::buffer<uint8_t, 1>` |

API to Construct oneDNN Objects

| oneDNN object | API to construct oneDNN object |
| --- | --- |
| Engine | `dnnl::engine(kind, sycl_dev, sycl_ctx)` |
| Stream | `dnnl::stream(engine, sycl_queue)` |
| Memory | `dnnl::memory(memory_desc, engine, sycl_buf)` |

API to Access SYCL Objects

| oneDNN object | API to access SYCL object(s) |
| --- | --- |
| Engine | `dnnl::engine::get_sycl_device()` and `dnnl::engine::get_sycl_context()` |
| Stream | `dnnl::stream::get_sycl_queue()` |
| Memory | `dnnl::memory::get_sycl_buffer()` |

Advanced Topics

Understanding Memory Formats

Plain data formats

Each description below lists the dimensions from inner-most to outer-most.

NHWC: offset_nhwc(n, c, h, w) = n * HWC + h * WC + w * C + c

In this case the inner-most dimension is channels ([b:0]), followed by width ([b:1]), height ([b:2]), and finally batch ([b:3]).

NCHW: offset_nchw(n, c, h, w) = n * CHW + c * HW + h * W + w

In this case the inner-most dimension is width, followed by height, channels, and finally batch.
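The two offset functions above can be written out directly. The following is a minimal sketch; the `Dims` struct is a hypothetical helper introduced here to carry the tensor sizes and is not a oneDNN type.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: logical tensor sizes (batch, channels, height, width).
struct Dims {
    std::size_t N, C, H, W;
};

// NCHW: n * CHW + c * HW + h * W + w (width is the inner-most dimension).
inline std::size_t offset_nchw(const Dims &d, std::size_t n, std::size_t c,
                               std::size_t h, std::size_t w) {
    assert(n < d.N && c < d.C && h < d.H && w < d.W);
    return ((n * d.C + c) * d.H + h) * d.W + w;
}

// NHWC: n * HWC + h * WC + w * C + c (channels are the inner-most dimension).
inline std::size_t offset_nhwc(const Dims &d, std::size_t n, std::size_t c,
                               std::size_t h, std::size_t w) {
    assert(n < d.N && c < d.C && h < d.H && w < d.W);
    return ((n * d.H + h) * d.W + w) * d.C + c;
}
```

For a 2x3x4x5 tensor, stepping the channel index by one moves 20 elements (H * W) in NCHW but only one element in NHWC, which is what "inner-most" means in practice.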

Blocked layout

To achieve better vectorization and cache reuse, oneDNN introduces a blocked layout that splits one or several dimensions into blocks of a fixed size. The most popular oneDNN data format is nChw16c on AVX512+ systems and nChw8c on SSE4.1+ systems.

The offset function for nChw8c is:

offset_nChw8c(n, c, h, w) = n * CHW
                          + (c / 8) * HW*8
                          + h * W*8
                          + w * 8
                          + (c % 8)

Note that blocks of 8 channels are stored contiguously in memory. The spatial domain is covered pixel by pixel for the first block of 8 channels, then the next slice covers the subsequent 8 channels. Once all channel blocks are covered, the next image in the batch appears.
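The blocked offset function can be checked with a short stand-alone sketch. It assumes that C is already a multiple of 8, and it passes the sizes C, H, and W as plain parameters, which is a choice made here for illustration only.

```cpp
#include <cassert>
#include <cstddef>

// Offset of element (n, c, h, w) in the nChw8c layout.
// Assumption: C is a multiple of the block size (no padding needed).
inline std::size_t offset_nChw8c(std::size_t C, std::size_t H, std::size_t W,
                                 std::size_t n, std::size_t c, std::size_t h,
                                 std::size_t w) {
    constexpr std::size_t block = 8;
    assert(C % block == 0 && c < C && h < H && w < W);
    return n * C * H * W                  // skip whole images
         + (c / block) * H * W * block    // skip whole channel blocks
         + h * W * block                  // skip rows inside the block
         + w * block                      // skip pixels inside the row
         + (c % block);                   // position inside the 8-channel group
}
```

With C = 16, H = 2, W = 3, channels 0 to 7 of pixel (0, 0) occupy offsets 0 to 7, the next pixel starts at offset 8, and channel 8 starts at offset 48, after the whole spatial domain of the first block.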

Zero-padding: when the number of channels is not a multiple of the block size, the channels are rounded up to the next multiple of the block size and the resulting tail is padded with zeros. The actual size in bytes should therefore always be queried via dnnl::memory::desc::get_size() in C++.
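The rounding step is simple arithmetic and can be sketched as follows. `round_up` and `padded_elements` are hypothetical helper names, and the element count they produce only illustrates the idea; it is not a substitute for querying the size from the library.

```cpp
#include <cassert>
#include <cstddef>

// Round value up to the next multiple of block.
inline std::size_t round_up(std::size_t value, std::size_t block) {
    assert(block > 0);
    return (value + block - 1) / block * block;
}

// Number of elements an N x C x H x W tensor occupies when the channel
// dimension is padded to a multiple of block (illustrative arithmetic only).
inline std::size_t padded_elements(std::size_t N, std::size_t C,
                                   std::size_t H, std::size_t W,
                                   std::size_t block) {
    return N * round_up(C, block) * H * W;
}
```

For example, 17 channels with a block size of 8 are rounded up to 24, so a 1x17x2x2 tensor occupies 96 elements instead of 68.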
