- CUDA supports only FP16 precision, on Turing devices.
- x86 supports only FP32 precision, on CPUs with AVX512/FMA.
- ONNX

Op Type | Op Set | Linux X86-64 | Linux CUDA |
---|---|---|---|
Add | 11 | ✓ | ✓ |
And | 11 | ✓ | ✓ |
ArgMax | 11 | ✓ | ✓ |
AveragePool | 11 | ✓ | ✓ |
BatchNormalization | 11 | ✓ | ✓ |
Cast | 11 | ✓ | ✓ |
Ceil | 11 | ✓ | ✓ |
Clip | 11 | ✓ | ✓ |
Concat | 11 | ✓ | ✓ |
Constant | 11 | ✓ | |
ConstantOfShape | 11 | ✓ | ✓ |
Conv | 11 | ✓ | ✓ |
ConvTranspose | 11 | ✓ | ✓ |
DepthToSpace | 11 | ✓ | ✓ |
Div | 11 | ✓ | ✓ |
Equal | 11 | ✓ | ✓ |
Exp | 11 | ✓ | ✓ |
Expand | 11 | ✓ | ✓ |
Flatten | 11 | ✓ | ✓ |
Floor | 11 | ✓ | ✓ |
Gather | 11 | ✓ | ✓ |
GatherND | 11 | ✓ | ✓ |
Gemm | 11 | ✓ | ✓ |
Greater | 11 | ✓ | ✓ |
Identity | 11 | ✓ | ✓ |
If | 13 | ✓ | ✓ |
LeakyRelu | 11 | ✓ | ✓ |
Less | 11 | ✓ | ✓ |
Log | 11 | ✓ | ✓ |
Loop | 13 | ✓ | ✓ |
MatMul | 11 | ✓ | |
Max | 11 | ✓ | ✓ |
MaxPool | 11 | ✓ | ✓ |
MaxUnpool | 11 | ✓ | ✓ |
Min | 11 | ✓ | ✓ |
Mul | 11 | ✓ | ✓ |
NonMaxSuppression | 11 | ✓ | ✓ |
NonZero | 11 | ✓ | ✓ |
Not | 11 | ✓ | ✓ |
Pad | 11 | ✓ | ✓ |
Pow | 11 | ✓ | ✓ |
ReduceMax | 11 | ✓ | ✓ |
ReduceMean | 11 | ✓ | ✓ |
ReduceMin | 11 | ✓ | ✓ |
ReduceProd | 11 | ✓ | ✓ |
ReduceSum | 11 | ✓ | ✓ |
Relu | 11 | ✓ | ✓ |
Reshape | 11 | ✓ | ✓ |
Resize | 11 | ✓ | ✓ |
RoiAlign | 11 | ✓ | ✓ |
ScatterElements | 11 | ✓ | ✓ |
ScatterND | 11 | ✓ | ✓ |
SequenceAt | 13 | ✓ | ✓ |
Shape | 11 | ✓ | ✓ |
Sigmoid | 11 | ✓ | ✓ |
Size | 11 | ✓ | ✓ |
Slice | 11 | ✓ | ✓ |
Softmax | 11 | ✓ | ✓ |
Split | 11 | ✓ | ✓ |
SplitToSequence | 13 | ✓ | ✓ |
Sqrt | 11 | ✓ | ✓ |
Squeeze | 11 | ✓ | ✓ |
Sub | 11 | ✓ | ✓ |
Sum | 11 | ✓ | ✓ |
Tanh | 11 | ✓ | ✓ |
TopK | 11 | ✓ | ✓ |
Transpose | 11 | ✓ | ✓ |
Unsqueeze | 11 | ✓ | ✓ |
Where | 11 | ✓ | ✓ |

- MMCV

Op Type | Op Set | Linux X86-64 | Linux CUDA |
---|---|---|---|
NonMaxSuppression | 1 | ✓ | ✓ |
RoiAlign | 1 | ✓ | ✓ |
grid_sample | 1 | ✓ | ✓ |

- PPL

Op Type | Op Set | Linux X86-64 | Linux CUDA |
---|---|---|---|
ChannelShuffle | 1 | ✓ | ✓ |
Shape | 1 | ✓ | ✓ |
Swish | 1 | ✓ | |
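
The tables above can be used to pre-check which ops of a model are covered on a given platform. The following is a minimal sketch, not part of any library: `SUPPORT`, `PLATFORM_INDEX`, and `unsupported_ops` are hypothetical names, only a few rows are copied by hand, and an empty cell is assumed to mean "not supported". The `(table, op type)` pairs could be collected, for example, from the `op_type` field of each node in `onnx.load(path).graph.node`.

```python
from typing import Dict, Iterable, List, Tuple

OpKey = Tuple[str, str]  # (table name, op type), e.g. ("ONNX", "Conv")

# Hypothetical hand-copied excerpt of the tables above:
# (table, op type) -> (op set, Linux X86-64, Linux CUDA).
# Assumption: an empty cell in a table means "not supported".
SUPPORT: Dict[OpKey, Tuple[int, bool, bool]] = {
    ("ONNX", "Conv"): (11, True, True),
    ("ONNX", "Constant"): (11, True, False),
    ("ONNX", "MatMul"): (11, True, False),
    ("ONNX", "If"): (13, True, True),
    ("MMCV", "RoiAlign"): (1, True, True),
    ("PPL", "ChannelShuffle"): (1, True, True),
}

# Position of each platform flag inside a SUPPORT value.
PLATFORM_INDEX = {"x86": 1, "cuda": 2}


def unsupported_ops(ops: Iterable[OpKey], platform: str) -> List[OpKey]:
    """Return ops not marked as supported on `platform`, in first-seen order.

    Ops missing from the excerpt are reported too, since nothing is known
    about them.
    """
    index = PLATFORM_INDEX[platform]
    missing: List[OpKey] = []
    for key in ops:
        row = SUPPORT.get(key)
        supported = row is not None and row[index]
        if not supported and key not in missing:
            missing.append(key)
    return missing


if __name__ == "__main__":
    model_ops = [("ONNX", "Conv"), ("ONNX", "MatMul"), ("MMCV", "RoiAlign")]
    print("x86 :", unsupported_ops(model_ops, "x86"))
    print("cuda:", unsupported_ops(model_ops, "cuda"))
```

Keying on the table name as well as the op type keeps entries such as `RoiAlign` and `Shape`, which appear in more than one table, distinct.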