This document describes the definitions of the following operators:
- BatchNorm
- Concat
- ConstOp
- Convolution
- Deconvolution
- Detection_output
- Dropout
- Eltwise
- Flatten
- Fully_connected
- LRN
- Input_op
- Normalize
- Permute
- Pooling
- Priorbox
- PReLu
- Region
- Reorg
- ReLu
- Reshape
- RoiPooling
- RPN
- Scale
- Slice
- Softmax
The BatchNorm operator carries out batch normalization for the inference phase only.
Inputs:
- input: float32, the input 4-dimensional tensor of shape NCHW
- gamma: float32
- beta: float32, the bias as a 1-dimensional tensor of size C (channel)
- mean: float32, the estimated mean as a 1-dimensional tensor of size C (channel)
- var: float32, the estimated variance as a 1-dimensional tensor of size C (channel)
Outputs:
- output: float32
Parameters:
- caffe_flavor: int, whether to use the Caffe version of batch normalization. Defaults to 1.
- rescale_factor: float32
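
As a rough illustration of what inference-mode batch normalization computes, the sketch below applies the usual per-channel formula y = gamma * (x - mean) / sqrt(var + eps) + beta to a nested-list NCHW tensor. The function name and the `eps` term are assumptions made for this example, and the sketch ignores `caffe_flavor` and `rescale_factor`, whose exact effect is not described here.

```python
import math

def batch_norm_inference(x, gamma, beta, mean, var, eps=1e-5):
    """x is a nested list of shape [N][C][H][W]; the other arguments have length C.

    eps is an assumed stabilizer, not a documented parameter.
    """
    out = []
    for sample in x:
        out_sample = []
        for c, plane in enumerate(sample):
            # Fold the four per-channel statistics into one multiply and one add.
            scale = gamma[c] / math.sqrt(var[c] + eps)
            shift = beta[c] - mean[c] * scale
            out_sample.append([[v * scale + shift for v in row] for row in plane])
        out.append(out_sample)
    return out
```

Folding the statistics into a single scale and shift per channel is a common choice for inference, because the statistics are fixed and the per-element work reduces to one multiply-add.
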
The Concat operator concatenates a list of input tensors into a single output tensor.
Inputs:
- input: float32
Outputs:
- output: float32
Parameters:
- axis: int, the axis to concatenate along. Defaults to 1.
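
To make the role of `axis` concrete, here is a minimal sketch of concatenation on nested lists. The function name is hypothetical, and the sketch assumes all inputs agree on every dimension except the one being concatenated.

```python
def concat(tensors, axis=1):
    """Concatenate nested-list tensors along the given axis."""
    if axis == 0:
        # Axis 0: simply chain the outermost lists.
        return [item for t in tensors for item in t]
    # Otherwise walk one level down and concatenate there.
    return [concat([t[i] for t in tensors], axis - 1)
            for i in range(len(tensors[0]))]
```

With the default `axis=1` on NCHW data, this stacks the channels of each input one after another within each batch element.
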
The ConstOp operator produces a constant tensor.
Inputs:
None
Outputs:
- output: float32
The Convolution operator computes the convolution of an input tensor with a filter kernel.
Inputs:
- input: float32
- weight: float32
- bias: float32
Outputs:
- output: float32
Parameters:
- kernel_h: int, kernel height
- kernel_w: int, kernel width
- stride_h: int, stride height
- stride_w: int, stride width
- pad_h: int, pad height
- pad_w: int, pad width
- dilation_h: int
- dilation_w: int
- output_channel: int, number of output channels (number of kernels)
- group: int
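
The kernel, stride, pad and dilation parameters together determine the spatial size of the output. The sketch below uses the conventional floor-based formula applied independently to height and width; the function name is hypothetical, and whether this operator rounds exactly this way is an assumption.

```python
def conv_output_size(in_size, kernel, stride=1, pad=0, dilation=1):
    """Output length along one spatial axis (conventional formula, assumed here)."""
    # A dilated kernel covers dilation * (kernel - 1) + 1 input positions.
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + 2 * pad - effective_kernel) // stride + 1
```

For example, a 224-pixel axis with `kernel=7`, `stride=2`, `pad=3` yields 112 output positions under this formula.
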
The Deconvolution operator multiplies each input value by the kernel elementwise and sums the results over the overlapping output windows.
Inputs:
- input: float32
- weight: float32
- bias: float32
Outputs:
- output: float32
Parameters:
- kernel_size: int
- stride: int, stride size
- pad: int, pad size
- num_output: int, number of output channels (number of kernels)
- dilation: int
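
Because each input value is spread over a kernel-sized output window, a deconvolution usually enlarges its input. The sketch below gives the conventional output size along one axis; the function name is hypothetical and the formula is the commonly used one, assumed rather than confirmed for this operator.

```python
def deconv_output_size(in_size, kernel_size, stride=1, pad=0, dilation=1):
    """Output length along one spatial axis (conventional formula, assumed here)."""
    effective_kernel = dilation * (kernel_size - 1) + 1
    # Windows start every `stride` positions; padding trims both ends.
    return stride * (in_size - 1) + effective_kernel - 2 * pad
```
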
The Detection_output operator is used in SSD detection networks.
Inputs:
- input: float32
Outputs:
- output: float32, the coordinates of the detected boxes and the corresponding confidences for each class
Parameters:
- num_classes: int, number of classes in the detection benchmark (21 for VOC and 81 for COCO)
- confidence_threshold: float
- nms_threshold: float
- keep_top_k: int, number of top-scoring results to keep after NMS
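
The three threshold parameters typically interact as in the greedy non-maximum suppression sketch below, shown for a single class. The function names and the (xmin, ymin, xmax, ymax) box layout are assumptions for illustration, and the operator's actual box decoding step is not shown.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, confidence_threshold, nms_threshold, keep_top_k):
    """Return the indices of the kept boxes, best score first."""
    # Drop low-confidence boxes, then visit the rest from best to worst.
    candidates = sorted(
        (i for i, s in enumerate(scores) if s > confidence_threshold),
        key=lambda i: scores[i], reverse=True)
    kept = []
    for i in candidates:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in kept):
            kept.append(i)
    return kept[:keep_top_k]
```
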
In the inference phase, the Dropout operator is the identity: Y = X.
Inputs:
- input: float32
Outputs:
- output: float32
The Eltwise operator computes elementwise operations, such as max or sum, across multiple input tensors.
Inputs:
- input1: float32
- input2: float32
Outputs:
- output: float32
Parameters:
- type: enum (MAX, SUM)
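
A minimal sketch of the two documented modes, written for flattened inputs of equal length. The function name and the use of strings for the enum values are assumptions made for this example.

```python
def eltwise(a, b, op="SUM"):
    """Combine two equally sized flat tensors element by element."""
    if op == "SUM":
        return [x + y for x, y in zip(a, b)]
    if op == "MAX":
        return [max(x, y) for x, y in zip(a, b)]
    raise ValueError("unsupported eltwise type: " + op)
```
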
The Flatten operator is a utility op that flattens an input of shape [n, c, h, w] to a simple vector output of shape [n, c*h*w, 1, 1].
Inputs:
- input1: float32
Outputs:
- output: float32
Parameters:
- axis: int
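
The shape change can be sketched on nested lists as follows. The function name is hypothetical, and the sketch always flattens from the channel axis onward, which is an assumption about how `axis` is normally set.

```python
def flatten_nchw(x):
    """Map a nested list of shape [n][c][h][w] to shape [n][c*h*w][1][1]."""
    return [[[[v]] for plane in sample for row in plane for v in row]
            for sample in x]
```

The element order is unchanged: values are read channel by channel, then row by row, so only the shape differs.
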
The Fully_connected operator computes X*W+b, with X as the input, W as the weight and b as the bias.
Inputs:
- input: float32
- weight: float32
- bias: float32
Outputs:
- output: float32
Parameters:
- num_output: int, number of outputs, which is the size of the bias
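
The expression X*W+b can be sketched directly on nested lists. The function name is hypothetical, and the sketch assumes W is laid out as [input_size][num_output]; some frameworks store the transpose, so the real memory layout may differ.

```python
def fully_connected(x, w, b):
    """x: [batch][k], w: [k][num_output] (assumed layout), b: [num_output]."""
    return [[sum(row[k] * w[k][j] for k in range(len(w))) + b[j]
             for j in range(len(b))]
            for row in x]
```
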
The Input_op operator feeds data into the network.
Inputs:
None
Outputs:
- output: float32
The Local Response Normalization (LRN) operator normalizes over local input regions.
Inputs:
- input: float32
Outputs:
- output: float32
Parameters:
- local_size: int
- norm_region: int
- alpha: float
- beta: float
- k: float
The Normalize operator normalizes the input along the channel axis with L2 normalization. It is used in SSD detection networks.
Inputs:
- input: float32
- scale: float32
Outputs:
- output: float32
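
L2 normalization along the channel axis divides every value by the norm of the channel vector at its spatial position. The sketch below does this for one sample; the function name, the per-channel interpretation of `scale` and the `eps` term are assumptions for illustration.

```python
import math

def l2_normalize_channels(x, scale, eps=1e-10):
    """x has shape [C][H][W]; scale is assumed to hold one factor per channel."""
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for h in range(height):
        for w in range(width):
            # Norm of the channel vector at this spatial position.
            norm = math.sqrt(sum(x[c][h][w] ** 2 for c in range(channels)) + eps)
            for c in range(channels):
                out[c][h][w] = scale[c] * x[c][h][w] / norm
    return out
```
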
The Permute operator permutes the dimensions of the input in a specified order. It is used in SSD detection networks.
Inputs:
- input: float32
Outputs:
- output: float32
Parameters:
- order0: int
- order1: int
- order2: int
- order3: int
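
The sketch below shows one plausible reading of the four order parameters: output axis i takes its extent and data from input axis order[i]. The function name is hypothetical and this reading is an assumption, although it matches the common NCHW to NHWC use case with order (0, 2, 3, 1).

```python
from itertools import product

def permute4d(x, order):
    """Permute the axes of a 4-D nested list; output axis i is input axis order[i]."""
    in_shape = [len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])]
    out_shape = [in_shape[axis] for axis in order]
    out = [[[[None] * out_shape[3] for _ in range(out_shape[2])]
            for _ in range(out_shape[1])] for _ in range(out_shape[0])]
    for idx in product(*(range(n) for n in out_shape)):
        # Translate the output index back into input coordinates.
        src = [0, 0, 0, 0]
        for out_axis, in_axis in enumerate(order):
            src[in_axis] = idx[out_axis]
        out[idx[0]][idx[1]][idx[2]][idx[3]] = x[src[0]][src[1]][src[2]][src[3]]
    return out
```
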
The Pooling operator takes an input tensor and applies pooling according to the kernel sizes, stride sizes, pad sizes and pooling type.
Inputs:
- input: float32
Outputs:
- output: float32
Parameters:
- alg: enum, pooling type: kPoolMax, kPoolAvg
- global: int, whether to use global pooling
- caffe_flavor: int
- kernel_shape: a list of int, the size of the kernel along each axis (H, W)
- strides: a list of int, the stride along each axis (H, W)
- pads: a list of int, the zero padding for each axis (x1_begin, x2_begin, ..., x1_end, x2_end, ...). For an input of shape NCHW, pads is (pad_top, pad_left, pad_bottom, pad_right).
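
The output size along each axis follows from `kernel_shape`, `strides` and `pads`. The sketch below computes it for one axis; the function name and the `ceil_mode` flag are hypothetical. Frameworks differ on whether the division rounds down or up, and which rounding this operator uses is not stated here, so both are shown.

```python
def pool_output_size(in_size, kernel, stride, pad_begin=0, pad_end=0, ceil_mode=False):
    """Output length along one axis; ceil_mode is an assumed, illustrative switch."""
    span = in_size + pad_begin + pad_end - kernel
    if ceil_mode:
        # Ceiling division written with integer arithmetic only.
        return -(-span // stride) + 1
    return span // stride + 1
```

For a 112-pixel axis with a 3-wide kernel and stride 2, floor rounding gives 55 outputs and ceiling rounding gives 56, which is why the rounding convention matters when matching shapes between frameworks.
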
The Priorbox operator computes the prior boxes for an SSD (Single Shot Detection) network. It computes the prior boxes according to the original image size or a specific image size defined by the proto, as well as other parameters such as the max box size, min box size and box aspect ratio.
Inputs:
- input: float32
- image_width: int32
- image_height: int32
Outputs:
- output: float32
Parameters:
- min_size: float32
- max_size: float32
- variance: float32
- aspect_ratio: float32
- flip: int32
- clip: int32
- img_size: int32
- img_h: int32
- img_w: int32
PReLu (Parametric Rectified Linear Unit) takes one input tensor and produces one output tensor through y_i = max(0, x_i) + slope_i * min(0, x_i), where slope_i is the slope for the negative part.
Inputs:
- input: float32
- slope: float32
Outputs:
- output: float32
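
The formula translates almost literally into code. The sketch below assumes one slope per channel and a [C][L] layout with the spatial positions flattened; both the function name and that layout are assumptions for illustration.

```python
def prelu(x, slope):
    """x has shape [C][L]; slope is assumed to hold one value per channel."""
    return [[max(0.0, v) + slope[c] * min(0.0, v) for v in channel]
            for c, channel in enumerate(x)]
```
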
The Region operator is used in YOLO networks. It is a post-processing step that rescales the network output into [0, 1] in order to compute the final detection boxes.
Inputs:
- input: float32
Outputs:
- output: float32
Parameters:
- num_classes: int32
- side: int32
- num_box: int32
- coords: int32
- confidence_threshold: float32
- nms_threshold: float32
- biases: float32
The Reorg operator is used in YOLO networks. It reorganizes the data according to the stride.
Inputs:
- input: float32
Outputs:
- output: float32
Parameters:
- stride: int32
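
In YOLO-style networks, reorganizing by a stride usually means trading spatial resolution for channels. The sketch below only computes the resulting shape under that assumption; the function name is hypothetical and the exact element ordering used by the operator is not covered.

```python
def reorg_output_shape(shape, stride):
    """Shape after moving stride x stride spatial blocks into channels (assumed behavior)."""
    n, c, h, w = shape
    if h % stride or w % stride:
        raise ValueError("height and width must be divisible by stride")
    return (n, c * stride * stride, h // stride, w // stride)
```

The total number of elements is preserved, so the operation rearranges data without discarding any of it.
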
The Reshape operator can be used to change the dimensions of its input, without changing its data.
Inputs:
- input: float32
Outputs:
- output: float32
Parameters:
- dims: a list of int32
ReLu takes one input tensor and produces one output tensor where the rectified linear function, y = max(0, x), is applied to the tensor elementwise.
Inputs:
- input: float32
Outputs:
- output: float32
Parameters:
- negative_slope: float. ReLu with negative_slope = 0.1 is also called leaky activation.
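
The effect of `negative_slope` can be sketched as below for a flat list of values. The function name is hypothetical, and the default of 0.0, which reduces to plain y = max(0, x), is an assumption since no default is documented.

```python
def relu(x, negative_slope=0.0):
    """Rectified linear unit; negative inputs are scaled by negative_slope."""
    return [max(v, 0.0) + negative_slope * min(v, 0.0) for v in x]
```

Calling `relu(values, negative_slope=0.1)` gives the leaky variant mentioned above.
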
The Region Proposal Network (RPN) operator is used in Faster-RCNN networks. It generates proposal anchors.
Inputs:
- input0: float32, score tensor
- input1: float32, feature map tensor
Outputs:
- output: float32
Parameters:
- nms_thresh: float32
- post_nms_topn: int, postprocess_nms_topn
- per_nms_topn: int, preprocess_nms_topn
- min_size: int
- basesize: int
- feat_stride: int
- anchor_scales: a list of anchor scales
- ratios: a list of ratios
The RoiPooling operator is used in Faster-RCNN networks. It performs max pooling on regions of interest (ROIs) specified by the input.
Inputs:
- input0: float32, the [N x C x H x W] feature maps on which pooling is performed
- input1: float32, an [R x 4] tensor containing a list of R ROIs with 4 coordinates each
Outputs:
- output: float32
Parameters:
- pooled_h: int, the pooled output height
- pooled_w: int, the pooled output width
- spatial_scale: float, multiplicative spatial scale factor to translate ROI coordinates from their input scale to the scale used when pooling
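
The role of `spatial_scale` is easiest to see on a single ROI. The sketch below maps image-space coordinates onto the feature map; the function name, the (x1, y1, x2, y2) layout and rounding to the nearest integer are assumptions for illustration.

```python
import math

def scale_roi(roi, spatial_scale):
    """Map an (x1, y1, x2, y2) ROI from image scale to feature-map scale."""
    # Round half up; the operator's actual rounding rule is assumed.
    return tuple(int(math.floor(v * spatial_scale + 0.5)) for v in roi)
```

With a feature map that is 16 times smaller than the image, `spatial_scale` would be 1/16, and the scaled region is then divided into a `pooled_h` by `pooled_w` grid for max pooling.
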
The Scale operator scales the input: Y = gamma*X + beta, with beta as an optional bias.
Inputs:
- input: float32
- gamma: float32
- beta: float32
Outputs:
- output: float32
Parameters:
- axis: int, the axis at which to coerce the input into 2D. Defaults to 1.
- num_axes: int, defaults to 1
- bias_term: int, defaults to 0
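
A minimal sketch of the scaling with the default `axis` of 1, so that gamma and beta hold one value per channel. The function name and the [C][L] layout are assumptions, and passing `beta=None` stands in for a disabled `bias_term`.

```python
def scale_channels(x, gamma, beta=None):
    """x has shape [C][L]; gamma and optional beta hold one value per channel."""
    return [[gamma[c] * v + (beta[c] if beta is not None else 0.0)
             for v in channel]
            for c, channel in enumerate(x)]
```
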
The Slice operator takes an input and slices it along either the num or channel dimension, outputting multiple sliced tensors.
Inputs:
- input: float32
Outputs:
- output1: float32
- output2: float32
Parameters:
- axis: int, the axis to slice along. Defaults to 1 (channel).
Softmax computes the softmax normalized values. The output tensor has the same shape as the input.
Inputs:
- input: float32
Outputs:
- output: float32
Parameters:
- axis: int, the axis at which to coerce the input into 2D. Defaults to 1.
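
For one row of the 2D view, softmax can be sketched as below. The function name is hypothetical, and subtracting the maximum before exponentiating is a standard numerical-stability step rather than something stated in this document.

```python
import math

def softmax(values):
    """Softmax over a flat list of scores."""
    peak = max(values)
    # Shifting by the maximum avoids overflow without changing the result.
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]
```

The outputs are positive and sum to 1, so they can be read as per-class probabilities.
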