This class implements IndRNN from this article.

It's a simple recurrent unit with the following formula:

```c++
Y_t = activation( W * X_t + B + U * Y_t-1 )
```

where:

- `W` and `B` are the weight matrix and free terms of the fully-connected layer, respectively (`W * X_t` means matrix-by-vector multiplication)
- `U` is a vector of recurrent weights (`U * Y_t-1` means element-wise multiplication of two vectors of the same length)
- `activation` is an activation function (`sigmoid` or `ReLU`)
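The recurrence above can be sketched in plain Python. This is a minimal illustration of the formula, not the NeoML API; the helper name `indrnn_step` is made up for this example:

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def indrnn_step(W, B, U, x_t, y_prev, activation=sigmoid):
    """One IndRNN step: Y_t = activation( W * x_t + B + U (x) y_prev ).

    W      : hidden_size x input_size matrix (list of rows)
    B, U   : vectors of length hidden_size
    x_t    : input vector of length input_size
    y_prev : previous output, length hidden_size
    """
    return [
        # matrix-by-vector product row i, plus bias, plus element-wise recurrent term
        activation(sum(w * x for w, x in zip(W[i], x_t)) + B[i] + U[i] * y_prev[i])
        for i in range(len(U))
    ]
```

Note that, unlike a classic RNN, each hidden unit `i` depends only on its own previous value `y_prev[i]`, scaled by the single weight `U[i]`.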
### Hidden layer size

```c++
void SetHiddenSize( int size );
```

Sets the hidden layer size. It affects the output size.
```c++
void SetDropoutRate( float dropoutRate );
```

Sets the dropout rate applied to both the input (`X_t`) and the recurrent part (`Y_t-1`).
```c++
void SetReverseSequence( bool reverse );
```

If this flag is set, the elements of the sequences are processed in reverse order.
```c++
void SetActivation( TActivationFunction activation );
```

Sets the activation function used in the recurrent part. `AF_Sigmoid` by default.
```c++
CPtr<CDnnBlob> GetInputWeights() const;
```

The weight matrix `W` from the formula. It has the following shape:

- `BatchLength * BatchWidth * ListSize` is equal to `GetHiddenSize()`
- `Height * Width * Depth * Channels` is equal to the product of the same dimensions of the input
```c++
CPtr<CDnnBlob> GetRecurrentWeights() const;
```

The vector of recurrent weights `U` from the formula. It's represented by a blob of total size `GetHiddenSize()`.
```c++
CPtr<CDnnBlob> GetBias() const;
```

The free terms `B` from the formula. They are represented by a blob of total size `GetHiddenSize()`.
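To make the shapes above concrete, here is a small arithmetic check with hypothetical sizes (both numbers are made up for illustration):

```python
hidden_size = 16       # value returned by GetHiddenSize() (hypothetical)
input_vector_size = 8  # product of Height * Width * Depth * Channels of the input (hypothetical)

# W is a hidden_size x input_vector_size matrix
input_weights_count = hidden_size * input_vector_size

# U and B are both vectors of length hidden_size
recurrent_weights_count = hidden_size
bias_count = hidden_size
```

So `GetInputWeights()` would hold `16 * 8 = 128` elements, while `GetRecurrentWeights()` and `GetBias()` would hold `16` elements each.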
The single input of this layer accepts the set of vector sequences of the following shape:

- `BatchLength` - the length of one vector sequence
- `BatchWidth * ListSize` - the number of vector sequences in the input set
- `Height * Width * Depth * Channels` - the size of each vector in the sequence
The single output returns a blob of the following size:

- `BatchLength`, `BatchWidth`, and `ListSize` are equal to the same sizes of the first input
- `Height`, `Width`, and `Depth` are equal to `1`
- `Channels` is equal to `GetHiddenSize()`
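The input/output relationship can be sketched by running the recurrence over a whole sequence. This is a plain-Python illustration, not the NeoML API; it shows that the output keeps the sequence length (`BatchLength`) while each output vector has `GetHiddenSize()` elements, and how the reverse-sequence flag changes the processing order:

```python
def relu(v):
    return max(0.0, v)


def indrnn_forward(W, B, U, sequence, activation=relu, reverse=False):
    """Runs IndRNN over `sequence` (BatchLength input vectors).

    Returns BatchLength output vectors, each of length len(U),
    i.e. the output Channels dimension equals the hidden size.
    """
    if reverse:
        sequence = list(reversed(sequence))
    hidden = len(U)
    y = [0.0] * hidden  # initial state is zero
    outputs = []
    for x_t in sequence:
        y = [
            activation(sum(w * x for w, x in zip(W[i], x_t)) + B[i] + U[i] * y[i])
            for i in range(hidden)
        ]
        outputs.append(y)
    if reverse:
        outputs.reverse()  # restore the original element order
    return outputs
```

For example, with `W = [[1.0]]`, `B = [0.0]`, `U = [0.5]` and the input sequence `[[1.0], [1.0]]`, the outputs are `[1.0]` and then `relu(1.0 + 0.5 * 1.0) = [1.5]`.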