2015xli/Caffe_LSTM
Simple LSTM network tutorial code with Caffe for MNIST

Many people run into confusion when implementing an LSTM network with Caffe. The Python code here uses MNIST as an example to show how to use Caffe's LSTM layer. The simple network reaches 96% validation accuracy in 4 epochs. To run it, invoke "train.py" from the command line.

The key confusion usually comes from the data layout that Caffe's LSTM requires: the input shape for a batch is (time_step, batch_size, data_shape), not (batch_size, time_step, data_shape). In this tutorial the original data shape of an image for Caffe is (1, 28, 28). When it is split into a sequence of steps, say 7, the new data shape becomes (1, 7, 4, 28) or (7, 1, 4, 28), i.e. 7 steps of sub-image data of shape (1, 4, 28). With a batch_size of 300, the vanilla input shape is (300, 7, 1, 4, 28); for Caffe's LSTM it must be (7, 300, 1, 4, 28), so that at every time step 300 sub-images of shape (1, 4, 28) are fed into the network. Since the sub-images of each image must enter the network in order, the (300, 7, 1, 4, 28) array cannot simply be reshaped with np.reshape into (7, 300, 1, 4, 28); use np.transpose(1, 0, 2, 3, 4) or np.swapaxes(0, 1) instead.
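A minimal NumPy sketch of this reordering (the shapes match the description above; the array is random stand-in data rather than real MNIST images):

```python
import numpy as np

# A batch of 300 images, each already split into 7 time steps of shape
# (1, 4, 28): the "vanilla" batch-major layout (batch, time, sub-image).
batch_size, time_steps = 300, 7
batch = np.random.rand(batch_size, time_steps, 1, 4, 28).astype(np.float32)

# Caffe's LSTM wants time-major input: (time_steps, batch_size, ...).
# A plain reshape would scramble the step order across images, so swap
# the first two axes instead.
lstm_input = batch.swapaxes(0, 1)   # same as np.transpose(batch, (1, 0, 2, 3, 4))
assert lstm_input.shape == (time_steps, batch_size, 1, 4, 28)

# The per-image step order is preserved: step t of image i is unchanged.
i, t = 5, 3
assert np.array_equal(batch[i, t], lstm_input[t, i])
```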

Other questions about using Caffe's LSTM, such as how to handle its output and how to connect it to other layers, are also answered by the tutorial code.
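For instance, one common pattern for turning the per-step LSTM output, of shape (time_step, batch_size, hidden), into a per-image prediction is to slice off the last time step and attach a fully connected layer and a loss. The sketch below expresses that with Caffe's Python net specification; the helper name, the blob names (n.lstm, n.label) and the Slice/InnerProduct/SoftmaxWithLoss head are illustrative assumptions, not necessarily what train.py does.

```python
from caffe import layers as L

def add_classifier_head(n, time_steps, num_classes=10):
    """Attach a classifier to n.lstm, whose output is (time_steps, N, hidden).

    Hypothetical helper: keeps only the final time step and classifies it.
    Assumes n already defines n.lstm and an n.label blob of shape (1, N).
    """
    # Split along the time axis: steps before the last one, and the last step.
    n.lstm_earlier, n.lstm_last = L.Slice(
        n.lstm, ntop=2, slice_param=dict(axis=0, slice_point=[time_steps - 1]))
    # The earlier steps are not used by the loss; silence them explicitly.
    n.silence = L.Silence(n.lstm_earlier, ntop=0)
    # n.lstm_last has shape (1, N, hidden); classify along the last axis.
    n.score = L.InnerProduct(n.lstm_last, num_output=num_classes, axis=2)
    n.loss = L.SoftmaxWithLoss(n.score, n.label, softmax_param=dict(axis=2))
    return n
```

Combined with the input and LSTM layers from the generation sketch below, calling add_classifier_head(n, TIME_STEPS) before writing the prototxt would complete a trainable network.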

Note that the network model prototxt file is generated at runtime and saved to the current directory. You can use the saved file directly, at the cost of a little flexibility, since the model is tied to a specific input data shape. The code generates the model and the input data from the same small set of variables, so the two always match.
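A rough sketch of that pattern, assuming illustrative constant names, blob names, layer sizes, and file names rather than the exact ones in train.py:

```python
import caffe
from caffe import layers as L

# One set of shape constants drives both the generated prototxt and the
# NumPy data preparation, so the two always agree.
TIME_STEPS, BATCH_SIZE, STEP_SHAPE = 7, 300, (1, 4, 28)

n = caffe.NetSpec()
# Time-major data blob, per-step sequence markers, and per-image labels.
n.data = L.Input(shape=dict(dim=[TIME_STEPS, BATCH_SIZE] + list(STEP_SHAPE)))
n.clip = L.Input(shape=dict(dim=[TIME_STEPS, BATCH_SIZE]))
n.label = L.Input(shape=dict(dim=[1, BATCH_SIZE]))
n.lstm = L.LSTM(n.data, n.clip, recurrent_param=dict(num_output=100))
# The real script appends classifier and loss layers on top of n.lstm
# (see the head sketched above) before writing the file.

with open('lstm_mnist.prototxt', 'w') as f:   # saved to the current directory
    f.write(str(n.to_proto()))

net = caffe.Net('lstm_mnist.prototxt', caffe.TEST)  # usable like a hand-written model
```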
