GitHub - 2015xli/Caffe_LSTM: Simple tutorial code with Caffe LSTM for MNIST

Simple LSTM network tutorial code with Caffe for MNIST

Many people are confused about how to implement an LSTM network with Caffe. The Python code here uses MNIST as an example to show how to use Caffe's LSTM layer. This simple network reaches 96% validation accuracy in 4 epochs. To run it, invoke "train.py" from the command line.

The key confusion usually comes from the data layout that Caffe LSTM requires: the input shape for a batch is (time_step, batch_size, data_shape), not (batch_size, time_step, data_shape). In this tutorial code, the original data shape of an image for Caffe is (1, 28, 28). When it is split into a sequence of steps, say 7, the data shape of one image becomes (1, 7, 4, 28) or, with the step axis first, (7, 1, 4, 28), indicating 7 steps of sub-image data (1, 4, 28). With a batch_size of 300, the vanilla input shape is (300, 7, 1, 4, 28). For Caffe LSTM, it should be (7, 300, 1, 4, 28). That is, at every time step, 300 sub-images (1, 4, 28) are fed into the network. Since the sub-images of an image must be fed into the network in order, the data (300, 7, 1, 4, 28) cannot simply be passed through np.reshape to get (7, 300, 1, 4, 28): reshape keeps the flat element order, so sub-images from different steps and images would end up in the wrong slots. Instead, for an array named data, use data.transpose(1, 0, 2, 3, 4) or np.swapaxes(data, 0, 1).
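The difference between reshaping and transposing can be checked with plain NumPy. The following is a minimal sketch, not the repository's code: the variable names are illustrative, the images are dummy data, and it assumes each image is split into 7 horizontal strips of 4 rows.

```python
import numpy as np

# Illustrative values taken from the text above.
batch_size, time_steps = 300, 7
channels, height, width = 1, 28, 28
sub_height = height // time_steps  # 4 rows per step

# Dummy batch of images with unique values so that any mix-up is detectable.
images = np.arange(batch_size * channels * height * width, dtype=np.float32)
images = images.reshape(batch_size, channels, height, width)

# Assumption: split every image into 7 horizontal strips.
# (300, 1, 28, 28) -> (300, 1, 7, 4, 28) -> (300, 7, 1, 4, 28)
batch_major = images.reshape(
    batch_size, channels, time_steps, sub_height, width
).transpose(0, 2, 1, 3, 4)

# Correct: swap the batch and time axes.
time_major = batch_major.transpose(1, 0, 2, 3, 4)
also_time_major = np.swapaxes(batch_major, 0, 1)

# Incorrect: same target shape, but the elements land in the wrong slots.
wrong = batch_major.reshape(time_steps, batch_size, channels, sub_height, width)

print(time_major.shape)                             # (7, 300, 1, 4, 28)
print(np.array_equal(time_major, also_time_major))  # True
print(np.array_equal(time_major, wrong))            # False
```

With the transposed array, time_major[t, n] is strip t of image n. With the reshaped array, wrong[0, 1] holds strip 1 of image 0 rather than strip 0 of image 1.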

Other questions about using Caffe LSTM, such as how to handle its output and how to connect it to other layers, are also answered by the tutorial code.
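As a rough orientation for the output question, the shapes involved can be sketched with NumPy alone. This sketch assumes, based on how Caffe's recurrent layers are generally documented, that the LSTM layer takes a second input of sequence-continuation markers shaped (time_step, batch_size), with 0 marking the first step of a sequence, and that its output is shaped (time_step, batch_size, num_output). The value 128 for num_output and all variable names are hypothetical, and the zero array only stands in for a real output blob. The tutorial code may handle the output differently.

```python
import numpy as np

time_steps, batch_size = 7, 300
num_output = 128  # hypothetical hidden size, not taken from the repository

# Sequence-continuation markers: 0 at the first step of each sequence,
# 1 at every later step (assumed layout: time_step x batch_size).
cont = np.ones((time_steps, batch_size), dtype=np.float32)
cont[0, :] = 0

# Stand-in for the LSTM output blob (assumed layout: T x N x num_output).
lstm_out = np.zeros((time_steps, batch_size, num_output), dtype=np.float32)

# Option 1: classify each image from the last time step only.
last_step = lstm_out[-1]  # shape (300, 128)

# Option 2: flatten time and batch to feed every step to a following layer.
all_steps = lstm_out.reshape(time_steps * batch_size, num_output)  # (2100, 128)
```

Which option fits depends on whether the following layers need one prediction per image or one per time step.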

Note that the network model prototxt file is generated at runtime and saved to the current directory. It can be used directly, at the cost of some flexibility, since the model is tied to a specific input data shape. The code generates both the model and the input data from the same small set of variables, so that they always match.
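The idea of deriving both the model and the data shape from one set of variables can be illustrated as follows. This is a hypothetical sketch, not the repository's generator: the function names are invented, and it only builds the text of a Caffe Input layer by string formatting rather than using Caffe's Python API.

```python
# Shared variables: the single source of truth for both model and data.
batch_size, time_steps = 300, 7
channels, height, width = 1, 28, 28


def lstm_input_shape():
    """Time-major input shape derived from the shared variables."""
    return (time_steps, batch_size, channels, height // time_steps, width)


def input_layer_prototxt(name, shape):
    """Render a Caffe Input layer definition for the given blob shape."""
    dims = " ".join("dim: %d" % d for d in shape)
    return (
        "layer {\n"
        '  name: "%s"\n'
        '  type: "Input"\n'
        '  top: "%s"\n'
        "  input_param { shape { %s } }\n"
        "}\n"
    ) % (name, name, dims)


shape = lstm_input_shape()
prototxt_text = input_layer_prototxt("data", shape)
print(prototxt_text)
```

Because the data preparation step can call the same lstm_input_shape() helper, changing time_steps or batch_size in one place keeps the generated model and the input arrays consistent.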

