This repository contains code for a distributed deep learning architecture for unsupervised learning with autoencoders on the MNIST dataset. It currently has three parts, and more will be added later. The repository is part of my research internship in the Machine Intelligence Unit, Indian Statistical Institute, Kolkata (link).
Deep learning is a subset of machine learning. We can think of deep learning as a method for curve fitting. The goal is to find the parameters that best approximate an unknown function that maps an input to its corresponding output.
Example: in facial recognition, an image may be the input and the name of the person in the image the output.
The growth of deep learning and big data has made it difficult for a single modern system to process such large volumes of data in a reasonable time. Distributed deep learning was introduced to address this.
There are several reasons for the sudden popularity of distributed deep learning in the research community:
- Big Data: The amount of available data grows every day. The ImageNet dataset alone contains millions of images, and it is only one of thousands of available datasets.
- Parameter Storage: Modern deep learning models can contain anywhere from a few thousand to many millions of parameters, which makes storing them difficult. Training and storing (fitting) such a model on a single system is highly inefficient.
- Computation: Large models trained on terabytes of data require heavy computation, and desktop PCs and workstations alone cannot provide enough power to run it efficiently. This is one of the most important reasons for putting distributed deep learning into practice.
We will mainly be using autoencoders for the implementation.
Autoencoders are a fairly simple deep learning model. They are neural networks trained to reproduce the input at the output layer, i.e. the number of neurons in the output layer is exactly the same as the number of neurons in the input layer.
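As a rough illustration of the idea above, here is a minimal NumPy sketch of an untrained single-hidden-layer autoencoder. It is not the code from this repository: the function names, the hidden size of 32, the sigmoid activations, and the random stand-in batch are all assumptions made for the example. It only shows that the output has the same size as the input and how a reconstruction error could be measured.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_autoencoder(n_inputs, n_hidden, seed=0):
    """Create small random weights (hypothetical layout, for illustration)."""
    rng = np.random.default_rng(seed)
    return {
        "W_enc": rng.normal(0.0, 0.1, size=(n_inputs, n_hidden)),
        "b_enc": np.zeros(n_hidden),
        "W_dec": rng.normal(0.0, 0.1, size=(n_hidden, n_inputs)),
        "b_dec": np.zeros(n_inputs),
    }

def reconstruct(params, x):
    """Encode the input to a smaller code, then decode back to input size."""
    code = sigmoid(x @ params["W_enc"] + params["b_enc"])
    return sigmoid(code @ params["W_dec"] + params["b_dec"])

def reconstruction_error(params, x):
    """Mean squared difference between the input and its reconstruction."""
    return float(np.mean((reconstruct(params, x) - x) ** 2))

# Random stand-in for a batch of 4 flattened 28x28 MNIST images (not real data).
batch = np.random.default_rng(1).random((4, 784))
params = init_autoencoder(n_inputs=784, n_hidden=32)
output = reconstruct(params, batch)
print(output.shape == batch.shape)
print(reconstruction_error(params, batch))
```

Training would adjust the weights to reduce the reconstruction error; that step is left out here to keep the sketch short.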
In this repository, distributed deep learning is implemented using autoencoders with and without parameter averaging, with distributed TensorFlow, and with a sequential autoencoder with parameter averaging (under development).
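Parameter averaging, mentioned above, can be sketched in a few lines. This is only an assumed form of it (a plain element-wise mean over workers); the function name `average_parameters` and the dictionary layout are made up for the example and may differ from what the repository's code does.

```python
import numpy as np

def average_parameters(worker_params):
    """Return the element-wise mean of each named parameter across workers."""
    names = worker_params[0].keys()
    return {
        name: np.mean([params[name] for params in worker_params], axis=0)
        for name in names
    }

# Two hypothetical workers holding different copies of the same parameters.
worker_a = {"W": np.array([[1.0, 2.0], [3.0, 4.0]]), "b": np.array([0.0, 1.0])}
worker_b = {"W": np.array([[3.0, 4.0], [5.0, 6.0]]), "b": np.array([2.0, 3.0])}

averaged = average_parameters([worker_a, worker_b])
print(averaged["W"])  # element-wise mean of the two weight matrices
print(averaged["b"])  # element-wise mean of the two bias vectors
```

In a parameter-server setup, the averaged values would then be sent back to every worker before the next round of training.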
- Stochastic Gradient Descent: The optimization method most widely used in distributed deep learning is stochastic gradient descent (SGD). This is an SGD program written from scratch. (link)
- Autoencoder with TensorFlow: A regular autoencoder model in TensorFlow for beginners. If you are already comfortable with autoencoders, you can skip this. (link)
- Distributed TensorFlow: Implementation of autoencoders with a parameter server and parameter averaging using the distributed TensorFlow model, with 2 worker servers and 1 parameter server. (link)
- Sequential Autoencoder: Implementation of the parameter-server-based autoencoder with a sequential algorithm using mutex locks. Two autoencoders run one after the other on minibatches; the parameters are averaged after every run, stored, and redistributed. This implementation is written from scratch, i.e. no external libraries are used. (link)
- Research Papers: Important research papers on the topic have been added to this folder for reference. Going through these papers will give a better understanding of the project. (link)
- MNIST Dataset: The MNIST dataset, which is the main dataset used for this research. (link)
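For readers new to the stochastic gradient descent entry in the list above, here is a small minibatch SGD sketch. It is not the program from this repository: it fits a toy straight line rather than an autoencoder, and the function name, learning rate, batch size, and data are all assumptions chosen to keep the example short.

```python
import numpy as np

def sgd_linear_regression(x, y, lr=0.1, epochs=200, batch_size=8, seed=0):
    """Fit y ~ w * x + b with minibatch SGD on the mean squared error."""
    rng = np.random.default_rng(seed)
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the samples every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            error = (w * x[idx] + b) - y[idx]
            # Gradients of the minibatch mean squared error.
            grad_w = 2.0 * np.mean(error * x[idx])
            grad_b = 2.0 * np.mean(error)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Toy, noise-free data generated from y = 3x + 1 (not MNIST).
x = np.linspace(0.0, 1.0, 64)
y = 3.0 * x + 1.0
w, b = sgd_linear_regression(x, y)
print(w, b)  # should end up close to 3 and 1
```

The same update rule applies to an autoencoder's weights; only the model and the gradient computation are larger.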
- NumPy
- SciPy
- Matplotlib (To visualize the images)
- Scikit-Learn
- Keras
- TensorFlow
Feel free to fork or create a pull request. Star the repository if you like it. I'm just a beginner, so please report any problems in the issues section and I'll take a look.