Distributed Unsupervised Learning using Autoencoders and Mean Shift Clustering

This repository contains the code for a distributed deep learning architecture for unsupervised learning using autoencoders on the MNIST dataset. It currently has three parts and will be extended over time. This repository is part of my research internship at the Machine Intelligence Unit, Indian Statistical Institute, Kolkata (link)

What is Distributed Deep Learning?

Deep learning is a subset of machine learning. We can think of deep learning as a method for curve fitting: the goal is to find the parameters that best approximate an unknown function mapping each input to its corresponding output.

Example: In facial recognition, an image may be the input, and the name of the person in the image the output.
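As a toy illustration of this curve-fitting view (not code from this repository), the snippet below fits two parameters of a line to noisy samples of an unknown linear function using plain gradient descent:

```python
import numpy as np

# Learn w and b so that w*x + b approximates an unknown mapping
# from inputs x to outputs y, given only noisy samples of it.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 3.0 * x + 0.5 + rng.normal(scale=0.1, size=200)  # the "unknown" function

w, b = 0.0, 0.0
lr = 0.1
for _ in range(500):
    y_hat = w * x + b
    w -= lr * 2.0 * np.mean((y_hat - y) * x)  # d(MSE)/dw
    b -= lr * 2.0 * np.mean(y_hat - y)        # d(MSE)/db

print(w, b)  # converges near the true parameters 3.0 and 0.5
```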


The growth of deep learning and big data has made it difficult for modern systems to process such huge volumes of data in a reasonable time. This led to the concept of distributed deep learning.

Why do we need Distributed Deep Learning?

There are several reasons for the sudden popularity of distributed deep learning in the research community:

  • Big Data: Data is growing day by day, and the increase is staggering. The ImageNet dataset alone contains millions of images, and it is just one of thousands of large datasets available.

  • Parameter Storage: Modern deep learning models contain millions to billions of parameters, which makes storing them a huge pain. Training and storing (fitting) such a model on a single system is highly inefficient.

  • Computation: Huge models trained on terabytes of data require massive computation, and desktop PCs and workstations alone cannot provide the power to do the calculations efficiently. This is one of the most important reasons to put distributed deep learning into practice.

Distributed Deep Learning

We will mainly be using autoencoders for the implementation.

Autoencoders

Autoencoders are a fairly simple deep learning model: deep neural networks trained to reproduce their input at the output layer, i.e. the number of neurons in the output layer is exactly the same as the number of neurons in the input layer.

[Figure: autoencoder architecture]
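A minimal sketch of such a model in Keras, assuming flattened 28x28 MNIST inputs; the layer sizes and training settings here are illustrative, not the exact architecture used in this repository:

```python
import numpy as np
from tensorflow import keras

# Flatten MNIST digits to 784-dimensional vectors in [0, 1].
(x_train, _), (x_test, _) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

# The output layer has exactly as many units as the input layer (784).
autoencoder = keras.Sequential([
    keras.Input(shape=(784,)),
    keras.layers.Dense(64, activation="relu"),      # encoder (bottleneck)
    keras.layers.Dense(784, activation="sigmoid"),  # decoder (reconstruction)
])
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

# Inputs double as targets: the network learns to reproduce its input.
autoencoder.fit(x_train, x_train, epochs=5, batch_size=256,
                validation_data=(x_test, x_test))
```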

In this repository, distributed deep learning is implemented using autoencoders with and without parameter averaging, using distributed TensorFlow, and using a sequential autoencoder with parameter averaging (under development).

Contents of this repository:

  • Stochastic Gradient Descent: The method widely used in distributed deep learning for gradient computation is stochastic gradient descent (SGD). This is an SGD program written from scratch; see the sketch after this list. (link)

  • Autoencoder with TensorFlow: A regular autoencoder model in TensorFlow for beginners. If you are comfortable with autoencoders, you can skip this. (link)

  • Distributed TensorFlow: Implementation of autoencoders with a parameter server and parameter averaging using the distributed TensorFlow model, with 2 worker servers and 1 parameter server; a cluster-layout sketch follows this list. (link)

  • Sequential Autoencoder: Implementation of the parameter-server-based autoencoder with a sequential algorithm using mutex locks. Two autoencoders run one after the other on minibatches; the parameters are averaged after every run, stored, and redistributed (see the averaging sketch below). This implementation is written from scratch, i.e. no external libraries are used. (link)

  • Research Papers: Important research papers on the topic are collected in this folder for reference. Going through these papers will give a better understanding of the project. (link)

  • MNIST Dataset: Finally, the MNIST dataset, the main dataset used for this research. (link)
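For reference, here is a minimal from-scratch minibatch SGD loop for linear least squares; this is a simplified illustration, and the repository's SGD program may differ in model and details:

```python
import numpy as np

def sgd(X, y, lr=0.01, epochs=20, batch_size=32, seed=0):
    """Minibatch SGD for linear least squares: minimizes mean((X @ w - y)**2)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        order = rng.permutation(len(X))  # reshuffle the data every epoch
        for start in range(0, len(X), batch_size):
            batch = order[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            grad = 2.0 * Xb.T @ (Xb @ w - yb) / len(batch)  # MSE gradient on the minibatch
            w -= lr * grad
    return w
```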
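The distributed TensorFlow part runs on a cluster of 2 workers and 1 parameter server. A hedged sketch of such a cluster definition is below, with placeholder host:port values; note that the repository likely uses the older TF 1.x `tf.train.Server` API rather than `tf.distribute.Server`:

```python
import tensorflow as tf

# Placeholder host:port values for a 2-worker, 1-parameter-server cluster.
cluster = tf.train.ClusterSpec({
    "worker": ["localhost:2222", "localhost:2223"],
    "ps": ["localhost:2224"],
})

# Each process starts one server for its role in the cluster;
# this example starts the first worker.
server = tf.distribute.Server(cluster, job_name="worker", task_index=0)
```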
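The parameter-averaging step used in the sequential autoencoder can be sketched as follows; `average_parameters` is a hypothetical helper, not the repository's exact code:

```python
import numpy as np

def average_parameters(weights_a, weights_b):
    """Average two workers' weight lists element-wise (hypothetical helper)."""
    return [(wa + wb) / 2.0 for wa, wb in zip(weights_a, weights_b)]

# Each worker holds its layer weights as a list of NumPy arrays,
# e.g. [W1, b1, W2, b2]; after each minibatch, both workers
# continue training from the averaged copy returned here.
```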

Dependencies:

  • Numpy
  • Scipy
  • Matplotlib (To visualize the images)
  • Scikit-Learn
  • Keras
  • TensorFlow

Feel free to fork the repository or create a pull request, and star it if you like it. I'm just a beginner, so please report any problems in the issues section and I'll take a look.
