NeuralNetwork

This is a Python implementation of a feed-forward neural network using numpy. The network is trained with mini-batch gradient descent and uses L2 regularization.
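As a minimal sketch (not the repository's actual API; the learning rate and regularization strength below are made-up values), a single weight update with mini-batch gradient descent and L2 regularization looks roughly like this:

import numpy as np

def sgd_step(W, grad_W, lr=0.01, lam=0.001):
    # grad_W is the gradient of the loss w.r.t. W, averaged over the mini-batch.
    # The L2 penalty (lam / 2) * ||W||^2 contributes lam * W to the gradient.
    return W - lr * (grad_W + lam * W)

W = np.random.randn(784, 30)   # example layer weights
grad_W = np.zeros_like(W)      # gradient computed by backpropagation
W = sgd_step(W, grad_W)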

Installation

Clone the repository

git clone https://github.com/Agnar22/NeuralNetwork.git

Navigate into the project folder

cd NeuralNetwork

Install the requirements

pip install -r requirements.txt

If everything went well, you should now be able to run the code

python3 Main.py

Motivation

I created this project to gain insight into the mathematics behind backpropagation in neural networks, as well as to learn how to implement it using only matrix operations. Numpy is used for the matrix operations.

To check that the neural network (both the feed-forward and the backpropagation part) was working, I tested it on the MNIST dataset (supplied by tensorflow).
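For reference, the MNIST data used for this test can be loaded through tensorflow roughly as follows (a sketch; the exact preprocessing in Main.py may differ):

import numpy as np
import tensorflow as tf

# Load MNIST and flatten each 28x28 image into a 784-dimensional vector in [0, 1].
(x_train, y_train), (x_val, y_val) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype(np.float32) / 255.0
x_val = x_val.reshape(-1, 784).astype(np.float32) / 255.0

# One-hot encode the labels to match a 10-unit output layer.
y_train = np.eye(10)[y_train]
y_val = np.eye(10)[y_val]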

Results


Figure 1: The training and validation loss for each epoch

The loss for the training data (in blue) and the validation data (in yellow) is shown above (Figure 1). It is strictly decreasing, as it should be. The regression loss used was L2 and the classification loss used was quadratic loss.
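For concreteness, the quadratic loss over a mini-batch, and its gradient with respect to the network output, can be written as follows (a sketch of the standard formulation, not necessarily the exact code in this repository):

import numpy as np

def quadratic_loss(y_pred, y_true):
    # Mean quadratic loss: 1 / (2n) * sum of squared differences.
    n = y_true.shape[0]
    return np.sum((y_pred - y_true) ** 2) / (2 * n)

def quadratic_loss_grad(y_pred, y_true):
    # Gradient of the quadratic loss with respect to the network output.
    return (y_pred - y_true) / y_true.shape[0]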


Figure 2: A matrix showing the target (rows) and the prediction (columns) of the NN for the validation data

By looking at the prediction matrix for the validation data (Figure 2), you can see that the network easily recognizes 0's and 1's (with accuracies of 96% and 97%, respectively). On the other hand, 7's and 8's proved to be more difficult. The former were often misclassified as 9's, whereas the latter were frequently misclassified as 3's and 5's. Both misclassifications are understandable: for a neural network that, unlike a conv-net, does not account for spatial invariances, a sloppily handwritten 7 might resemble a 9, and the curves of an 8 might look like the curves of a 3 or a 5.
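A prediction matrix like the one in Figure 2 can be computed from the validation set with a few lines of numpy (a sketch; rows are targets and columns are predictions, as in the figure):

import numpy as np

def prediction_matrix(targets, predictions, num_classes=10):
    # targets and predictions are arrays of class indices (0-9 for MNIST).
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for target, prediction in zip(targets, predictions):
        matrix[target, prediction] += 1
    return matrix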


Figure 3: Training and validation loss and accuracy for each epoch

With a mini-batch size of 64 and 5 epochs, the neural network reached a final validation accuracy of 88.97% (Figure 3). Compared to today's conv-nets, which get staggeringly close to 100% accuracy, this is not impressive. One could argue that further hyperparameter tuning would improve the result by a few percent. However, taking into account that the objective of this project was to understand the underlying maths behind backpropagation and to be able to implement it, I would say that 88.97% is more than sufficient to prove that the implementation is correct.
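To illustrate how the mini-batch size and the number of epochs enter the training loop, here is a rough sketch (the forward and backward methods are hypothetical names, not the repository's actual API):

import numpy as np

def train(network, x_train, y_train, epochs=5, batch_size=64):
    n = x_train.shape[0]
    for epoch in range(epochs):
        # Shuffle the data each epoch and update the network one mini-batch at a time.
        order = np.random.permutation(n)
        for start in range(0, n, batch_size):
            batch = order[start:start + batch_size]
            y_pred = network.forward(x_train[batch])
            network.backward(y_pred, y_train[batch])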

Other resources

  • This book about neural networks.
  • This short explanation and implementation of backpropagation from towardsdatascience.
  • Figure 1 from this paper showing how the gradient is calculated for each layer.

License

This project is licensed under the MIT License.
