
Uncertainty analysis - Bayesian Neural Networks

Table of contents

  • General info
  • Theoretical background
  • Results

General info

To run the notebooks quickly, it is recommended to open them in Google Colab and mount your Google Drive account; the notebooks automatically create two folders, one for TensorBoard logs and one for model weights.
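
A minimal sketch of the Drive setup (the exact folder names and paths created by the notebooks are not shown here):

```python
# Run inside a Colab notebook: mount Google Drive so that the two folders the
# notebooks create (TensorBoard logs and saved weights) persist across sessions.
from google.colab import drive

drive.mount('/content/drive')
```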

Theoretical background

CNNs need a huge amount of data, otherwise they tend to overfit and to make overconfident predictions; Bayesian Neural Networks (BNNs) were introduced to address these issues, and their great additional advantage is the possibility of measuring the uncertainty of a prediction. The idea behind Bayesian modeling is to consider all possible values of the model parameters when making predictions. To do so, the parameters are treated as random variables whose distribution assigns the highest probability to the most plausible values. Starting from a prior distribution over the model parameters, the network learns the posterior distribution through Bayes' theorem applied to the training data.

Predictions are made by weighting the predictions obtained with every parameter setting by the corresponding posterior probability. Since we cannot compute the posterior exactly, we use Monte Carlo samples from an approximating variational distribution. The distributions over the weights are assumed to be Gaussian, so each weight has a mean μ and a variance σ². The mean μ is the most probable value sampled for the weight, whereas the variance can be seen as a measure of uncertainty. More precisely, the predictive variance is the sum of two kinds of uncertainty:

  • ALEATORIC UNCERTAINTY: a measure of the quality of the dataset. It is the uncertainty about the observation y caused by noise in the dataset {x, y}.
  • EPISTEMIC UNCERTAINTY: a measure of the quality of the model. Epistemic uncertainty captures our ignorance about which model best explains our data (see the code sketch after this list).
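
As an illustration only, one common way to estimate these two terms for a classifier uses the entropy of Monte Carlo samples (predictive entropy split into expected entropy plus mutual information); the notebooks may use a different estimator:

```python
import numpy as np

def uncertainty_decomposition(mc_probs, eps=1e-12):
    """mc_probs: array of shape (T, num_classes) holding the softmax outputs
    of T stochastic forward passes for a single image."""
    mean_probs = mc_probs.mean(axis=0)                       # predictive distribution
    total = -np.sum(mean_probs * np.log(mean_probs + eps))   # predictive entropy
    aleatoric = -np.mean(np.sum(mc_probs * np.log(mc_probs + eps), axis=1))  # expected entropy
    epistemic = total - aleatoric                            # mutual information
    return total, aleatoric, epistemic
```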

BNNs are trained using Bayes by Backprop.
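
For reference, Bayes by Backprop minimizes the variational free energy (the negative ELBO) with respect to the parameters θ = (μ, σ) of the approximate posterior q(w | θ):

$$
\mathcal{F}(\mathcal{D}, \theta) = \mathrm{KL}\big[\, q(\mathbf{w} \mid \theta) \,\|\, p(\mathbf{w}) \,\big] - \mathbb{E}_{q(\mathbf{w} \mid \theta)}\big[ \log p(\mathcal{D} \mid \mathbf{w}) \big]
$$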

Results

The architecture of the network was composed of an input layer, 2 Bayesian convolutional layers alternating with 2 max-pooling layers, then a fully connected Bayesian layer and finally the output layer.
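
A minimal sketch of such an architecture, assuming TensorFlow Probability's reparameterization layers (each Bayesian layer adds its KL term to the model losses, which together with the cross-entropy gives the Bayes-by-Backprop objective); the filter counts, kernel sizes and layer widths below are illustrative, not the repository's actual values:

```python
import tensorflow as tf
import tensorflow_probability as tfp

NUM_TRAIN = 60000  # MNIST training-set size, used to scale the KL term per example

# Each Bayesian layer contributes KL(q || prior) to model.losses; dividing by the
# number of training examples yields the per-example negative ELBO during training.
kl_fn = lambda q, p, _: tfp.distributions.kl_divergence(q, p) / NUM_TRAIN

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tfp.layers.Convolution2DReparameterization(
        32, kernel_size=5, padding='same', activation='relu',
        kernel_divergence_fn=kl_fn),
    tf.keras.layers.MaxPooling2D(2),
    tfp.layers.Convolution2DReparameterization(
        64, kernel_size=5, padding='same', activation='relu',
        kernel_divergence_fn=kl_fn),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Flatten(),
    tfp.layers.DenseReparameterization(128, activation='relu',
                                       kernel_divergence_fn=kl_fn),
    tfp.layers.DenseReparameterization(10, kernel_divergence_fn=kl_fn),
])

# Keras automatically adds the KL terms stored in model.losses to the loss below.
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
```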

On the test set the network achieved 98.1% accuracy; then, using 0.15 as the uncertainty threshold, every prediction the network was not sure about was discarded. In particular, 2.5% of the predictions were discarded, and the accuracy on the remaining images rose to 99.86%. When the net was sure about a prediction, the classification was almost always correct.
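
The thresholding step can be sketched as follows; the uncertainty score used here (the standard deviation of the predicted-class probability across Monte Carlo samples) is an assumption and may differ from the score used in the notebooks:

```python
import numpy as np

def selective_accuracy(mc_probs, labels, threshold=0.15):
    """mc_probs: (T, N, num_classes) softmax outputs from T stochastic passes.
    labels: (N,) integer ground-truth classes."""
    mean_probs = mc_probs.mean(axis=0)                    # (N, num_classes)
    preds = mean_probs.argmax(axis=1)
    # Uncertainty score: std of the predicted-class probability across samples.
    per_class_std = mc_probs.std(axis=0)                  # (N, num_classes)
    uncertainty = per_class_std[np.arange(len(labels)), preds]
    keep = uncertainty <= threshold
    accuracy_on_kept = (preds[keep] == labels[keep]).mean()
    discarded_fraction = 1.0 - keep.mean()
    return accuracy_on_kept, discarded_fraction
```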

[image: results summary]

These images were wrongly predicted and discarded.

[images: digits that were wrongly predicted and discarded]

However, in some cases the net predicted the right outcome but still discarded the images due to high uncertainty:

[images: digits predicted correctly but discarded for high uncertainty]

Lastly, some images were wrongly predicted but with low uncertainty:

[images: digits wrongly predicted with low uncertainty]

The network was then tested on a completely different dataset, EMNIST, which is composed of letters instead of digits. A classic CNN would make its predictions based only on the knowledge learned from MNIST, so its results would be poor: indeed, the CNN achieves 99% accuracy on MNIST and 1.9% on EMNIST. It would be useful if a classifier could recognize when it should not make a decision, i.e. when the data differ from its training set, so the BNN was also tested on EMNIST. The BNN refused to predict 60% of the images, which means that in the majority of cases the net recognized it was not able to classify this kind of data.

[image: EMNIST results]

Many of the images that were classified were similar to MNIST digits, like the following ones:

[images: EMNIST letters resembling MNIST digits]
