Traffic Sign Recognition

Mario Lüder

Writeup - Build a Traffic Sign Recognition Project

The goals / steps of this project are the following:

Load the data set (see below for links to the project data set)
Explore, summarize and visualize the data set
Design, train and test a model architecture
Use the model to make predictions on new images
Analyze the softmax probabilities of the new images
Summarize the results with a written report

This is a link to my project code

I have used the experiences described in paper "Traffic Sign Recognition with Multi-Scale Convolutional Networks" written by Pierre Sermanet and Yann LeCun. This helped a lot to get this exercise done. Nevertheless, I have conducted my own experiements as it can be seen below.

Rubric Points

Here I will consider the rubric points individually and describe how I addressed each point in my implementation.

Data Set Summary & Exploration

The summary is calculated using python with the help of numpy.

The size of training set is 34799
The size of the validation set is 4410
The size of test set is 12630
The shape of a traffic sign image is 32 x 32 x 3
The number of unique classes/labels in the data set is 43

The bar shows the distribution of traffic sign in the training set.

Design and Test a Model Architecture

The original training data looks like this.

As a first step, I decided to convert the images to grayscale as Pierre Sermanet and Yann LeCun wrote in their paper "Traffic Sign Recognition with Multi-Scale Convolutional Networks" that color doesn't contribute to the learning of traffic signs. Like them I have choosen to use the luma channel from YUV color space.

This channel is then processed first by local normalization and then by global normalization. Local normalization increases the contrast within the image, whereas global normalization centers all images around its mean.

Here are examples after grayscaling and normalization.

To increase the variance of the dataset I added four additional images for each training image. The processing pipeline is like this:

translate randomly by [-2, 2] pixel in any direction
rotate by random degree [-15, 15]
scale in any direction by pixels in range [-2, 2]

Final model architecture

My final model consisted of the following layers:

Layer	Description
Input	32x32x1 Grayscale image
Layer 1: Convolution 3x3	1x1 stride, valid padding, outputs 28x28x108
Layer 1: RELU
Layer 1: Max pooling	2x2 stride, outputs 14x14x108
Layer 2: Convolution 3x3	1x1 stride, valid padding, outputs 10x10x108
Layer 2: RELU
Layer 2: Max pooling	2x2 stride, outputs 5x5x108
Dropout Layer 1	Keep Propability: 0.5
Dropout Layer 2	Keep Propability: 0.5
Flatten and Combine	outputs 57153600 x 1
Fully connected
Classifier Layer 1: Network	57 Hidden Units
Classifier Layer 1: RELU
Classifier Layer 1: Dropout	Keep Propability: 0.5
Classifier Layer 2: Network	43 Hidden Units
Softmax cross entropy with logits
Loss operation	reduce mean
Optimizer	AdamOptimizer learning rate 0.0002

Model Training

Layout

I started with the LeNet model given in the excersices, it it showed already good results to start with.

Epoch	Batch Size	Learn Rate	Normalize	Arch	Picture	Layer 1	Layer 2	Classifier L1	Classifier L2	Classifier L3	Accuracy Validation
30	32	0,0005	Global	SS	Y Channel	6	16	120	84	43	0.944
30	32	0,0005	Global	SS	Gray scale	6	16	120	84	43	0.939

The goal was to see the effect of simple gray scaling the traffic signs versus using the luma channel. As the luma channel showed better performance I worked with that.

In a next attempt I changed the layout of the network similar to the recommendation from Pierre Sermanet and Yann LeCun. This impoved the performance way more as you can see in the next table.

Epoch	Batch Size	Learn Rate	Normalize	Arch	Picture	Layer 1	Layer 2	Classifier L1	Classifier L2	Classifier L3	Accuracy Validation
30	32	0,0005	Global	SS	Y Channel	6	16	120	84	43	0.944
30	32	0,0005	Global	SS	Y Channel	108	200	43			0.957
30	32	0,0005	Global	SS	Y Channel	108	200	57	43		0.958

Local Normalization

Another performance gain gave local normalization of training images.

Epoch	Batch Size	Learn Rate	Normalize	Arch	Picture	Layer 1	Layer 2	Classifier L1	Classifier L2	Accuracy Validation
30	32	0,0005	Global	SS	Y Channel	108	200	57	43	0.958
30	32	0,0005	Local + Global	SS	Y Channel	108	200	57	43	0.983

This impoves the contrast of edges a lot, even if pictures histogram looks already widespread. This does not only help the human eye to recognize a picture but is also benefitial for the network as features are exposed.

Dropout

Learning without dropout looks like this. The accuracy increases fast but gets never stable. Lowering the learning rate didn't smooth the graph as desired.

Indroducing dropout after convolution and between the two classification layers smoothed the learning and gave again another performance boost.

Epoch	Batch Size	Learn Rate	Normalize	Arch	Picture	Layer 1	Layer 2	Classifier L1	Classifier L2	Accuracy Validation
30	32	0,0002	Local + Global	SS	Y Channel	108	200	57	43	0.994

** Testing Accuracy: 0.984 **

MS versus SS Architecture

At last, I have implemented the MS architecture described by Pierre Sermanet and Yann LeCun.

Epoch	Batch Size	Learn Rate	Normalize	Arch	Picture	Layer 1	Layer 2	Classifier L1	Classifier L2	Accuracy Validation
30	32	0,0002	Local + Global	MS	Y Channel	108	108	57	43	0.993

** Testing Accuracy: 0.972 **

Unlike the experience from Pierre Sermanet and Yann LeCun I could not see at first sight that the MS architectue improves the performance of the network. I have choosen snow covered traffic signs for a final test. As those were better recognized by the MS architecture, I decided to choose this MS even though the test accuracy is lower. I belive the SS architecture was already overfitting.

Final Results

My final model results were:

validation set accuracy of 0.993
test set accuracy of 0.972

Test a Model on New Images

Here are ten German traffic signs that I found on the web:

I have choosen also snow covered, rotated and low contrast traffic signs to see if the network is able to classify them properly.

Result

The model was able to classified all traffic signs correctly and was in all cases 100 percent sure. Even with the partly visible snow covered one.

Please see the softmax probabilities in the following picture.

The code for making predictions on my final model is located in the 12th cell of the Ipython notebook.

Name		Name	Last commit message	Last commit date
Latest commit History 113 Commits
examples		examples
report		report
traffic-sign-data-test		traffic-sign-data-test
.gitignore		.gitignore
CODEOWNERS		CODEOWNERS
Comparison.ods		Comparison.ods
LICENSE		LICENSE
README.md		README.md
README.md.orig		README.md.orig
TrafficSignTest.odt		TrafficSignTest.odt
Traffic_Sign_Classifier.ipynb		Traffic_Sign_Classifier.ipynb
checkpoint		checkpoint
lenet-traffic-signs-model.data-00000-of-00001		lenet-traffic-signs-model.data-00000-of-00001
lenet-traffic-signs-model.index		lenet-traffic-signs-model.index
lenet-traffic-signs-model.meta		lenet-traffic-signs-model.meta
set_git.sh		set_git.sh
signnames.csv		signnames.csv
visualize_cnn.png		visualize_cnn.png
writeup_template.md		writeup_template.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Traffic Sign Recognition

Writeup - Build a Traffic Sign Recognition Project