This simple demo is a basic image classifier based on this tutorial. This file documents how to use the demo from scratch.
The `nn-demo` folder contains the following subfolders:
- `classify`, which contains the scripts for testing the classifier
- `data`, which is designed to contain the starting training set
- `preprocess`, which contains the scripts for preprocessing the images
- `train`, which contains the scripts for the network training
There are a few prerequisites needed to start using this demo:
- Python 3 (at least Python 3.5.2)
- the `tensorflow` package for Python
- the `opencv-python` package
- the `numpy` package
- optional: we also used the `tensorflow-for-poets-2` repository, but it's not strictly needed because all the scripts necessary to run this project are already in the folder; if you want it anyway, you can download it here -> Tensorflow for poets 2
Note: a virtual environment has been used for this particular project, and we recommend that you use one too to avoid any compatibility problems; to set it up, simply type:
$ python3 -m venv nn-demo-venv
$ source ./nn-demo-venv/bin/activate
After that, just install the needed packages with pip.
Before you start any kind of training, you'll need a set of images to teach the model about the classes you want to recognize. First, decide on your classes; then gather at least 5 images per class. Create a subfolder for each class under the `data` folder of the project, name each subfolder after the class that you want to recognize, and save the images in the corresponding subfolder.
E.g.: if you want to classify musical instruments, the `data` folder might contain subfolders like these:
data
├── acoustic guitar
│ ├── aguitar1.jpg
│ ├── aguitar2.jpg
│ ├── aguitar3.jpg
│ ├── aguitar4.jpg
│ └── aguitar5.jpg
├── guitar
│ ├── elguitar1.jpg
│ ├── elguitar2.jpg
│ ├── elguitar3.jpg
│ ├── elguitar4.jpg
│ └── elguitar5.jpg
├── piano
│ ├── piano1.jpg
│ ├── piano2.jpg
│ ├── piano3.jpg
│ ├── piano4.jpg
│ └── piano5.jpg
└── sax
├── sax1.jpg
├── sax2.jpg
├── sax3.jpg
├── sax4.jpg
└── sax5.jpg
The preprocessing script (`preprocessing/preprocessing.py`) prepares the training images by applying some basic transformations (performed by the transform function in `transform_image.py`). The new images are arranged in subfolders named after their class under the `train/dataset` folder. Using the previous example, after the preprocessing step is completed, the `train` folder should contain something like this:
dataset
├── acoustic guitar
│ ├── 180aguitar1.jpg
│ ├── 180aguitar2.jpg
│ ├── 180aguitar3.jpg
│ ├── 180aguitar4.jpg
│ ├── 180aguitar5.jpg
│ ├── 180grayaguitar1.jpg
│ ├── 180grayaguitar2.jpg
│ ├── 180grayaguitar3.jpg
│ ├── 180grayaguitar4.jpg
│ ├── 180grayaguitar5.jpg
│ ├── 270aguitar1.jpg
│ ├── 270aguitar2.jpg
│ ├── 270aguitar3.jpg
│ ├── 270aguitar4.jpg
│ ├── 270aguitar5.jpg
│ ├── 270grayaguitar1.jpg
│ ├── 270grayaguitar2.jpg
│ ├── 270grayaguitar3.jpg
│ ├── 270grayaguitar4.jpg
│ ├── 270grayaguitar5.jpg
│ ├── 90aguitar1.jpg
│ ├── 90aguitar2.jpg
│ ├── 90aguitar3.jpg
│ ├── 90aguitar4.jpg
│ ├── 90aguitar5.jpg
│ ├── 90grayaguitar1.jpg
│ ├── 90grayaguitar2.jpg
│ ├── 90grayaguitar3.jpg
│ ├── 90grayaguitar4.jpg
│ ├── 90grayaguitar5.jpg
│ ├── aguitar1.jpg
│ ├── aguitar2.jpg
│ ├── aguitar3.jpg
│ ├── aguitar4.jpg
│ ├── aguitar5.jpg
│ ├── grayaguitar1.jpg
│ ├── grayaguitar2.jpg
│ ├── grayaguitar3.jpg
│ ├── grayaguitar4.jpg
│ └── grayaguitar5.jpg
├── guitar
│ ├── 180elguitar1.jpg
│ ├── 180elguitar2.jpg
│ ├── 180elguitar3.jpg
│ ├── ...
├── piano
│ ├── 180graypiano1.jpg
│ ├── 180graypiano2.jpg
│ ├── 180graypiano3.jpg
│ ├── ...
└── sax
├── 180graysax1.jpg
├── 180graysax2.jpg
├── 180graysax3.jpg
└── ...
Each subfolder contains 8 versions of every source image: the four rotations (0°, 90°, 180°, 270°), each in both color and grayscale.
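For reference, here is a minimal sketch of what such a transform function might look like using OpenCV; the actual signature and details in `transform_image.py` may differ, and the `(src_path, dst_dir)` parameters are just illustrative:

import os
import cv2

# Rotation codes keyed by the filename prefix they produce
# (None means the image is kept in its original orientation).
ROTATIONS = {
    '': None,
    '90': cv2.ROTATE_90_CLOCKWISE,
    '180': cv2.ROTATE_180,
    '270': cv2.ROTATE_90_COUNTERCLOCKWISE,
}

def transform(src_path, dst_dir):
    """Write 8 variants of one image: 4 rotations x {color, grayscale}."""
    name = os.path.basename(src_path)
    img = cv2.imread(src_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    for prefix, rot in ROTATIONS.items():
        for tag, variant in (('', img), ('gray', gray)):
            out = variant if rot is None else cv2.rotate(variant, rot)
            cv2.imwrite(os.path.join(dst_dir, prefix + tag + name), out)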
Once the training set is ready, the next step is the actual training of the neural network. In this version of the project, two model architectures are supported for training the network:
- MobileNet, fast and slim
- Inception-v3, heavier but more accurate
Run the script `train_mobilenet.sh` to train the network with the MobileNet model, or run `train_inception.sh` to train it with the Inception-v3 model. These scripts will create the subfolder `tf_files` under the `train` directory, which, once the training is completed, will contain the bottlenecks generated by the training.
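For example, to train with MobileNet (assuming the scripts are executable and are run from inside the `train` directory):
$ cd train
$ ./train_mobilenet.sh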
Note: by default, the MobileNet version used for this demo is 0.50 with an image size of 224; you can change it at any time in the `train_mobilenet.sh` script as follows:
python -m retrain \
--bottleneck_dir=tf_files/bottlenecks \
--how_many_training_steps=500 \
--model_dir=tf_files/models/ \
--summaries_dir=tf_files/training_summaries/mobilenet_YOUR_VERSION \
--output_graph=tf_files/retrained_graph.pb \
--output_labels=tf_files/retrained_labels.txt \
--architecture=mobilenet_YOUR_VERSION \
--image_dir=dataset
After changing this, you should also change these lines of code in the `label_image.py` script:
77 if os.path.exists('../train/tf_files/models/mobilenet_v1_0.50_224'):
78 input_height = 224
79 input_width = 224
80 input_layer = "input"
Into something like this:
77 if os.path.exists('../train/tf_files/models/mobilenet_YOUR_VERSION'):
78 input_height = YOUR_VERSION_IMAGE_SIZE
79 input_width = YOUR_VERSION_IMAGE_SIZE
Once the training step has completed, you can finally test your neural network! The testing scripts are stored in the `classify` folder; in particular, you have to place the images that you want to classify in `classify/test_images`. After that, just run the `classify_multiple_images.py` Python script.
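For example (assuming the script is run from inside the `classify` folder):
$ cd classify
$ python classify_multiple_images.py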
The results will be displayed in the shell, in the following form:
Evaluation time (YOUR_TEST_IMAGE): ELAPSED_TIME
CLASS_1 PROBABILITY_OF_BEING_A_CLASS_1_OBJECT
CLASS_2 PROBABILITY_OF_BEING_A_CLASS_2_OBJECT
...
Finally, a `csv` file containing the results of the classification will be created in the `classify` folder.
E.g.: continuing the previous example, trying to classify some instrument images, you'll get something like this:
Evaluation time (test_piano.jpg): 0.387s
piano 0.999998
guitar 1.2078438e-06
sax 7.788592e-07
acoustic guitar 7.4897005e-11
Evaluation time (test_guitar.jpeg): 0.155s
guitar 0.9999089
sax 8.829939e-05
piano 2.61591e-06
acoustic guitar 1.4023973e-07
Evaluation time (test_acoustic_guitar.jpeg): 0.157s
acoustic guitar 0.9999993
guitar 4.1909985e-07
sax 3.4296806e-07
piano 2.8688019e-09
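For the curious, here is a condensed sketch of the kind of inference `classify_multiple_images.py` performs for each test image. It assumes the TensorFlow 1.x API used by the original tutorial, the default MobileNet input size (224) and normalization (mean and std of 128), and the tensor names produced by the retrain script ('import/input:0', 'import/final_result:0'); the real script may differ in its details:

import cv2
import numpy as np
import tensorflow as tf

GRAPH_PATH = '../train/tf_files/retrained_graph.pb'
LABELS_PATH = '../train/tf_files/retrained_labels.txt'

# Load the retrained graph and the class labels (TF 1.x API).
graph = tf.Graph()
graph_def = tf.GraphDef()
with tf.gfile.GFile(GRAPH_PATH, 'rb') as f:
    graph_def.ParseFromString(f.read())
with graph.as_default():
    tf.import_graph_def(graph_def)
with open(LABELS_PATH) as f:
    labels = [line.strip() for line in f]

def classify(image_path):
    # Resize and normalize the image the way MobileNet expects.
    img = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, (224, 224)).astype(np.float32)
    batch = np.expand_dims((img - 128) / 128, axis=0)
    with tf.Session(graph=graph) as sess:
        results = sess.run('import/final_result:0',
                           feed_dict={'import/input:0': batch})
    # Print classes by descending probability, as in the output above.
    for i in np.argsort(results[0])[::-1]:
        print(labels[i], results[0][i])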