% Author: R. Delhome
% Date: 18/12/11
Several steps have to be accomplished:

The following folders must be created and maintained:

- `data/dataset/input/training/images` must contain raw training images
- `data/dataset/input/training/labels` must contain training labels, either as images or as text files (`json` and `geojson` are possible)
- `data/dataset/input/validation/images` must contain raw validation images
- `data/dataset/input/validation/labels` must contain validation labels
- `data/dataset/input/testing/images` must contain testing images
- `data/dataset/preprocessed/` will contain preprocessed material (images and labels) that will be used by the neural network models
- `data/dataset/output` will contain neural network outputs (trained models)
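
For a hypothetical dataset named `newdataset`, this layout can be bootstrapped with a few lines of Python. This is only a convenience sketch (it is not part of the repository):

```python
# Convenience sketch: create the expected folder layout for a hypothetical
# dataset named "newdataset".
import os

DATASET = "newdataset"  # replace with your dataset name
for folder in (
    f"data/{DATASET}/input/training/images",
    f"data/{DATASET}/input/training/labels",
    f"data/{DATASET}/input/validation/images",
    f"data/{DATASET}/input/validation/labels",
    f"data/{DATASET}/input/testing/images",
    f"data/{DATASET}/preprocessed",
    f"data/{DATASET}/output",
):
    os.makedirs(folder, exist_ok=True)
```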

- Create a class that inherits from `Dataset` (for the sake of clarity, declare it in a dedicated module) so as to describe the new dataset; a minimal sketch is given after this list
- Define the class generator by defining labels in the `Dataset` manner
- Define a `populate` method in which images are preprocessed, and exploitable images and labels are generated on the file system (image files with a fixed square size)
- Add the new module as a dependency in `datagen.py`
- Manage the new dataset creation in `datagen.py` (hint: search for all occurrences of `aerial` or `mapillary` to find the accurate place)
- Add the dataset name to the `AVAILABLE_DATASETS` variable in `deeposlandia/datasets/__init__.py`
- Consider a small sample of your data (less than 5 MB), and reproduce the previous steps in the `tests/data` folder
- Write unit tests (a sketch is given after this list) for:
  - dataset handling (see `tests/test_dataset.py` for examples)
  - generator verification (see `tests/test_generator.py` for examples)
- Train a neural network model with the newly created dataset:
  - use `paramoptim.py` to explore several hyperparametrizations and store the best model in `data/dataset/output/semantic_segmentation/checkpoints/`
  - alternatively, use the simpler `train.py` to train a single model; in such a case, you will have to manually copy the trained model from the instance folder to the global checkpoint folder and to create a `json` file that summarizes the model training parameters (a sketch is given after the example below)
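
As a rough illustration of the first steps (dedicated module, label definition, `populate` method), here is a minimal sketch for a hypothetical `newdataset`. The base-class details (constructor signature, `add_label` helper, `image_size` attribute, `populate` signature) are assumptions: mirror the existing `aerial` and `mapillary` modules and `deeposlandia/datasets/__init__.py` for the actual interface.

```python
# deeposlandia/datasets/newdataset.py -- illustrative sketch, not the actual API.
import os

from PIL import Image

from deeposlandia.datasets import Dataset  # import path assumed


class NewDataset(Dataset):
    """Hypothetical dataset with two labels: background and object."""

    def __init__(self, img_size):
        super().__init__(img_size)  # assumed base-class constructor
        # Declare the labels used by the generator (assumed helper name).
        self.add_label(0, "background", (0, 0, 0), is_evaluate=True)
        self.add_label(1, "object", (255, 255, 255), is_evaluate=True)

    def populate(self, output_dir, input_dir, nb_images=None):
        """Resize raw images to fixed squares and write them to output_dir."""
        out_image_dir = os.path.join(output_dir, "images")
        os.makedirs(out_image_dir, exist_ok=True)
        raw_image_dir = os.path.join(input_dir, "images")
        for i, filename in enumerate(sorted(os.listdir(raw_image_dir))):
            if nb_images is not None and i >= nb_images:
                break
            img = Image.open(os.path.join(raw_image_dir, filename))
            img = img.resize((self.image_size, self.image_size))  # assumed attribute
            img.save(os.path.join(out_image_dir, filename))
            # ...the corresponding label files would be preprocessed the same way...
```

The module then has to be registered, typically by adding `"newdataset"` to the `AVAILABLE_DATASETS` variable in `deeposlandia/datasets/__init__.py` and by instantiating `NewDataset` wherever `datagen.py` instantiates the `aerial` and `mapillary` datasets.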
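
A corresponding unit test sketch could look as follows; the attribute names and fixture paths are assumptions, so mirror `tests/test_dataset.py` and `tests/test_generator.py` for the project's actual idioms:

```python
# tests/test_newdataset.py -- illustrative sketch only.
from deeposlandia.datasets.newdataset import NewDataset  # hypothetical module


def test_newdataset_labels():
    """The dataset must expose the two declared labels."""
    dataset = NewDataset(64)
    assert len(dataset.labels) == 2  # 'labels' attribute is an assumption


def test_newdataset_population(tmpdir):
    """populate() must write one preprocessed image per raw image."""
    dataset = NewDataset(64)
    # 'tests/data/newdataset' stands for the small (<5 MB) sample mentioned above.
    dataset.populate(str(tmpdir), "tests/data/newdataset/input/training", nb_images=2)
    assert len(tmpdir.join("images").listdir()) == 2
```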

As an example that illustrates the required trained model files, in the `aerial` dataset case we have:

- `data/aerial/output/semantic_segmentation/checkpoints/best-model-250.h5`, which contains the trained model weights
- `data/aerial/output/semantic_segmentation/checkpoints/best-instance-250.json`, which contains a single dictionary with values of validation accuracy (`val_acc`), batch size (`batch_size`), network, dropout, learning rate (`learning_rate`) and learning rate decay (`learning_rate_decay`):

```json
{"val_acc": 0.9586366659402847, "batch_size": 20, "network": "unet", "dropout": 1.0, "learning_rate": 0.001, "learning_rate_decay": 1e-05}
```

- Link the app `static` folder to the image repository:
  - in a development environment, update the `config.ini` and `config.ini.sample` files: they will manage a symbolic link creation towards the images
  - in a production environment, add a bunch of images into a dedicated folder of your app repository that contains `images` and `labels` subfolders
- Add the dataset name in the `webapp/main.py` docstring (hint: search for all occurrences of `aerial` or `mapillary` to find the accurate places)
- Specify the depicted image size for the dataset in `webapp/main.py` (see the `recover_image_info` method); a sketch is given after this list
- Create a new `html` web page dedicated to the new dataset, on the model of the previous dataset pages
- Refer to this web page by updating `index.html`
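
For the image-size step, the branch to extend in `webapp/main.py` could conceptually look like the sketch below. This is purely illustrative: the actual `recover_image_info` implementation differs, and both the dataset name and the sizes are assumptions to be checked against the existing code.

```python
# Illustrative sketch only -- not the actual recover_image_info() body.
def image_size_for(dataset):
    """Return the size (in pixels) of the images depicted for a given dataset."""
    sizes = {
        "mapillary": 400,    # assumed value, check the existing code
        "aerial": 250,       # assumed value, check the existing code
        "newdataset": 512,   # hypothetical size of your preprocessed tiles
    }
    if dataset not in sizes:
        raise ValueError(f"Unknown dataset: {dataset}")
    return sizes[dataset]
```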