> Chapter 7: Training on Complex and Scarce Datasets

The first task when developing new recognition models is to gather and prepare the training dataset. Building pipelines that let the data flow smoothly during heavy training phases used to be an art, but TensorFlow's recent features make it quite straightforward to fetch and pre-process complex data, as demonstrated in the first notebooks of this chapter. Oftentimes, however, training data can simply be unavailable. The remaining notebooks tackle these scenarios, presenting a variety of solutions.
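As a quick illustration of what such pipelines look like, here is a minimal sketch of an optimized `tf.data` input pipeline (the file names, labels, and image size below are hypothetical placeholders, not values from the notebooks):

```python
import tensorflow as tf

# Hypothetical inputs: a list of image files and their integer labels.
image_paths = ["images/img_000.jpg", "images/img_001.jpg"]
labels = [0, 1]

def parse_fn(path, label):
    # Read the file, decode the JPEG, and resize/normalize the image:
    image = tf.io.read_file(path)
    image = tf.io.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, [224, 224]) / 255.
    return image, label

dataset = tf.data.Dataset.from_tensor_slices((image_paths, labels))
dataset = dataset.shuffle(buffer_size=1000)
# Decode/augment samples in parallel, letting TensorFlow tune the thread count:
dataset = dataset.map(parse_fn, num_parallel_calls=tf.data.experimental.AUTOTUNE)
dataset = dataset.batch(32)
# Prepare the next batches while the current one is consumed by the model:
dataset = dataset.prefetch(tf.data.experimental.AUTOTUNE)

# Such a dataset can be passed directly to Keras, e.g., model.fit(dataset, epochs=5)
```

Thanks to `num_parallel_calls` and `prefetch`, pre-processing runs in parallel with training, which is one of the main optimizations covered in notebook 7.1.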

📓 Notebooks

(Reminder: the notebooks are better visualized with nbviewer; open them via nbviewer.jupyter.org.)

  • 7.1 - Setting up Efficient Input Pipelines with tf.data
    • Harness the latest features of the tf.data API to set up optimized input pipelines to train models.
  • 7.2 - Generating and Parsing TFRecords
    • Discover how to convert complete datasets into TFRecords, and how to efficiently parse these files back (a short sketch follows this list).
  • 7.3 - (TBD) Rendering Images from 3D Models
    • Get a quick overview of 3D rendering with Python, using OpenGL-based vispy to generate a variety of images from 3D data.
  • 7.4 - (TBD) Applying Domain Adaptation Methods to Bridge the Realism Gap
    • Experiment with solutions such as DANN to train models on synthetic data, so that they can later be applied to real pictures.
  • 7.5 - (TBD) Creating Images with Variational Auto-Encoders (VAEs)
    • Implement a particular kind of auto-encoder able to generate new realistic-looking images.
  • 7.6 - (TBD) Creating Images with Generative Adversarial Networks (GANs)
    • Train a generative network against a discriminator network in an unsupervised manner, in order to augment datasets.

📄 Additional Files

  • cityscapes_utils.py: utility functions for the Cityscapes dataset (code presented in notebook 6.4).
  • fcn.py: functional implementation of the FCN-8s architecture (code presented in notebook 6.5).
  • keras_custom_callbacks.py: custom Keras callbacks to monitor the training of models (code presented in notebooks 4.1 and 6.2).
  • mnist_utils.py: utility functions for the MNIST dataset, using tensorflow-datasets (code presented in notebook 6.1).
  • plot_utils.py: utility functions to display results (code presented in notebook 6.2).
  • tf_losses_and_metrics.py: custom losses and metrics to train/evaluate CNNs (code presented in notebooks 6.5 and 6.6).
  • tf_math.py: custom mathematical functions reused in other scripts (code presented in notebooks 6.5 and 6.6).
  • unet.py: functional implementation of the U-Net architecture (code presented in notebook 6.3).