matthieudelaro edited this page Apr 26, 2016 · 5 revisions

Caffe is a popular deep learning framework used by numerous data scientists. Unfortunately, Caffe can be hard to compile due to its dependencies, which makes it an ideal use case for Nut. This page shows how to classify handwritten digits with Caffe.

Training a model with Caffe requires several steps:

  1. download sources from GitHub
  2. build Caffe from sources, with or without GPU support
  3. choose a model/neural network defined in a prototxt file
  4. define a solver for this model, with or without GPU support
  5. train the model with caffe, using the solver file and the model prototxt file

Here is how to do this with Nut (installation instructions):

cd examples/caffe    # This folder contains a `nut.yml` file with proper configuration for Caffe
nut download         # Download Caffe source code, and pull docker image (>2 GB: it may take some time)
nut build-cpu        # Build Caffe
nut train-mnist-cpu  # Train a neural network to classify MNIST dataset

Your model is training :) You should see something like:


I0426 15:42:52.848568    24 caffe.cpp:219] Starting Optimization
I0426 15:42:52.848577    24 solver.cpp:279] Solving LeNet
I0426 15:42:52.848580    24 solver.cpp:280] Learning Rate Policy: inv
I0426 15:42:52.848839    24 solver.cpp:337] Iteration 0, Testing net (#0)
I0426 15:42:53.850348    24 solver.cpp:404]     Test net output #0: accuracy = 0.1199
I0426 15:42:53.850389    24 solver.cpp:404]     Test net output #1: loss = 2.34175 (* 1 = 2.34175 loss)
I0426 15:42:53.858834    24 solver.cpp:228] Iteration 0, loss = 2.30074
I0426 15:42:53.858851    24 solver.cpp:244]     Train net output #0: loss = 2.30074 (* 1 = 2.30074 loss)
I0426 15:42:53.858875    24 sgd_solver.cpp:106] Iteration 0, lr = 0.01
I0426 15:42:55.300227    24 solver.cpp:228] Iteration 100, loss = 0.252988
I0426 15:42:55.300252    24 solver.cpp:244]     Train net output #0: loss = 0.252988 (* 1 = 0.252988 loss)
I0426 15:42:55.300259    24 sgd_solver.cpp:106] Iteration 100, lr = 0.00992565
I0426 15:42:56.741183    24 solver.cpp:228] Iteration 200, loss = 0.150994
I0426 15:42:56.741207    24 solver.cpp:244]     Train net output #0: loss = 0.150994 (* 1 = 0.150994 loss)
I0426 15:42:56.741214    24 sgd_solver.cpp:106] Iteration 200, lr = 0.00985258
I0426 15:42:58.182211    24 solver.cpp:228] Iteration 300, loss = 0.162226
I0426 15:42:58.182236    24 solver.cpp:244]     Train net output #0: loss = 0.162226 (* 1 = 0.162226 loss)
I0426 15:42:58.182242    24 sgd_solver.cpp:106] Iteration 300, lr = 0.00978075
I0426 15:42:59.624480    24 solver.cpp:228] Iteration 400, loss = 0.0768907
I0426 15:42:59.624507    24 solver.cpp:244]     Train net output #0: loss = 0.0768906 (* 1 = 0.0768906 loss)
I0426 15:42:59.624513    24 sgd_solver.cpp:106] Iteration 400, lr = 0.00971013
I0426 15:43:01.051076    24 solver.cpp:337] Iteration 500, Testing net (#0)
I0426 15:43:02.031441    24 solver.cpp:404]     Test net output #0: accuracy = 0.9735
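If you want to pull the accuracy values back out of a saved log (for instance after capturing the output with `2>&1 | tee`, a hypothetical invocation not shown above), a simple `grep` does the job. A minimal sketch against a two-line sample log:

```shell
# Write a two-line sample in the format of the Caffe log above (throwaway path).
printf 'solver.cpp:404] Test net output #0: accuracy = 0.1199\nsolver.cpp:404] Test net output #0: accuracy = 0.9735\n' > /tmp/caffe.log

# Extract just the accuracy values.
grep -o 'accuracy = [0-9.]*' /tmp/caffe.log
# prints:
# accuracy = 0.1199
# accuracy = 0.9735
```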

If you are familiar with neural networks, you may notice that training is slow; that is because we are training on the CPU. The good news is that on Linux you can leverage your GPUs inside a Docker container: Nut communicates with nvidia-docker-plugin to enable GPUs in your environments. Due to container limitations, Docker does not support graphics cards on OS X or Windows.

To use the GPUs with Nut, you need to:

  1. set up the drivers for your graphics card
  2. install CUDA (required by Caffe)
  3. install nvidia-docker-plugin
  4. make sure that nvidia-docker-plugin is running when you call Nut. You can check that `curl -s http://0.0.0.0:3476/v1.0/gpu/info` displays information about your graphics card.
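Step 4 can be scripted. The helper below is only a sketch (the function name `check_gpu_plugin` and the 2-second timeout are assumptions, not part of Nut); it probes the plugin's default REST endpoint with `curl`:

```shell
check_gpu_plugin() {
  # nvidia-docker-plugin listens on port 3476 by default.
  local url="${1:-http://0.0.0.0:3476/v1.0/gpu/info}"
  if curl -s --max-time 2 "$url" > /dev/null; then
    echo "nvidia-docker-plugin is reachable at $url"
  else
    echo "nvidia-docker-plugin is NOT reachable at $url"
  fi
}

check_gpu_plugin
```

If the plugin is up, the endpoint returns a JSON description of your GPUs.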

Then run Nut:

cd examples/caffe    # This folder contains a `nut.yml` file with proper configuration for Caffe
nut download         # Download Caffe source code, and pull docker image (>2 GB: it may take some time)
nut build-gpu        # Build Caffe with GPU support
nut train-mnist-gpu  # Train a neural network to classify MNIST dataset, on GPU

download, build-cpu, and train-mnist-cpu are macros defined in nut.yml:

syntax_version: "5"
project_name: caffe
based_on:
  docker_image: matthieudelaro/work-on-caffe  # this image is pulled automatically from Docker Hub
container_working_directory: /opt/caffe
enable_nvidia_devices: true  # enable GPU support. You can leave it even if you don't have a GPU: you will just get an error message.
mount:
  caffe:
  - ./caffe
  - /opt/caffe
  dataset: # if you store your datasets in another folder, add them this way
  - /path/to/dataset
  - /dataset
macros:
  download:
    usage: download caffe
    actions:
    - git clone https://github.com/BVLC/caffe.git .
    - cp Makefile.config.example Makefile.config
  build-cpu:
    usage: build the project in CPU mode only (sets CPU_ONLY in Makefile.config on the fly)
    actions:
    - sed -i 's/# CPU_ONLY := 1/CPU_ONLY := 1/' Makefile.config
    - make all -j8
    - echo "/opt/caffe/.build_release/lib/" >> /etc/ld.so.conf.d/caffe-ld-so.conf
    - ldconfig
  train-mnist-cpu:
    usage: attempts to train MNIST in CPU mode only (sets solver_mode to CPU in examples/mnist/lenet_solver.prototxt on the fly)
    actions:
    - "sed -i 's/solver_mode: GPU/solver_mode: CPU/' examples/mnist/lenet_solver.prototxt"
    - ./data/mnist/get_mnist.sh
    - ./examples/mnist/create_mnist.sh
    - caffe train --solver=examples/mnist/lenet_solver.prototxt
  # other macros
  # ...
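The `sed` lines in the macros above are how Nut toggles CPU mode without hand-editing files: `build-cpu` uncomments `CPU_ONLY := 1` in `Makefile.config`, and `train-mnist-cpu` rewrites `solver_mode` in the solver prototxt. A stand-alone demonstration of the same substitution on a throwaway file (GNU `sed -i`, as used inside the Linux container):

```shell
# Reproduce the build-cpu toggle on a scratch copy of the relevant line.
printf '# CPU_ONLY := 1\n' > /tmp/Makefile.config.demo
sed -i 's/# CPU_ONLY := 1/CPU_ONLY := 1/' /tmp/Makefile.config.demo
cat /tmp/Makefile.config.demo   # prints: CPU_ONLY := 1
```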

If you run nut in examples/caffe, you will notice that other macros have been defined:

NAME:
   nut - the development environment, containerized

USAGE:
   nut [global options] command [command options] [arguments...]
   
VERSION:
   0.1.0 dev
   
COMMANDS:
   build            macro: build the project
   build-cpu        macro: build the project in CPU mode only (sets CPU_ONLY in Makefile.config on the fly)
   build-gpu        macro: build the project in GPU mode (unsets CPU_ONLY in Makefile.config on the fly)
   build-pycaffe    macro: build pycaffe
   download         macro: download caffe
   test             macro: run the tests
   test-cpu         macro: run the tests in CPU mode only (sets CPU_ONLY in Makefile.config on the fly)
   test-gpu         macro: run the tests in GPU mode (unsets CPU_ONLY in Makefile.config on the fly)
   train-mnist-cpu  macro: attempts to train MNIST in CPU mode only (sets solver_mode to CPU in examples/mnist/lenet_solver.prototxt on the fly)
   train-mnist-gpu  macro: attempts to train MNIST in GPU mode (sets solver_mode to GPU in examples/mnist/lenet_solver.prototxt on the fly)

So you may want to run `nut test-gpu` to test GPU support. You can also define a new `train` macro to train your own model.
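A custom `train` macro could look like the sketch below; the solver path `models/my_model/solver.prototxt` is a placeholder for your own files, not something shipped with Caffe:

```yaml
macros:
  train:
    usage: train my own model
    actions:
    - caffe train --solver=models/my_model/solver.prototxt
```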

Thanks to containers, you installed Caffe and all of its dependencies without messing up the configuration of your computer. The Docker image does take some space on your hard drive, though. The good news is that if you don't need Caffe daily, you can simply remove it with `docker rmi matthieudelaro/work-on-caffe`. The next time you use Nut for this project, the Docker image will be pulled from Docker Hub automatically.
