OpenFrameworks addon for Google's graph-based machine intelligence / deep learning library TensorFlow.
This update includes the newly released TensorFlow r1.1 and has been tested with openFrameworks 0.9.8.
I provide precompiled libraries for Linux and OSX (though OSX might lag a little bit behind as I don't have regular access). For Linux there are both GPU and CPU-only libs, while OSX is CPU-only. I haven't touched Windows yet as building from sources is 'experimental' (and doing Linux and OSX was painful enough).
You can find instructions and more information in the wiki, particularly for Getting Started.
TensorFlow is written in C/C++ with python bindings, and most of the documentation and examples are for python. This addon wraps the C/C++ backend (and a little bit of the new C++ FrontEnd) with a number of examples. The basic idea is:
1. Build and train graphs (i.e. 'networks', 'models') mostly in python (possibly Java, C++ or any other language/platform with tensorflow bindings)
2. Save the trained models to binary files
3. Load the trained models in openframeworks, feed data, manipulate, get results, play, and connect to the ofUniverse
You could potentially do steps 1-2 in openframeworks as well, but the python API is more user-friendly for building graphs and training.
The examples are quite minimal and shouldn't be considered comprehensive tensorflow tutorials. They demonstrate loading and manipulating different types of tensorflow models in openFrameworks. E.g.
- for the most basic example of loading a model, feeding it data and fetching the results (using just a low level C API), see example-basic
- for a very simple barebones Image-to-Image example (loading a model, feeding it an image, and fetching an image using a higher level C++ API) see example-pix2pix-simple - This is probably the best minimal template for other examples
- for more complex Image-to-Image examples (with Conditional Generative Adversarial Networks) see example-pix2pix or example-pix2pix-webcam
- for style transfer see example-style-transfer
- for image classification see example-mnist or example-inception3
- for sequence generation of discrete data such as text (with stateful LSTM/RNN, where LSTM state is retrieved and passed back in at every time-step) see example-char-rnn
- for sequence generation of continuous data such as handwriting (with Recurrent Mixture Density Networks) see example-handwriting-rnn
- for image generation (with Conditional Generative Adversarial Networks) see example-pix2pix or example-pix2pix-webcam
- for constructing graphs in C++ see example-build-graph
Potentially you could load any pretrained model in openframeworks and manipulate it. E.g. check out Parag's tutorials and Kadenze course. There's info in the wiki on how to export and distribute models.
(Note: the animations below are animated-gifs, hence the low color count, dithering, and low framerate)
Same as example-pix2pix with the addition of live webcam input. See the description of example-pix2pix for more info on pix2pix and the models I provide. I'm using a very simple and crude method of transforming the webcam input into the desired colour palette before feeding it into the model. See the code for more info on this.
pix2pix (Image-to-Image Translation with Conditional Adversarial Nets). An accessible explanation can be found here and here. The network basically learns to map from one image to another. E.g. in the example you draw in the left viewport, and it generates the image in the right viewport. I'm supplying three pretrained models from the original paper (cityscapes, building facades, and maps), plus a model I trained on 150 art collections from around the world. Models are trained and saved in python with this code (which is based on this tensorflow implementation, which is based on the original torch implementation), and loaded in openframeworks for prediction.
This is the simplest pix2pix example with no interaction. The purpose of this example is to show the most barebones way of using the msa::tf::SimpleModel API.
Fast Style Transfer from Logan Engstrom. This realtime webcam openFrameworks example is by Ole Kristensen, who also modified the python evaluate.py script to export a graph in protobuf format for use with the C++ TF implementation. Ole has a fork of Engstrom's repo that does the ugly varhack tricks needed to restore the graph variables for you. Note that when you want to use your own models, you have to evaluate (style) one image of the same resolution as the one you want to feed in your openFrameworks app. This makes evaluate.py export an of.pb file that you can load from your ofApp.
@misc{engstrom2016faststyletransfer,
author = {Logan Engstrom},
title = {Fast Style Transfer},
year = {2016},
howpublished = {\url{https://github.com/lengstrom/fast-style-transfer/}},
note = {commit xxxxxxx}
}
Generative handwriting with Long Short-Term Memory (LSTM) Recurrent Mixture Density Network (RMDN), à la Graves2013. Brilliant tutorial on inner workings here, which also provides the base for the training code (also see JavaScript port and tutorial here). Models are trained and saved in python with this code, and loaded in openframeworks for prediction. Given a sequence of points, the model predicts the position for the next point and pen-up probability. I'm supplying a model pretrained on the IAM online handwriting dataset. Note that this demo does not do handwriting synthesis, i.e. text to handwriting à la Graves' original demo. It just does asemic handwriting, producing squiggles that are statistically similar to the training data, e.g. same kinds of slants, curvatures, sharpnesses etc., but not necessarily legible. There is an implementation (and great tutorial) of synthesis using attention here, which I am also currently converting to work in openframeworks. This attention-based synthesis implementation is also based on Graves2013, which I highly recommend to anyone really interested in understanding generative RNNs.
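To make the "predicts the position for the next point and pen-up probability" step a bit more concrete, here is a minimal sketch of how a next point could be sampled from mixture density outputs. It assumes the network outputs a mixture of correlated 2D Gaussians plus a single pen-up probability, as in Graves2013; the function name and parameter layout are made up for illustration and are not taken from the example's code.

```python
import math
import random

def sample_next_point(pi, mu, sigma, rho, pen_up_prob, rng):
    """Sample (x, y, pen_up) from mixture density outputs.

    Hypothetical layout, one entry per mixture component:
    pi          : mixture weights, summing to 1
    mu          : list of (mean_x, mean_y)
    sigma       : list of (std_x, std_y)
    rho         : correlation between x and y, each in (-1, 1)
    pen_up_prob : probability that the pen lifts after this point
    """
    # 1. Pick a mixture component according to its weight.
    k = rng.choices(range(len(pi)), weights=pi, k=1)[0]
    (mx, my), (sx, sy), r = mu[k], sigma[k], rho[k]
    # 2. Draw from that component's correlated 2D Gaussian,
    #    built from two independent standard normals.
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x = mx + sx * z1
    y = my + sy * (r * z1 + math.sqrt(1.0 - r * r) * z2)
    # 3. Draw the pen state from a Bernoulli distribution.
    pen_up = rng.random() < pen_up_prob
    return x, y, pen_up

# Example call with two made-up components.
rng = random.Random(0)
point = sample_next_point(
    pi=[0.7, 0.3],
    mu=[(0.5, 0.0), (-0.2, 1.0)],
    sigma=[(0.1, 0.1), (0.3, 0.2)],
    rho=[0.0, 0.5],
    pen_up_prob=0.05,
    rng=rng,
)
```

Sampling rather than taking the most likely component is what gives the squiggles their variety; the sampled point is then fed back in as the next input.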
Generative character-based Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) demo, à la Karpathy's char-rnn and Graves2013. Models are trained and saved in python with this code and loaded in openframeworks for prediction. I'm supplying a bunch of models (bible, cooking, erotic, linux, love songs, shakespeare, trump), and while the text is being generated character by character (at 60fps!) you can switch models in realtime mid-sentence or mid-word. (Drop more trained models into the folder and they'll be loaded too). Typing on the keyboard also primes the system, so it'll try to complete based on what you type. This is a simplified version of what I explain here, where models can be mixed as well. (Note, all models are trained really quickly with no hyperparameter search or cross validation, using the default architecture of a 2 layer LSTM of size 128 with no dropout or any other regularisation. So they're not great. A bit of hyperparameter tuning would give much better results - but note that would be done in python. The openframeworks code won't change at all, it'll just load the better model).
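The character-by-character loop, with priming and with state passed back in at every step, can be sketched roughly as follows. This is a toy illustration only: `toy_step` is a hypothetical stand-in for one forward pass of a loaded model (its "state" is just a step counter), and none of these names come from the addon.

```python
import random

VOCAB = "abc "

def toy_step(char_index, state):
    """Hypothetical stand-in for one forward pass of a trained LSTM.

    Returns (probabilities over VOCAB, new state). A real model would
    run the loaded graph here; this toy just counts steps in `state`
    and favours the character after the current one.
    """
    new_state = state + 1
    weights = [1.0] * len(VOCAB)
    weights[(char_index + 1) % len(VOCAB)] = 5.0
    total = sum(weights)
    return [w / total for w in weights], new_state

def sample_index(probs, rng):
    """Draw an index from a discrete distribution (inverse CDF)."""
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1  # guard against floating point round-off

def generate(step_fn, prime, length, rng):
    """Prime the model with `prime`, then generate `length` characters."""
    state = 0
    # With no priming text, start from a uniform distribution.
    probs = [1.0 / len(VOCAB)] * len(VOCAB)
    # Priming: feed the typed text through, keeping the resulting state.
    for ch in prime:
        probs, state = step_fn(VOCAB.index(ch), state)
    out = []
    for _ in range(length):
        idx = sample_index(probs, rng)
        out.append(VOCAB[idx])
        # The state returned by this step is the input state of the next.
        probs, state = step_fn(idx, state)
    return "".join(out), state

text, final_state = generate(toy_step, prime="ab", length=10, rng=random.Random(0))
```

Because the loop only ever calls `step_fn`, swapping in a different step function between iterations is the toy equivalent of switching models mid-word.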
MNIST (digit) classification with two different models - shallow and deep. Both models are built and trained in python (py src in bin/py folder). Openframeworks loads the trained models, allows you to draw with your mouse, and tries to classify your drawing. Toggle between the two models with the 'm' key.
Single layer softmax regression: Very simple multinomial logistic regression. Quick'n'easy but not very good. Trains in seconds. Accuracy on test set ~90%. Implementation of https://www.tensorflow.org/versions/0.6.0/tutorials/mnist/beginners/index.html
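For reference, the prediction step of a single-layer softmax regression boils down to softmax(Wx + b). The sketch below shows this in plain Python with made-up toy weights (3 classes, 2 inputs instead of 10 classes and 784 pixels); it is an illustration of the technique, not the code used in the example, and the function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, biases, pixels):
    """Single-layer softmax regression: softmax(W x + b).

    weights : one row of per-pixel weights for each class
    biases  : one bias per class
    pixels  : flattened input image
    """
    logits = [sum(w * p for w, p in zip(row, pixels)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(logits)
    # The predicted class is the index with the highest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs

# Toy weights for illustration only.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
label, probs = predict(W, b, [0.2, 0.9])
```

Training consists of adjusting `W` and `b` to minimise cross-entropy on the training set, which is the part done in python before the model is saved.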
Deep(ish) Convolutional Neural Network: Basic convolutional neural network. Very similar to LeNet. Conv layers, maxpools, ReLUs etc. Slower and heavier than above, but much better. Trains in a few minutes (on CPU). Accuracy 99.2%. Implementation of https://www.tensorflow.org/versions/0.6.0/tutorials/mnist/pros/index.html#build-a-multilayer-convolutional-network
Openframeworks implementation for image recognition using Google's 'Inception-v3' architecture network, pre-trained on ImageNet. Background info at https://www.tensorflow.org/versions/0.6.0/tutorials/image_recognition/index.html
Just some unit tests. Very boring for most humans. Possibly exciting for computers (or humans that get excited at the thought of computers going wrong).
Simplest example possible. A very simple graph that multiplies two numbers is built in python and saved. The openframeworks example loads the graph, and feeds it mouse coordinates. 100s of lines of code, just to build a simple multiplication function.
Builds a simple graph from scratch directly in openframeworks using the C++ API without any python. Really not very exciting to look at, more of a syntax demo than anything. Based on https://www.tensorflow.org/api_guides/cc/guide