---
layout: splash
permalink: binary-impl:output_ext
title: "Implementations for binary classifiers"
---
This page explains various ways of implementing single-layer and multi-layer neural networks, as supplementary material for this lecture. The implementations are presented in order from explicit to abstract, so that one can understand the internal processing that deep learning frameworks hide in black boxes.
In order to focus on the internals, this page uses a simple and classic example: threshold logic units.
Suppose two binary inputs, $x_1, x_2 \in \{0, 1\}$. The truth table below defines the outputs of the AND, OR, NAND, and XOR gates.
| $x_1$ | $x_2$ | AND | OR | NAND | XOR |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 1 | 0 |
| 0 | 1 | 0 | 1 | 1 | 1 |
| 1 | 0 | 0 | 1 | 1 | 1 |
| 1 | 1 | 1 | 1 | 0 | 0 |
We consider a single-layer perceptron that predicts a binary label $\hat{y} \in \{0, 1\}$ for an input vector $\boldsymbol{x} = (x_1, x_2)$,

$$ \hat{y} = g(\boldsymbol{w} \cdot \boldsymbol{x} + b) . $$

Here, $\boldsymbol{w}$ is a weight vector, $b$ is a bias weight, and $g(\cdot)$ is the Heaviside step function,

$$ g(a) = \begin{cases} 1 & (a \geq 0) \\ 0 & (a < 0) \end{cases} . $$
Let's train a NAND gate with two inputs ($x_1$ and $x_2$) and one output ($y$). We convert the truth table into a training set consisting of all input-output mappings of the NAND gate,

$$ \mathcal{D} = \{ ((0, 0), 1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0) \} . $$
In order to train the weight vector and bias weight in unified code, we include the bias term as an additional dimension of the input. More concretely, we append a constant $1$ to every input vector, $\boldsymbol{x} = (x_1, x_2, 1)$, and fold the bias weight into the weight vector, $\boldsymbol{w} = (w_1, w_2, b)$. Then, the formula of the single-layer perceptron becomes,

$$ \hat{y} = g(\boldsymbol{w} \cdot \boldsymbol{x}) . $$

In other words, the bias weight becomes an ordinary element of $\boldsymbol{w}$, the weight for the constant input $1$, so we no longer need to update it separately.
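For concreteness, a minimal NumPy sketch of this preparation (the array names are illustrative):

```python
import numpy as np

# Inputs and outputs of the NAND truth table, one row per training instance.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([1, 1, 1, 0], dtype=float)

# Append the constant 1 to every input vector for the bias term.
X = np.hstack([X, np.ones((4, 1))])
print(X)    # [[0. 0. 1.] [0. 1. 1.] [1. 0. 1.] [1. 1. 1.]]
```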
The code below implements Rosenblatt's perceptron algorithm with a fixed number of iterations (100). In each iteration, the algorithm makes a prediction $\hat{y}$ for a training instance and updates the weight vector with the perceptron rule, $\boldsymbol{w} \leftarrow \boldsymbol{w} + \eta (y - \hat{y}) \boldsymbol{x}$. We use a constant learning rate $\eta = 0.5$ for simplicity.
{% include notebook/binary/slp_rosenblatt.md %}
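A condensed sketch of the same algorithm (this mirrors the idea of the notebook above, not its exact code):

```python
import numpy as np

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
Y = np.array([1, 1, 1, 0], dtype=float)

eta = 0.5              # constant learning rate
w = np.zeros(3)        # weights for x1, x2, and the bias term

for t in range(100):   # fixed number of iterations
    for x, y in zip(X, Y):
        y_hat = np.heaviside(np.dot(w, x), 1)  # predict with the step function
        w += eta * (y - y_hat) * x             # Rosenblatt's update rule

print(w)               # a weight vector realizing NAND, e.g., [-1.  -0.5  1. ]
```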
It is better to reduce the amount of computation executed by the Python interpreter, which is relatively slow. A common technique for speeding up machine-learning code written in Python is to execute computations within a matrix library (e.g., NumPy).
The single-layer perceptron makes predictions for the four inputs, $\boldsymbol{x}^{(1)}, \dots, \boldsymbol{x}^{(4)}$. Here, we define a matrix $X$ whose rows are the four input vectors (with bias terms appended),

$$ X = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix} . $$

Then, we can write the four predictions in one dot-product computation,

$$ \hat{Y} = g(X \cdot \boldsymbol{w}) , $$

where the step function $g(\cdot)$ is applied element-wise.
The code below implements this idea. The function np.heaviside() applies the step function to every element of its argument, yielding a vector of the four predictions. This technique is frequently used in mini-batch training, where gradients are computed for a small number of instances (e.g., 4 to 128) at a time.
{% include notebook/binary/slp_rosenblatt_batch.md %}
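The same prediction as a self-contained snippet, using a weight vector obtained by the training above:

```python
import numpy as np

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
w = np.array([-1.0, -0.5, 1.0])    # one weight vector realizing NAND

# One dot product computes all four pre-activations at once;
# np.heaviside() then applies the step function element-wise.
Y_hat = np.heaviside(np.dot(X, w), 1)
print(Y_hat)                       # [1. 1. 1. 0.]
```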
Next, we consider a single-layer feedforward neural network with a sigmoid activation function. In essence, we replace the Heaviside step function with the sigmoid function when predicting $\hat{y}$,

$$ \hat{y} = \sigma(\boldsymbol{w} \cdot \boldsymbol{x}) = \frac{1}{1 + \exp(-\boldsymbol{w} \cdot \boldsymbol{x})} . $$

Because the sigmoid function is differentiable, we can train the weight vector by minimizing a loss function with stochastic gradient descent (SGD). The code below implements this training with NumPy.
{% include notebook/binary/slp_sgd_numpy.md %}
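A minimal sketch of such training with the binary cross-entropy loss, whose gradient for one instance is $(\hat{y} - y)\,\boldsymbol{x}$ (the hyperparameters are illustrative):

```python
import numpy as np

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
Y = np.array([1, 1, 1, 0], dtype=float)

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

rng = np.random.default_rng(0)
w = rng.normal(size=3)              # random initialization
eta = 0.5

for t in range(1000):
    i = rng.integers(4)             # pick one instance at random (SGD)
    x, y = X[i], Y[i]
    y_hat = sigmoid(np.dot(w, x))
    w -= eta * (y_hat - y) * x      # gradient of the cross-entropy loss

print(sigmoid(np.dot(X, w)))        # predictions approach [1 1 1 0]
```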
Deriving gradients by hand becomes tedious and error-prone as models grow. Deep learning frameworks therefore provide automatic differentiation, which computes gradients from the code of the forward computation. The notebooks below demonstrate automatic differentiation with Autograd, PyTorch, Chainer, TensorFlow, and MXNet.

{% include notebook/binary/ad_autograd.md %}
{% include notebook/binary/ad_pytorch.md %}
{% include notebook/binary/ad_chainer.md %}
{% include notebook/binary/ad_tensorflow.md %}
{% include notebook/binary/ad_mxnet.md %}
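Although the syntax differs across frameworks, the pattern is shared: build a computation on tensors that record operations, then request gradients. A minimal PyTorch illustration (the concrete numbers are arbitrary):

```python
import torch

w = torch.tensor([0.5, -0.5, 0.0], requires_grad=True)  # tracked weights
x = torch.tensor([1.0, 1.0, 1.0])   # an input with the bias term appended
y = torch.tensor(0.0)               # the target label

y_hat = torch.sigmoid(torch.dot(w, x))
loss = -(y * torch.log(y_hat) + (1 - y) * torch.log(1 - y_hat))
loss.backward()                     # backpropagation fills w.grad

print(w.grad)                       # (y_hat - y) * x = [0.5, 0.5, 0.5]
```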
The notebooks below train the single-layer neural network, replacing the hand-derived gradients with automatic differentiation.

{% include code.html tt1="PyTorch" tc1="notebook/binary/slp_ad_pytorch.md" tt2="Chainer" tc2="notebook/binary/slp_ad_chainer.md" tt3="TensorFlow" tc3="notebook/binary/slp_ad_tensorflow.md" tt4="MXNet" tc4="notebook/binary/slp_ad_mxnet.md" %}
A single-layer network cannot realize the XOR gate because XOR is not linearly separable. A multi-layer perceptron with a hidden layer can; the notebooks below train one in each framework.

{% include code.html tt1="PyTorch" tc1="notebook/binary/mlp_ad_pytorch.md" tt2="Chainer" tc2="notebook/binary/mlp_ad_chainer.md" tt3="TensorFlow" tc3="notebook/binary/mlp_ad_tensorflow.md" tt4="MXNet" tc4="notebook/binary/mlp_ad_mxnet.md" %}
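A minimal sketch of such a multi-layer perceptron for the XOR mapping, with explicit parameters updated by hand from autodiff gradients (sizes and hyperparameters are illustrative):

```python
import torch

X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = torch.tensor([[0.], [1.], [1.], [0.]])      # XOR outputs

# Explicit parameters: two hidden units and one output unit.
W1 = torch.randn(2, 2, requires_grad=True)
b1 = torch.zeros(2, requires_grad=True)
W2 = torch.randn(2, 1, requires_grad=True)
b2 = torch.zeros(1, requires_grad=True)

eta = 0.5
for t in range(3000):
    H = torch.sigmoid(X @ W1 + b1)              # hidden layer
    Y_hat = torch.sigmoid(H @ W2 + b2)          # output layer
    loss = torch.nn.functional.binary_cross_entropy(Y_hat, Y)
    loss.backward()                             # autodiff computes all gradients
    with torch.no_grad():
        for p in (W1, b1, W2, b2):
            p -= eta * p.grad                   # manual SGD update
            p.grad = None

# Approaches [0, 1, 1, 0] for most initializations.
print(Y_hat.detach().squeeze())
```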
The frameworks also provide abstractions for assembling models from predefined layers. The notebooks below build the single-layer network with the Sequential container.

{% include code.html tt1="PyTorch" tc1="notebook/binary/slp_pytorch_sequential.md" tt2="Chainer" tc2="notebook/binary/slp_chainer_sequential.md" %}
Its multi-layer counterpart:

{% include code.html tt1="PyTorch" tc1="notebook/binary/mlp_pytorch_sequential.md" tt2="Chainer" tc2="notebook/binary/mlp_chainer_sequential.md" %}
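For example, the multi-layer perceptron can be declared in one expression with torch.nn.Sequential in PyTorch; the layer sizes below mirror the sketch above:

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(2, 2),    # input -> hidden (weights and bias included)
    torch.nn.Sigmoid(),
    torch.nn.Linear(2, 1),    # hidden -> output
    torch.nn.Sigmoid(),
)

print(model(torch.tensor([[1., 0.]])))   # an (untrained) prediction
```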
Instead of writing parameter updates by hand, we can delegate them to optimizer (or trainer) abstractions: torch.optim in PyTorch, optimizers in Chainer, Keras in TensorFlow, and Trainer in MXNet.

{% include code.html tt1="PyTorch" tc1="notebook/binary/slp_pytorch_sequential_optim.md" tt2="Chainer" tc2="notebook/binary/slp_chainer_sequential_optimizers.md" tt3="TensorFlow" tc3="notebook/binary/slp_tensorflow_keras.md" tt4="MXNet" tc4="notebook/binary/slp_mxnet_sequential_trainer.md" %}
And the multi-layer perceptron:

{% include code.html tt1="PyTorch" tc1="notebook/binary/mlp_pytorch_sequential_optim.md" tt2="Chainer" tc2="notebook/binary/mlp_chainer_sequential_optimizers.md" tt3="TensorFlow" tc3="notebook/binary/mlp_tensorflow_keras.md" tt4="MXNet" tc4="notebook/binary/mlp_mxnet_trainer.md" %}
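A minimal sketch of an optimizer-driven loop in PyTorch, combining the Sequential model with torch.optim.SGD (hyperparameters are illustrative):

```python
import torch

X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = torch.tensor([[0.], [1.], [1.], [0.]])      # XOR outputs

model = torch.nn.Sequential(
    torch.nn.Linear(2, 2), torch.nn.Sigmoid(),
    torch.nn.Linear(2, 1), torch.nn.Sigmoid(),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)
criterion = torch.nn.BCELoss()

for t in range(3000):
    optimizer.zero_grad()            # clear accumulated gradients
    loss = criterion(model(X), Y)
    loss.backward()                  # compute gradients
    optimizer.step()                 # update all parameters

print(model(X).detach().squeeze())   # approaches [0, 1, 1, 0] on most runs
```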
Finally, it is common to define a model as a class, which scales better to complex architectures. The notebooks below define the single-layer network as a class.

{% include code.html tt1="PyTorch" tc1="notebook/binary/slp_pytorch_class.md" tt2="Chainer" tc2="notebook/binary/slp_chainer_class.md" %}
The multi-layer perceptron defined as a class:

{% include code.html tt1="PyTorch" tc1="notebook/binary/mlp_pytorch_class.md" tt2="Chainer" tc2="notebook/binary/mlp_chainer_class.md" %}
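A minimal sketch of the class-based style in PyTorch, where the model subclasses torch.nn.Module and defines its forward computation (names are illustrative):

```python
import torch

class MLP(torch.nn.Module):
    def __init__(self, num_input, num_hidden, num_output):
        super().__init__()
        self.layer1 = torch.nn.Linear(num_input, num_hidden)
        self.layer2 = torch.nn.Linear(num_hidden, num_output)

    def forward(self, x):
        # Two affine layers, each followed by a sigmoid activation.
        h = torch.sigmoid(self.layer1(x))
        return torch.sigmoid(self.layer2(h))

model = MLP(2, 2, 1)
print(model(torch.tensor([[0., 1.]])))   # usable exactly like Sequential
```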
<script src="https://code.jquery.com/jquery-3.3.1.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.12.0/highlight.min.js"></script>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({ tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]} });
</script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
<script>
$(document).ready(function() {
  $('pre code[class="language-python"]').each(function(i, block) {
    hljs.highlightBlock(block);
  });
});
</script>