spiral_main.py loads the training data from spirals.csv, applies the specified model and produces a graph of the resulting function, along with the data. For this task there is no test set as such, but we instead judge the generalization by plotting the function computed by the network and making a visual assessment.
-
Provide code for a PyTorch Module called PolarNet which operates as follows: First, the input (x,y) is converted to polar coordinates (r,a) with r=sqrt(x*x + y*y), a=atan2(y,x). Next, (r,a) is fed into a fully connected neural network with one hidden layer using tanh activation, followed by a single output using sigmoid activation. The conversion to polar coordinates should be included in your forward() method, so that the Module performs the entire task of conversion followed by network layers.
- [1 mark]
Run the code by typing
python3 spiral_main.py --net polar --hid 10
Try to find the minimum number of hidden nodes required so that this PolarNet learns to correctly classify all of the training data within 20000 epochs, on almost all runs. The graph_output() method will generate a picture of the function computed by your PolarNet called polar_out.png, which you should include in your report.
-
Provide code for a PyTorch Module called RawNet which operates on the raw input (x,y) without converting to polar coordinates. Your network should consist of two fully connected hidden layers with tanh activation, plus the output layer, with sigmoid activation. You should not use Sequential but should instead build the network from individual components as shown in the program xor.py from Exercises 5 (repeated in slide 4 of lecture slides 3b on PyTorch). The number of neurons in both hidden layers should be determined by the parameter num_hid.
-
Run the code by typing
python3 spiral_main.py --net raw
Keeping the number of hidden nodes in each layer fixed at 10, try to find a value for the size of the initial weights (--init) such that this RawNet learns to correctly classify all of the training data within 20000 epochs, on almost all runs. Include in your report the number of hidden nodes, and the values of any other metaparameters. The graph_output() method will generate a picture of the function computed by your RawNet called raw_out.png, which you should include in your report.
-
Provide code for a PyTorch Module called ShortNet which again operates on the raw input (x,y) without converting to polar coordinates. This network should again consist of two hidden layers (with tanh activation) plus the output layer (with sigmoid activation), but this time should include short-cut connections between every pair of layers (input, hid1, hid2 and output) as depicted on slide 10 of lecture slides 3a on Hidden Unit Dynamics. Note, however, that this diagram shows only two hidden nodes in each layer, which is not enough to learn the task; in your code the number of neurons in both hidden layers should be determined by the parameter num_hid.
-
Run the code by typing
python3 spiral_main.py --net short
You should experiment to find a good value for the initial weight size, and try to find the minimum number of hidden nodes per layer so that this ShortNet learns to correctly classify all of the training data within 20000 epochs, on almost all runs. Include in your report the number of hidden nodes per layer, as well as the initial weight size and any other metaparameters. The graph_output() method will generate a picture of the function computed by your ShortNet called short_out.png, which you should include in your report.
-
Using graph_output() as a guide, write a method called graph_hidden(net, layer, node) which plots the activation (after applying the tanh function) of the hidden node with the specified number (node) in the specified layer (1 or 2). (Note: if net is of type PolarNet, graph_hidden() only needs to behave correctly when layer is 1.)
Hint: you might need to modify forward() so that the hidden unit activations are retained, i.e. replace hid1 = torch.tanh(...) with self.hid1 = torch.tanh(...)
Use this code to generate plots of all the hidden nodes in PolarNet, and all the hidden nodes in both layers of RawNet and ShortNet, and include them in your report.
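
As a rough illustration of the PolarNet item above, the following is a minimal sketch rather than a reference solution. It assumes the input is a tensor of shape (N, 2) holding (x,y) pairs and that the constructor receives the hidden layer size as num_hid; the layer names in_to_hid and hid_to_out are hypothetical.

```python
import torch
import torch.nn as nn


class PolarNet(nn.Module):
    """Sketch: polar conversion, one tanh hidden layer, one sigmoid output."""

    def __init__(self, num_hid):
        super().__init__()
        # Layer names are illustrative, not taken from the assignment code.
        self.in_to_hid = nn.Linear(2, num_hid)   # (r, a) -> hidden layer
        self.hid_to_out = nn.Linear(num_hid, 1)  # hidden layer -> single output

    def forward(self, input):
        # Assumed input layout: column 0 is x, column 1 is y.
        x, y = input[:, 0], input[:, 1]
        r = torch.sqrt(x * x + y * y)
        a = torch.atan2(y, x)
        polar = torch.stack((r, a), dim=1)       # shape (N, 2)
        # Keep the hidden activations so they can be plotted later.
        self.hid1 = torch.tanh(self.in_to_hid(polar))
        return torch.sigmoid(self.hid_to_out(self.hid1))
```

Calling PolarNet(10) on a batch of shape (N, 2) should return one value between 0 and 1 per row, which can be thresholded at 0.5 to obtain a class label.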
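
The RawNet item above might be sketched as follows, built from individual layers rather than Sequential. The layer names are hypothetical, and the init_weights helper is purely an assumption about what an initial weight size could mean (the standard deviation of a zero-mean normal); spiral_main.py may handle --init differently.

```python
import torch
import torch.nn as nn


class RawNet(nn.Module):
    """Sketch: two tanh hidden layers of equal size, one sigmoid output."""

    def __init__(self, num_hid):
        super().__init__()
        # Layer names are illustrative, not taken from the assignment code.
        self.in_to_hid1 = nn.Linear(2, num_hid)
        self.hid1_to_hid2 = nn.Linear(num_hid, num_hid)
        self.hid2_to_out = nn.Linear(num_hid, 1)

    def forward(self, input):
        # Hidden activations are stored on self so they can be plotted later.
        self.hid1 = torch.tanh(self.in_to_hid1(input))
        self.hid2 = torch.tanh(self.hid1_to_hid2(self.hid1))
        return torch.sigmoid(self.hid2_to_out(self.hid2))


def init_weights(net, scale):
    """Hypothetical helper: draw every parameter from N(0, scale^2)."""
    for param in net.parameters():
        nn.init.normal_(param, mean=0.0, std=scale)
```

With tanh units, very small initial weights keep the hidden units in their near-linear range, while very large ones saturate them, so the useful range for the initial weight size is something to find by experiment.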
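
For the ShortNet item above, one possible way to realise short-cut connections between every pair of layers is to concatenate all earlier layers before each linear map. This is a sketch under that assumption, with hypothetical layer names; using a separate Linear layer per connection and summing the results would be an equally valid reading.

```python
import torch
import torch.nn as nn


class ShortNet(nn.Module):
    """Sketch: two tanh hidden layers with short-cuts between all layer pairs."""

    def __init__(self, num_hid):
        super().__init__()
        # Each layer sees the concatenation of every layer before it.
        self.to_hid1 = nn.Linear(2, num_hid)               # input -> hid1
        self.to_hid2 = nn.Linear(2 + num_hid, num_hid)     # input, hid1 -> hid2
        self.to_out = nn.Linear(2 + 2 * num_hid, 1)        # input, hid1, hid2 -> output

    def forward(self, input):
        self.hid1 = torch.tanh(self.to_hid1(input))
        in_hid2 = torch.cat((input, self.hid1), dim=1)
        self.hid2 = torch.tanh(self.to_hid2(in_hid2))
        in_out = torch.cat((input, self.hid1, self.hid2), dim=1)
        return torch.sigmoid(self.to_out(in_out))
```

Concatenating the inputs gives the same set of weights as one Linear layer per connection, except that each receiving layer has a single shared bias instead of several.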
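
The graph_hidden() item above could be approached along the following lines. This sketch does not reproduce graph_output(), whose source is not shown here; the grid range of -7 to 7, the grid resolution, the colour map and the helper hidden_activation_grid are all assumptions. It relies on forward() storing the activations as self.hid1 and self.hid2, as the hint suggests.

```python
import torch


def hidden_activation_grid(net, layer, node, lo=-7.0, hi=7.0, steps=141):
    """Hypothetical helper: evaluate one hidden node on a square (x, y) grid."""
    axis = torch.linspace(lo, hi, steps)
    xcoord = axis.repeat(steps)              # x varies fastest
    ycoord = axis.repeat_interleave(steps)   # y varies slowest
    grid = torch.stack((xcoord, ycoord), dim=1)   # shape (steps * steps, 2)
    with torch.no_grad():
        net(grid)                            # forward pass fills net.hid1 / net.hid2
        hidden = getattr(net, f"hid{layer}")
    # Rows of the result index y, columns index x.
    return axis, hidden[:, node].reshape(steps, steps)


def graph_hidden(net, layer, node):
    # Imported here so the grid helper can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    axis, act = hidden_activation_grid(net, layer, node)
    plt.clf()
    # tanh outputs lie in [-1, 1], so the colour scale is fixed to that range.
    plt.pcolormesh(axis.numpy(), axis.numpy(), act.numpy(),
                   cmap="Wistia", vmin=-1.0, vmax=1.0, shading="auto")
```

For a PolarNet, which stores only hid1, asking for layer 2 raises an AttributeError, which matches the requirement that only layer 1 needs to work for that network. Looping over every node index and saving each figure after the call would produce the full set of plots for the report.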