Dummy example: original and reconstructed plot #5

Open
nash5202 opened this issue Jun 29, 2021 · 1 comment
Hi @MadhumitaSushil,

Many thanks for making the SDAE implementation available.
I am trying to run the dummy example you provided (sdae_test.py) on MNIST data. The run itself gives no errors, but the original-vs-reconstructed plot I get for the 50 training samples is not quite right. Please see the plot below:

[Image: recon_plot_5e_3l_400_0_1d — original and reconstructed digits]

Following are the parameter specifications I used; the rest are the defaults:

```python
cur_sdae = StackedDenoisingAE(n_layers=3, n_hid=[400], dropout=[0.1], nb_epoch=5, batch_size=32)
model, (dense_train, dense_val, dense_test), recon_mse = cur_sdae.get_pretrained_sda(
    X_train, X_test, X_test, dir_out='~\\output\\')  # trailing backslashes escaped; a bare '~\output\' is a SyntaxError
```
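One pitfall I hit while setting this up (my own note, not from sdae_test.py): Python does not expand `~` in paths automatically, so passing it straight through can send the output somewhere unexpected. A minimal sketch of expanding it explicitly with the standard library:

```python
import os

# '~' is not expanded by open() or Keras; expand it explicitly.
# Forward slashes also sidestep the trailing-backslash escape issue on Windows.
dir_out = os.path.expanduser('~/output/')
os.makedirs(dir_out, exist_ok=True)
print(dir_out)  # e.g. /home/<user>/output/
```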

I am running the following Python and library versions:
python: 3.7.10
keras: 2.4.3
tensorflow: 2.3.0
numpy: 1.20.2

Also, the training log is as follows:
Training layer 0
Epoch 1/5
60000/60000 [==============================] - 932s 16ms/step - loss: 0.0082 - val_loss: 0.0044
Epoch 2/5
60000/60000 [==============================] - 928s 15ms/step - loss: 0.0061 - val_loss: 0.0042
Epoch 3/5
60000/60000 [==============================] - 930s 16ms/step - loss: 0.0056 - val_loss: 0.0042
Layer 0 has been trained
Training layer 1
Epoch 1/5
60000/60000 [==============================] - 25018s 417ms/step - loss: 0.0014 - val_loss: 5.8959e-04
Epoch 2/5
60000/60000 [==============================] - 398s 7ms/step - loss: 8.6793e-04 - val_loss: 5.4565e-04
Epoch 3/5
60000/60000 [==============================] - 368s 6ms/step - loss: 6.9736e-04 - val_loss: 4.9680e-04
Epoch 4/5
60000/60000 [==============================] - 30895s 515ms/step - loss: 6.1303e-04 - val_loss: 4.3976e-04
Epoch 5/5
60000/60000 [==============================] - 389s 6ms/step - loss: 5.6251e-04 - val_loss: 4.7602e-04
Layer 1 has been trained
Training layer 2
Epoch 1/5
60000/60000 [==============================] - 378s 6ms/step - loss: 2.9044e-04 - val_loss: 9.3741e-05
Epoch 2/5
60000/60000 [==============================] - 315s 5ms/step - loss: 1.3853e-04 - val_loss: 7.5291e-05
Epoch 3/5
60000/60000 [==============================] - 312s 5ms/step - loss: 1.2457e-04 - val_loss: 7.1846e-05
Epoch 4/5
60000/60000 [==============================] - 317s 5ms/step - loss: 1.1731e-04 - val_loss: 6.7766e-05
Epoch 5/5
60000/60000 [==============================] - 317s 5ms/step - loss: 1.1249e-04 - val_loss: 6.7170e-05
Layer 2 has been trained
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dropout_14 (Dropout)         (None, 784)               0
_________________________________________________________________
encoder0 (Dense)             (None, 400)               314000
_________________________________________________________________
encoder1 (Dense)             (None, 400)               160400
_________________________________________________________________
encoder2 (Dense)             (None, 400)               160400
=================================================================
Total params: 634,800
Trainable params: 634,800
Non-trainable params: 0
_________________________________________________________________


It is apparent from the run that the loss is small and decreasing. I am new to deep learning implementations and have looked through the code in sdae.py and sdae_test.py, but everything there seems okay, so I couldn't figure out what is going wrong. Is there some issue with sdae.py that I am missing? Any help is appreciated; thanks in advance for your response.
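One sanity check that may be worth doing here (my own sketch; `pixel_mse` is a hypothetical helper, not part of sdae.py): since each stacked layer is trained to reconstruct its own input rather than the original pixels, the per-layer losses above do not directly measure 784-pixel reconstruction quality. Computing the pixel-space MSE yourself tells you whether the final reconstruction is actually close to the original:

```python
import numpy as np

def pixel_mse(x_orig, x_recon):
    """Per-sample mean squared error between flattened images in [0, 1]."""
    x_orig = np.asarray(x_orig, dtype=np.float64)
    x_recon = np.asarray(x_recon, dtype=np.float64)
    return ((x_orig - x_recon) ** 2).mean(axis=1)

# Toy check with fake 784-dim "images": identical inputs give zero error,
# while an all-zeros "reconstruction" of an all-ones image gives MSE 1.0.
x = np.ones((2, 784))
print(pixel_mse(x, x))                 # → [0. 0.]
print(pixel_mse(x, np.zeros_like(x)))  # → [1. 1.]
```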


nash5202 commented Jul 1, 2021

Following up on the above issue, I have noticed that after training, encoder_layer is assigned the second-to-last layer in get_pretrained_sda (line 139):

```python
encoder_layer = cur_model.layers[-2]
```

and in the function call that gets the dense representation of the last layer (line 146):

```python
data_in = self._get_intermediate_output(cur_model, data_in, n_layer=2, train=0, n_out=self.n_hid[cur_layer], batch_size=self.batch_size)
```

n_layer is fixed at 2. From the comment next to it, it appears this is meant to get the output of the second layer (the hidden layer). However, when I change n_layer to the last layer (n_layer = -1), the run fails with the error below:

```python
data_in = self._get_intermediate_output(cur_model, data_in, n_layer=-1, train=0, n_out=self.n_hid[cur_layer], batch_size=self.batch_size)
# train = 0 because we do not want to use dropout to get hidden node values,
# since dropout is a train-only behaviour, used only to learn weights.
# Output of second layer: hidden layer.
```

```
line 332, in _get_intermediate_output
    data_out[cur_batch_idx,:] = self._get_nth_layer_output(model, n_layer, X = cur_batch, train = train)
ValueError: shape mismatch: value array of shape (32,784) could not be broadcast to indexing result of shape (32,400)
```
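For what it's worth, the error is consistent with each per-layer training model ending in a 784-unit decoder: the buffer in _get_intermediate_output is allocated as (batch_size, n_out) with n_out = self.n_hid[cur_layer] = 400, while n_layer = -1 selects the decoder's 784-dimensional output. A minimal NumPy reproduction (a sketch of my reading; the names batch_size, n_hid, n_in are mine, not from sdae.py):

```python
import numpy as np

batch_size, n_hid, n_in = 32, 400, 784
data_out = np.zeros((batch_size, n_hid))       # buffer sized via n_out = self.n_hid[cur_layer]
decoder_output = np.zeros((batch_size, n_in))  # what layers[-1] (the decoder) would produce

try:
    # Assigning a (32, 784) array into a (32, 400) slice fails, matching the traceback.
    data_out[np.arange(batch_size), :] = decoder_output
except ValueError as e:
    print(e)  # shape mismatch: value array cannot be broadcast to the indexing result
```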

Can you please help me understand why these two are set this way: the encoder layer to cur_model.layers[-2], and n_layer = 2? It would be very helpful in getting me to use the implementation.
Thank you.
