I've been reading the book and following the code examples exactly. But I think there's something wrong with the code in chapter 10, which trains a CNN to recognize the MNIST images. In the last part of the code, where the weights are updated:
I'm a bit surprised, because according to what I learned earlier in the book, the layer_x_deltas should be the negative derivatives of the loss function, so I think the last line should be
kernels+=alpha*k_update
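To show the sign convention I have in mind, here is a minimal NumPy sketch. It is not the book's code: it uses a single linear layer instead of the CNN, and the names `final_loss`, `true_w`, and the toy data are made up for illustration. It only assumes that the delta is computed as target minus prediction, so that the resulting update already points downhill.

```python
import numpy as np

def final_loss(sign, steps=100, alpha=0.1):
    """Train a toy linear model; sign=+1 means 'w += ...', sign=-1 means 'w -= ...'."""
    rng = np.random.default_rng(0)
    x = rng.normal(size=(20, 3))            # toy inputs (hypothetical data)
    true_w = np.array([1.0, -2.0, 0.5])     # weights that generated the targets
    y = x.dot(true_w)
    w = np.zeros(3)
    for _ in range(steps):
        pred = x.dot(w)
        # Target minus prediction: this is the NEGATIVE gradient of the
        # squared error with respect to the prediction.
        delta = (y - pred) / len(x)
        update = x.T.dot(delta)             # negative gradient w.r.t. w
        w += sign * alpha * update
    return float(np.mean((x.dot(w) - y) ** 2))

loss_start = final_loss(+1, steps=0)        # loss before any update
loss_plus = final_loss(+1)                  # '+=' follows the negative gradient
loss_minus = final_loss(-1)                 # '-=' moves uphill instead
print(loss_start, loss_plus, loss_minus)
```

In this toy setting the "+=" version lowers the loss and the "-=" version raises it, which is why I expected "+=" for the kernels as well.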
After making this change, I ran both the modified and the original code on my own computer.
With the change, the training accuracy increases much more rapidly than with the original "-=". However, it puzzles me that after 300 iterations both models reach an accuracy of about 86%. So what's the difference? Does the code have a typo, or have I simply misunderstood it?
I also posted a question about this on Stack Overflow. I have not mistyped the code. So what's wrong?