I checked the gradients you derived against numerical gradients, and they do not match. The error appears to be in two places:
In calculate_loss, you average the total loss, including the regularization term, over the data batch. Only the log loss should be averaged over the batch; the regularization term should be added afterwards without averaging.
In build_model, the gradients (dW1, dW2, db1, db2) computed during backprop should be averaged over the data batch. As with the loss, the regularization terms should be left out of that average and added to the gradients separately.
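To make the two points concrete, here is a minimal sketch of the convention I mean, together with the kind of numerical check I ran. It is not the repository's code: it assumes a two-layer network with a tanh hidden layer, a softmax output, and an L2 penalty of `reg_lambda / 2` times the sum of squared weights, and the names `forward`, `loss_fn`, `grads_fn`, and `numerical_grads` are hypothetical.

```python
import numpy as np

def forward(params, X):
    # Assumed architecture: tanh hidden layer followed by softmax.
    W1, b1, W2, b2 = params
    a1 = np.tanh(X @ W1 + b1)
    scores = a1 @ W2 + b2
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    exp_scores = np.exp(scores)
    probs = exp_scores / exp_scores.sum(axis=1, keepdims=True)
    return a1, probs

def loss_fn(params, X, y, reg_lambda):
    W1, _, W2, _ = params
    n = X.shape[0]
    _, probs = forward(params, X)
    # Only the log loss is averaged over the batch.
    data_loss = -np.log(probs[np.arange(n), y]).mean()
    # The regularization term is added afterwards, not divided by n.
    reg_loss = 0.5 * reg_lambda * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
    return data_loss + reg_loss

def grads_fn(params, X, y, reg_lambda):
    W1, b1, W2, b2 = params
    n = X.shape[0]
    a1, probs = forward(params, X)
    delta3 = probs.copy()
    delta3[np.arange(n), y] -= 1.0
    delta3 /= n  # average the data-term gradient over the batch
    dW2 = a1.T @ delta3 + reg_lambda * W2  # regularization is not averaged
    db2 = delta3.sum(axis=0)
    delta2 = (delta3 @ W2.T) * (1.0 - a1 ** 2)
    dW1 = X.T @ delta2 + reg_lambda * W1
    db1 = delta2.sum(axis=0)
    return [dW1, db1, dW2, db2]

def numerical_grads(params, X, y, reg_lambda, eps=1e-5):
    # Central differences, perturbing one parameter entry at a time.
    grads = []
    for p in params:
        g = np.zeros_like(p)
        for idx in np.ndindex(*p.shape):
            old = p[idx]
            p[idx] = old + eps
            loss_plus = loss_fn(params, X, y, reg_lambda)
            p[idx] = old - eps
            loss_minus = loss_fn(params, X, y, reg_lambda)
            p[idx] = old  # restore the original value
            g[idx] = (loss_plus - loss_minus) / (2.0 * eps)
        grads.append(g)
    return grads

# Small random problem, purely illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))
y = rng.integers(0, 2, size=8)
params = [rng.normal(size=(2, 3)), np.zeros(3),
          rng.normal(size=(3, 2)), np.zeros(2)]

analytic = grads_fn(params, X, y, reg_lambda=0.01)
numeric = numerical_grads(params, X, y, reg_lambda=0.01)
max_err = max(np.abs(a - g).max() for a, g in zip(analytic, numeric))
print("max abs difference:", max_err)
```

With the loss and the gradients following the same convention, the analytic and numerical gradients should agree up to finite-difference error. Dividing the regularization term by the batch size in one place but not the other is the kind of inconsistency that makes the check fail.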