I have recently been reading Probabilistic Deep Learning with Python, which is a really interesting book. I am new to deep learning, and I have a question about the following plot, which I got after going through the code and results in nb_ch05_02.ipynb. Is it typical in this type of regression analysis for the training loss to be much larger than the validation loss, and can a model with training results like these be trusted?
I got a similar plot in my own research (the validation loss curve sits below the training loss curve, with a large gap between them), and I'm not sure how to explain this behavior or how to evaluate the model. I would greatly appreciate it if someone could help me understand it.
