Resource for understanding sources of uncertainty in smooth_estimates()
#110
Is there a reference or blog post out there to help me better understand what the …
Replies: 3 comments 1 reply
Ok, reading Marra and Wood 2006, and I think I understand what problem …
[…] Importantly, this is all done post hoc by processing the fitted model, rather than being something done a priori before fitting; the usual identifiability constraints are still used when fitting. […] All of this and more is briefly reviewed in Simon's recent (2020) invited paper, to which there is also a discussion.
For your application, you might want to consider simultaneous intervals instead of the Bayesian credible intervals. In some senses the Bayesian credible intervals are still pointwise (although, when viewed in this frequentist manner, they have an average-across-the-function interpretation rather than a true pointwise interpretation). This means that the coverage at a single point is approximately correct, but as you start to look at more points, as you plan to do, the combined coverage over that set of points (or the whole function) ends up being a lot less than the notional coverage you used for each single point (0.95, say). A simultaneous interval is one where the interval (we hope) covers 0.95 (say) of all possible smooths given the estimates of the smooth and its uncertainty. There's some background on the idea in Simpson (2018), and general code in the supplements to do this for smooths. {gratia} contains a more recent implementation, with the …
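To make the pointwise versus simultaneous distinction concrete, here is a minimal simulation sketch in plain Python. It is not {gratia} or mgcv code, and everything in it is an assumption made for illustration: the cosine basis, the independent coefficient uncertainties, and the function name `simultaneous_critical_value` are all hypothetical. It draws curves from the assumed distribution of the smooth, records the largest standardized deviation along each curve, and uses the 0.95 quantile of those maxima as the critical value, which is the general idea behind simulation-based simultaneous intervals.

```python
import math
import random


def simultaneous_critical_value(n_basis=6, n_grid=50, n_sim=2000,
                                level=0.95, seed=1):
    """Return (critical value, joint coverage of the pointwise band)."""
    rng = random.Random(seed)
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    # Hypothetical basis: one row per grid point, one column per basis function
    basis = [[math.cos(j * math.pi * x) for j in range(n_basis)] for x in grid]
    # Assumed standard deviation of each coefficient (taken as independent)
    coef_sd = [1.0 / (j + 1) for j in range(n_basis)]
    # Pointwise standard error of the smooth at each grid point
    se = [math.sqrt(sum((b * s) ** 2 for b, s in zip(row, coef_sd)))
          for row in basis]

    max_abs = []
    for _ in range(n_sim):
        # One simulated deviation of the coefficients from their estimates
        delta = [rng.gauss(0.0, s) for s in coef_sd]
        dev = [sum(b * d for b, d in zip(row, delta)) for row in basis]
        # Largest standardized deviation anywhere along this curve
        max_abs.append(max(abs(d) / s for d, s in zip(dev, se)))

    max_abs.sort()
    crit = max_abs[math.ceil(level * n_sim) - 1]
    # Fraction of whole curves that stay inside the usual +/- 1.96 se band
    joint_pointwise = sum(m <= 1.96 for m in max_abs) / n_sim
    return crit, joint_pointwise


crit, joint_pointwise = simultaneous_critical_value()
print("simultaneous critical value:", round(crit, 2))
print("whole-curve coverage of pointwise band:", round(joint_pointwise, 3))
```

The critical value comes out larger than 1.96, and the share of whole curves contained in the pointwise band falls below 0.95, which is the loss of coverage described above. A real model would use the estimated covariance matrix of the coefficients rather than the independent values assumed here.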
You (now) have the relevant reference for overall_uncertainty. More generally, when the bias in the estimated smooth is large relative to something (the variance? I forget now), the theory of Nychka (1988), which is expanded upon by Marra and Wood, breaks down. So any situation where there is appreciable bias in the estimated smooth leads to poor coverage of the credible intervals. One such situation is an oversmoothed fit, for example where you failed to set the basis dimension large enough for the basis to feasibly contain the true function or a close approximation to it. Another issue this tackles is the bow-tie intervals when smoothness selection penalizes the…
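As a rough illustration of the point about bias and coverage, the sketch below uses a deliberately crude stand-in rather than a GAM: an ordinary straight-line fit, which behaves like an extremely oversmoothed smooth. The function name `across_function_coverage`, the sample sizes, and the noise level are all hypothetical choices. It compares how often the usual 95% intervals contain the true function, averaged across the function, when the truth really is linear (no bias) and when it is curved (appreciable bias).

```python
import math
import random


def across_function_coverage(truth, n=200, n_rep=300, sigma=0.3, seed=2):
    """Average share of x values whose 95% interval contains truth(x)."""
    rng = random.Random(seed)
    x = [i / (n - 1) for i in range(n)]
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    covered = 0
    for _ in range(n_rep):
        y = [truth(xi) + rng.gauss(0.0, sigma) for xi in x]
        ybar = sum(y) / n
        # Least-squares straight line: the "oversmoothed" stand-in
        slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
        fitted = [ybar + slope * (xi - xbar) for xi in x]
        s2 = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (n - 2)
        for xi, fi in zip(x, fitted):
            # Standard error of the fitted mean at xi
            se = math.sqrt(s2 * (1.0 / n + (xi - xbar) ** 2 / sxx))
            # 1.96 approximates the t quantile for this many observations
            covered += abs(fi - truth(xi)) <= 1.96 * se
    return covered / (n * n_rep)


unbiased = across_function_coverage(lambda x: 1.0 + 2.0 * x)
biased = across_function_coverage(lambda x: math.sin(2.0 * math.pi * x))
print("coverage, truth is linear:", round(unbiased, 3))
print("coverage, truth is curved:", round(biased, 3))
```

With a linear truth the coverage sits close to the nominal level, while with the curved truth it collapses, because the intervals only reflect variance and ignore the bias of the too-rigid fit.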