diff --git a/README.Rmd b/README.Rmd
index 89618e08..706d5e4e 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -323,7 +323,7 @@ plot(lynx_mvgam, type = 'residuals')
 ```
 ## Comparing models based on forecasts
-Another useful utility of `mvgam` is the ability to use leave-future-out comparisons. Time series models are often evaluated using an expanding window training technique, where the model is initially trained on some subset of data from `t = 1` to `t = n_train`, and then is used to produce forecasts for the next `fc_horizon` time steps `t = n_train + fc_horizon`. In the next iteration, the size of training data is expanded by a single time point and the process repeated. This is obviously computationally challenging for Bayesian time series models, as the number of refits can be very large. `mvgam` uses an approximation based on importance sampling. Briefly, we refit the model using the first `min_t` observations to perform a single exact `fc_horizon`-ahead forecast step. This forecast is evaluated against the `min_t + fc_horizon` out of sample observations using the Expected Log Predictive Density (ELPD). Next, we approximate each successive round of expanding window forecasts by moving forward one step at a time `for i in 1:N_evaluations` and re-weighting draws from the model's posterior predictive distribution using Pareto Smoothed Importance Sampling (PSIS). In each iteration `i`, PSIS weights are obtained for all observations that would have been included in the model if we had re-fit. If these importance ratios are stable, we consider the approximation adequate and use the re-weighted posterior's forecast for evaluating the next holdout set of testing observations (`(min_t + i + 1):(min_t + i + fc_horizon)`). This is similar to the process of particle filtering to update forecasts in light of new data by re-weighting the posterior draws using importance weights. But at some point the importance ratio variability will become too large and importance sampling will be unreliable. This is indicated by the estimated shape parameter `k` of the generalized Pareto distribution crossing a certain threshold `pareto_k_threshold`. Only then do we refit the model using all of the observations up to the time of the failure. We then restart the process and iterate forward until the next refit is triggered. The process is computationally much more efficient, as only a fraction of the evaluations typically requires refits (the algorithm isdescribed in detail by Bürkner et al. 2020).
+Another useful feature of `mvgam` is the ability to use leave-future-out comparisons. Time series models are often evaluated using an expanding window training technique, where the model is initially trained on some subset of data from `t = 1` to `t = n_train` and then used to produce forecasts for the next `fc_horizon` time steps (`t = n_train + 1` to `t = n_train + fc_horizon`). In the next iteration, the training window is expanded by a single time point and the process is repeated. This is computationally challenging for Bayesian time series models, as the number of refits can be very large. `mvgam` therefore uses an approximation based on importance sampling. Briefly, we refit the model using the first `min_t` observations to perform a single exact `fc_horizon`-ahead forecast step. This forecast is evaluated against the out-of-sample observations at times `(min_t + 1):(min_t + fc_horizon)` using the Expected Log Predictive Density (ELPD).
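+
+To make the expanding window procedure concrete, below is a minimal sketch of the brute-force version that the importance sampling approximation avoids. It uses a simple Gaussian AR(1) stand-in fitted with `lm()` and a crude plug-in log predictive density rather than `mvgam`'s Bayesian computation, but it shows why the number of refits grows with the length of the evaluation period:
+
+```{r, eval = FALSE}
+# Brute-force expanding window evaluation on the base R lynx series,
+# using an AR(1) lm() fit as a stand-in for a full Bayesian refit
+y <- as.numeric(lynx)
+n_train <- 50
+fc_horizon <- 3
+elpds <- numeric(0)
+for (t in n_train:(length(y) - fc_horizon)) {
+  # Refit using only observations 1:t (one full refit per iteration)
+  fit <- lm(y[2:t] ~ y[1:(t - 1)])
+  # Iterated one-step forecasts for the next fc_horizon time points
+  preds <- numeric(fc_horizon)
+  last <- y[t]
+  for (h in 1:fc_horizon) {
+    last <- coef(fit)[1] + coef(fit)[2] * last
+    preds[h] <- last
+  }
+  # Plug-in log predictive density of the held-out observations
+  sigma <- summary(fit)$sigma
+  elpds <- c(elpds,
+             sum(dnorm(y[(t + 1):(t + fc_horizon)], preds, sigma, log = TRUE)))
+}
+```
+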
+Next, we approximate each successive round of expanding window forecasts by moving forward one step at a time (`for i in 1:N_evaluations`) and re-weighting draws from the model's posterior predictive distribution using Pareto Smoothed Importance Sampling (PSIS). In each iteration `i`, PSIS weights are obtained for all observations that would have been included in the model had we re-fit it. If these importance ratios are stable, we consider the approximation adequate and use the re-weighted posterior's forecast to evaluate the next holdout set of testing observations (`(min_t + i + 1):(min_t + i + fc_horizon)`). This is similar to particle filtering, where posterior draws are re-weighted using importance weights to update forecasts in light of new data. But at some point the variability of the importance ratios becomes too large and importance sampling is no longer reliable. This is indicated by the estimated shape parameter `k` of the generalized Pareto distribution crossing a threshold (`pareto_k_threshold`). Only then do we refit the model using all observations up to the time of the failure, restart the process and iterate forward until the next refit is triggered. This approach is computationally much more efficient, as typically only a fraction of the evaluations require refits (the algorithm is described in detail by Bürkner et al. 2020).
+
+Bürkner, P.-C., Gabry, J., & Vehtari, A. (2020). Approximate leave-future-out cross-validation for Bayesian time series models. *Journal of Statistical Computation and Simulation*, 90(14), 2499-2523.
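+
+As a hedged sketch of running this procedure on the model fitted above, the call below assumes the routine is exposed as `lfo_cv()` with arguments matching the parameter names used in this section (`min_t`, `fc_horizon`, `pareto_k_threshold`); treat the exact interface as an assumption and check `?lfo_cv` in the installed version:
+
+```{r, eval = FALSE}
+# Approximate leave-future-out CV: one exact refit at min_t, then PSIS
+# re-weighting, with further refits only when Pareto k crosses the threshold
+lynx_lfo <- lfo_cv(lynx_mvgam,
+                   min_t = 30,
+                   fc_horizon = 3,
+                   pareto_k_threshold = 0.7)
+
+# Assuming a plot method exists for the returned object, this should show
+# Pareto k and ELPD estimates across the evaluation timepoints
+plot(lynx_lfo)
+```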