Spatial and spatiotemporal model evaluation #231

Open
sigmafelix opened this issue Aug 25, 2023 · 1 comment

sigmafelix commented Aug 25, 2023

This issue is about spatial and spatiotemporal cross-validation for model performance evaluation and is possibly related to ropensci/chopin#1.

Paper

  • Area of Applicability (AOA, Meyer and Pebesma 2021)
    • Each new observation ($\tilde{X}_{j \cdot}$) is assessed by its minimum Euclidean distance
      $$d(\tilde{X}_{j \cdot}, X_{i \cdot})$$
      to the existing training data ($X$) in the multivariate predictor space, that is,
      $$d_{j} = \min_{k} d(\tilde{X}_{j\cdot}, X_{k\cdot})$$
    • The dissimilarity index is then computed as
      $$\text{DI}_{j} = d_{j} / \bar{d}$$
      where $\bar{d}$ is the average pairwise distance among the training data
      • With a user-defined threshold on $\text{DI}_{j}$, the user can identify new observations for which the model should not be applied (or for which good prediction accuracy should not be expected)
    • Questions
      • Does AOA unintentionally favor spatiotemporally closer features over others, especially when strong spatial or spatiotemporal autocorrelation exists?
      • What if one employs other distance metrics?
      • AOA is almost certainly computationally demanding, $O(n^2)$, because it relies on the full distance matrix of the training data. How can it be made scalable?
        • Further question: how can spatial or spatiotemporal autocorrelation be assessed efficiently for sizable datasets?
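
As a rough illustration of the dissimilarity index described above, here is a minimal Python sketch using only the standard library. It is not the CAST implementation: the function names `dissimilarity_index` and `flag_outside` are hypothetical, the predictors are assumed to be already scaled to comparable units, and the threshold is simply passed in by the user.

```python
import math
from itertools import combinations
from statistics import mean


def dissimilarity_index(train, new):
    """Return DI_j = d_j / d_bar for each row of `new`.

    `train` and `new` are lists of equal-length numeric rows.
    Assumption: predictors are already scaled to comparable units.
    """
    # d_bar: mean of all pairwise distances among training rows.
    # This is the O(n^2) step raised in the questions above.
    d_bar = mean(math.dist(a, b) for a, b in combinations(train, 2))
    # d_j: distance from each new row to its nearest training row.
    return [min(math.dist(x, t) for t in train) / d_bar for x in new]


def flag_outside(di, threshold):
    """Mark new observations whose DI exceeds a user-defined threshold."""
    return [value > threshold for value in di]


if __name__ == "__main__":
    train = [[0.0, 0.0], [3.0, 4.0]]
    new = [[0.0, 0.0], [3.0, 14.0]]
    di = dissimilarity_index(train, new)
    print(di, flag_outside(di, threshold=1.0))
```

The nearest-neighbor step costs one pass over the training rows per new observation, while the mean pairwise distance is the quadratic part. Estimating $\bar{d}$ from a random subsample of training pairs is one possible way to reduce that cost, but whether the resulting approximation is acceptable would need to be checked.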

R packages

🔍
  • CAST: spatiotemporal extension of caret, by Meyer, an author of the AOA paper.
  • waywiser: spatial model evaluation in tidymodels ecosystem
  • targets: workflow management tool in R. Provides code examples for HPC utilization.
@sigmafelix

Spatial/temporal/spatiotemporal splitting functions for cross-validation are implemented in NRTAPmodel. I think these functions would be useful for this package as well.
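
To make the idea of spatial splitting concrete, here is a minimal Python sketch of grid-block fold assignment. It is only an illustration under stated assumptions and does not reproduce the NRTAPmodel functions: the name `spatial_block_folds`, the square grid, and the round-robin mapping of blocks to folds are all hypothetical choices.

```python
import math


def spatial_block_folds(coords, block_size, n_folds):
    """Assign each (x, y) point to a cross-validation fold by grid block.

    All points falling in the same square block share a fold, so nearby
    observations are not split between training and validation sets.
    """
    # Index of the grid block that contains each point.
    blocks = [
        (math.floor(x / block_size), math.floor(y / block_size))
        for x, y in coords
    ]
    # Map the sorted unique blocks to folds in round-robin order.
    block_to_fold = {
        block: i % n_folds for i, block in enumerate(sorted(set(blocks)))
    }
    return [block_to_fold[block] for block in blocks]


if __name__ == "__main__":
    coords = [(0.5, 0.5), (0.7, 0.2), (1.5, 0.5), (0.5, 1.5), (1.6, 1.9)]
    print(spatial_block_folds(coords, block_size=1.0, n_folds=2))
```

A temporal or spatiotemporal variant could add a time index to the block key in the same way, at the cost of more and smaller blocks.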

@sigmafelix sigmafelix transferred this issue from ropensci/chopin Dec 15, 2023
@sigmafelix sigmafelix self-assigned this Dec 15, 2023