Merge pull request #1 from Real-Slin-Shady/main
First version of AGDS book as quarto
stineb authored Oct 10, 2024
2 parents c7a9a4d + 2f2b8af commit c8e0467
Showing 184 changed files with 170,432 additions and 25 deletions.
85 changes: 85 additions & 0 deletions book/R/eval_model.R
@@ -0,0 +1,85 @@
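#' Evaluate model performance
#'
#' @param mod a fitted model object that works with `predict()` (e.g. from `lm()` or `caret::train()`)
#' @param df_train the training data; must contain the target column `GPP_NT_VUT_REF`
#' @param df_test the testing data; must contain the target column `GPP_NT_VUT_REF`
#' @param return_metrics if `TRUE`, return the metrics tables; if `FALSE` (default), return a figure
#'
#' @return a list of metrics tables (`train`, `test`) or a two-panel scatterplot of observed vs. fitted values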
eval_model <- function(mod, df_train, df_test, return_metrics = FALSE){

  # Require caret to avoid errors if a caret-based model is entered without having caret loaded globally
  if (!require("caret", character.only = TRUE)) {
    stop("Error: Could not load the library {caret}. Please install it and try again.")
  }

  # Require magrittr because script uses %>% pipe
  if (!require("magrittr", character.only = TRUE)) {
    stop("Error: Could not load the library {magrittr}. Please install it and try again.")
  }

  # add predictions to the data frames----
  # dplyr, tidyr and ggplot2 functions are namespace-qualified so that the
  # function also works when these packages are not attached globally
  df_train <- df_train %>%
    tidyr::drop_na() %>%  # magrittr pipe necessary here for the dot ('.') evaluation
    dplyr::mutate(fitted = predict(mod, newdata = .))

  df_test <- df_test %>%
    tidyr::drop_na() %>%
    dplyr::mutate(fitted = predict(mod, newdata = .))

  # get metrics tables----
  metrics_train <- df_train %>%
    yardstick::metrics(GPP_NT_VUT_REF, fitted)

  metrics_test <- df_test %>%
    yardstick::metrics(GPP_NT_VUT_REF, fitted)

  if (return_metrics){

    return(list(train = metrics_train, test = metrics_test))

  } else {
    # extract values from metrics tables----
    rmse_train <- metrics_train %>%
      dplyr::filter(.metric == "rmse") %>%
      dplyr::pull(.estimate)
    rsq_train <- metrics_train %>%
      dplyr::filter(.metric == "rsq") %>%
      dplyr::pull(.estimate)

    rmse_test <- metrics_test %>%
      dplyr::filter(.metric == "rmse") %>%
      dplyr::pull(.estimate)
    rsq_test <- metrics_test %>%
      dplyr::filter(.metric == "rsq") %>%
      dplyr::pull(.estimate)

    # visualise as a scatterplot----
    # adding information of metrics as sub-titles
    gg1 <- df_train %>%
      ggplot2::ggplot(ggplot2::aes(GPP_NT_VUT_REF, fitted)) +
      ggplot2::geom_point(alpha = 0.3) +
      ggplot2::geom_smooth(method = "lm", se = FALSE, color = "red") +
      ggplot2::geom_abline(slope = 1, intercept = 0, linetype = "dotted") +
      ggplot2::labs(subtitle = bquote( italic(R)^2 == .(format(rsq_train, digits = 2)) ~~
                                         RMSE == .(format(rmse_train, digits = 3))),
                    title = "Training set") +
      ggplot2::theme_classic()

    gg2 <- df_test %>%
      ggplot2::ggplot(ggplot2::aes(GPP_NT_VUT_REF, fitted)) +
      ggplot2::geom_point(alpha = 0.3) +
      ggplot2::geom_smooth(method = "lm", se = FALSE, color = "red") +
      ggplot2::geom_abline(slope = 1, intercept = 0, linetype = "dotted") +
      ggplot2::labs(subtitle = bquote( italic(R)^2 == .(format(rsq_test, digits = 2)) ~~
                                         RMSE == .(format(rmse_test, digits = 3))),
                    title = "Test set") +
      ggplot2::theme_classic()

    out <- cowplot::plot_grid(gg1, gg2)

    return(out)
  }
}
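
Note on book/R/eval_model.R: the core of the function is "fit on the training set, predict on both sets, compare RMSE and R2". The sketch below illustrates that idea in base R only, so it runs without caret, yardstick, or the tidyverse. It is not the function above: the synthetic data, the column names `x` and `y`, and the helper `calc_metrics()` are assumptions made for illustration, and R2 is computed as the squared Pearson correlation between observed and predicted values, which is assumed here to correspond to the `rsq` row returned by `yardstick::metrics()`.

```r
# Minimal base-R sketch of the train/test evaluation idea.
# All names and data below are hypothetical and not part of the commit.
set.seed(42)

# Synthetic data: one predictor and a noisy linear response
n <- 200
df <- data.frame(x = runif(n, 0, 10))
df$y <- 2 + 0.5 * df$x + rnorm(n, sd = 1)

# Simple random 70/30 split into training and testing sets
idx_train <- sample(seq_len(n), size = round(0.7 * n))
df_train <- df[idx_train, ]
df_test  <- df[-idx_train, ]

# Fit the model on the training data only
mod <- lm(y ~ x, data = df_train)

# Hypothetical helper: RMSE and R2 (squared Pearson correlation)
calc_metrics <- function(obs, pred) {
  c(rmse = sqrt(mean((obs - pred)^2)),
    rsq  = cor(obs, pred)^2)
}

# Evaluate on both sets, predicting from the same fitted model
metrics_train <- calc_metrics(df_train$y, predict(mod, newdata = df_train))
metrics_test  <- calc_metrics(df_test$y,  predict(mod, newdata = df_test))

print(rbind(train = metrics_train, test = metrics_test))
```

A test RMSE that is clearly larger than the training RMSE would hint at overfitting; for a simple linear model on data like these, the two are expected to be of similar magnitude.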
15 changes: 15 additions & 0 deletions book/_freeze/basicr/execute-results/html.json
@@ -0,0 +1,15 @@
{
"hash": "736019d7d4a1cfb43ff99ddd232a520d",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: \"Basic R\"\n---\n\n::: {.cell}\n\n:::\n\n\n### Basic expressions\n\nAn expression is a set of commands that returns a value.\n\nClick `Run Code` to run the following R code.\n\nExecute this simple calculation.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n50 * 2.2\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 110\n```\n\n\n:::\n:::\n\n\nShow the first rows of a table.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nhead(mtcars)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n mpg cyl disp hp drat wt qsec vs am gear carb\nMazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4\nMazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4\nDatsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1\nHornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1\nHornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2\nValiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1\n```\n\n\n:::\n:::\n\n\n::: {.callout-note}\n## Exercise\n\nShow the last rows of the `mtcars` table.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Add you code here\n```\n:::\n\n\nYour result should look like the plot below\n:::\n\n::: {.callout-tip collapse=\"true\"}\n## Expected Result\n\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n\n```\n mpg cyl disp hp drat wt qsec vs am gear carb\nPorsche 914-2 26.0 4 120.3 91 4.43 2.140 16.7 0 1 5 2\nLotus Europa 30.4 4 95.1 113 3.77 1.513 16.9 1 1 5 2\nFord Pantera L 15.8 8 351.0 264 4.22 3.170 14.5 0 1 5 4\nFerrari Dino 19.7 6 145.0 175 3.62 2.770 15.5 0 1 5 6\nMaserati Bora 15.0 8 301.0 335 3.54 3.570 14.6 0 1 5 8\nVolvo 142E 21.4 4 121.0 109 4.11 2.780 18.6 1 1 4 2\n```\n\n\n:::\n:::\n\n:::\n",
"supporting": [],
"filters": [
"rmarkdown/pagebreak.lua"
],
"includes": {},
"engineDependencies": {},
"preserve": {},
"postProcess": true
}
}
17 changes: 17 additions & 0 deletions book/_freeze/code_management/execute-results/html.json

Large diffs are not rendered by default.

17 changes: 17 additions & 0 deletions book/_freeze/data_variety/execute-results/html.json

Large diffs are not rendered by default.

17 changes: 17 additions & 0 deletions book/_freeze/data_vis/execute-results/html.json

Large diffs are not rendered by default.

17 changes: 17 additions & 0 deletions book/_freeze/data_wrangling/execute-results/html.json

Large diffs are not rendered by default.

15 changes: 15 additions & 0 deletions book/_freeze/getting_started/execute-results/html.json

Large diffs are not rendered by default.

17 changes: 17 additions & 0 deletions book/_freeze/ggplot/execute-results/html.json
@@ -0,0 +1,17 @@
{
"hash": "8ed1a7a5d037e11d62f349b6d7212984",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: \"ggplot demo\"\neditor: visual\nengine: knitr\n---\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(ggplot2)\n\nggplot(data = mtcars, mapping = aes(x = mpg, y = hp)) + \n geom_point()\n```\n\n::: {.cell-output-display}\n![](ggplot_files/figure-html/unnamed-chunk-1-1.png){width=672}\n:::\n:::\n\n\n<!-- <html> -->\n\n<!-- ```{=html} -->\n<!-- <script type=\"module\"> -->\n\n<!-- import { WebR } from 'https://webr.r-wasm.org/latest/webr.mjs'; -->\n<!-- globalThis.webR = new WebR({SW_URL: \"https://bluegreen-labs.github.io/R_book_template/\"}); -->\n<!-- await globalThis.webR.init(); -->\n<!-- globalThis.webRCodeShelter = await new globalThis.webR.Shelter(); -->\n<!-- await globalThis.webR.installPackages(['ggplot2']) -->\n\n<!-- </script> -->\n<!-- ``` -->\n<!-- </html> -->\n\n<!-- ::: callout-warning -->\n<!-- Installing and loading ggplot2 on webR takes a little while. -->\n<!-- ::: -->\n\n<!-- ### Basic plotting -->\n\n<!-- - Load the package and some data -->\n\n<!-- ```{webr} -->\n<!-- library(ggplot2) -->\n<!-- ``` -->\n\n<!-- - create a plot -->\n\n<!-- ```{webr} -->\n<!-- ggplot(data = mtcars, mapping = aes(x = mpg, y = hp)) + -->\n<!-- geom_point() -->\n<!-- ``` -->\n\n<!-- ::: callout-note -->\n<!-- ## Exercise -->\n\n<!-- Make a scatter plot with `hp` on the x axis and `wt` on the y axis. Label the x axis \"Horse Power\" and the y axis \"Weight\". Make one subplot for each value in `gear`. -->\n\n<!-- ```{webr} -->\n<!-- # Add you code here -->\n<!-- ``` -->\n\n<!-- Your result should look like the plot below -->\n<!-- ::: -->\n\n<!-- ::: {.callout-tip collapse=\"true\"} -->\n<!-- ## Expected Result -->\n\n<!-- ```{r} -->\n<!-- #| echo: false -->\n<!-- library(ggplot2) -->\n<!-- ggplot(mtcars, aes(x = hp, y = wt)) + -->\n<!-- geom_point() + -->\n<!-- labs(x = \"Horse Power\", y = \"Weight\") + -->\n<!-- facet_wrap(~gear) -->\n<!-- ``` -->\n<!-- ::: -->\n",
"supporting": [
"ggplot_files"
],
"filters": [
"rmarkdown/pagebreak.lua"
],
"includes": {},
"engineDependencies": {},
"preserve": {},
"postProcess": true
}
}
15 changes: 15 additions & 0 deletions book/_freeze/index/execute-results/html.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
{
"hash": "e77676dd916c7ae58f18269bb09cc13a",
"result": {
"engine": "knitr",
"markdown": "# Preface {.unnumbered}\n\n \n::: {.floatting}\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](./figures/agds_logo.png){fig-align='center' width=40% style=\"float:right; padding:20px\"}\n:::\n:::\n\n\n## About this book {.unnumbered}\n\nThis book serves as the basis for the series of courses in *Applied Geodata Science*, taught at the Institute of Geography, University of Bern. The starting point of this book were the tutorials edited by Benjamin Stocker, Loïc Pellissier, and Joshua Payne for the course *Environmental Systems Data Science* (D-USYS, ETH Zürich). The present book was written as a collaborative effort led by [Benjamin Stocker](https://geco-group.org/author/benjamin-stocker/), with contributions by [Pepa Arán](https://geco-group.org/author/pepa-aran/) and [Koen Hufkens](https://geco-group.org/author/koen-hufkens/), and exercises by [Pascal Schneider](https://geco-group.org/author/pascal-schneider/).\n\n:::\n\nThe target of this book are people interested in applying data science methods for research. Methods, example data sets, and prediction challenges are chosen to make the book most relatable to scientists and students in Geography and Environmental Sciences. No prior knowledge of coding is required. Respective essentials are briefly introduced as primers. The focus of this book is not on the theoretical basis of the methods. Other \"classical\" statistics courses serve this purpose. Instead, this book introduces essential concepts, methods, and tools for applied data science in Geography and Environmental Sciences with an emphasis on covering a wide breadth. It is written with a hands-on approach using the R programming language and should enable an intuitive understanding of concepts with only a minimal reliance on mathematical language. Worked examples are provided for typical steps of data science applications in Geography and Environmental Sciences. 
The aim of this book is to teach the diverse set of skills needed as a basis for data-intensive research in academia and outside.\n\nWe also use this book as a reference and on-boarding resource for group members of [Geocomputation and Earth Observation (GECO)](https://geco-group.org/), at the Institute of Geography, University of Bern.\n\n### Contents\n\nThis book covers all steps along the data science workflow (see Fig. \\@ref(fig:datascienceworkflow)) and introduces methods and tools to learn the most from data, to effectively communicate insights, and to make your workflow reproducible. By following this course, you will be well equipped for joining the Open Science movement.\n\n\n::: {.cell}\n::: {.cell-output-display}\n![The data science workflow and keywords of contents covered in Applied Geodata Science I. Figure adapted from: [Wickham and Grolemund *R for Data Science*](https://r4ds.had.co.nz/index.html)](./figures/data_science_workflow_keywords.png){width=1484}\n:::\n:::\n\n\nThis chapter starts by providing the context for this course: Why Applied Geodata Science? Why now?\n\nChapters \\@ref(gettingstarted) and \\@ref(programmingprimers) serve as primers to get readers with a diverse background and varying data science experience up to speed with the basics for programming in R, which we rely on in later chapters.\n\nChapter \\@ref(datawrangling) introduces efficient handling and cleaning of large tabular data with the R *tidyverse* \"programming dialect\". The focus is on non-geospatial data. 
Closely related to transforming data and its multiple axes of variation is data visualisation, covered in Chapter \\@ref(datavis).\n\nChapters \\@ref(datavariety), \\@ref(codemgmt), and \\@ref(openscience) introduce essential tools for the daily work with diverse data, for collaborative code development, and for an Open Science practice.\n\nWith Chapters \\@ref(regressionclassification), \\@ref(supervisedmli), \\@ref(supervisedmlii), and \\@ref(randomforest), we will get into modelling and identifying patterns in the data.\n\nChapters \\@ref(gettingstarted)-\\@ref(randomforest) serve as lecture notes for *Applied Geodata Science I* and as learning material for students and scientists in any data-intensive research domain. These chapters do not explicitly deal with geospatial data and modelling. Modelling with geospatial and temporal data is the subject of the course *Applied Geodata Science II* and will be introduced with a focus on typical applications and modelling tasks in Geography and Environmental Sciences. Respective materials are not currently contained in this book but will be added here later.\n\nAll tutorials use the R programming language, and a full list of the packages used in this course is provided in Appendix \\@ref(references).\n\n\n## Links {-}\n\n[Browse the source code](https://github.com/stineb/agds_book)\n\n[Report an issue](https://github.com/stineb/agds_book/issues)\n\n## License {-}\n\nImages and other materials used here were made available under non-restrictive licenses. Original sources are attributed. Content without attribution is our own and shared under the license below. If there are any errors or any content you find concerning with regard to licensing or otherwise, please [contact us](https://geco-group.org/contact/) or [report an issue](https://github.com/stineb/agds_book/issues). 
Any feedback, positive or negative, is welcome.\n\n<a rel=\"license\" href=\"http://creativecommons.org/licenses/by-nc/4.0/\"><img src=\"https://i.creativecommons.org/l/by-nc/4.0/88x31.png\" alt=\"Creative Commons License\" style=\"border-width:0\"/></a><br />This work is licensed under a <a rel=\"license\" href=\"http://creativecommons.org/licenses/by-nc/4.0/\">Creative Commons Attribution-NonCommercial 4.0 International License</a>.\n\n## How to cite this book {-}\n\nBenjamin Stocker, Koen Hufkens, Pepa Arán, & Pascal Schneider. (2023). Applied Geodata Science (v1.0). Zenodo. [![DOI](https://zenodo.org/badge/569245031.svg)](https://zenodo.org/badge/latestdoi/569245031)\n\n<br>\n\n![](./figures/logo_unibern_squid3.png){width=\"30%\"} ![](./figures/geco_logo_fullname.png){width=\"30%\"}\n",
"supporting": [],
"filters": [
"rmarkdown/pagebreak.lua"
],
"includes": {},
"engineDependencies": {},
"preserve": {},
"postProcess": true
}
}