improve data cleaning code for BirdLife data (PR #59)
# aoh 0.0.2.14

- Update `create_spp_info_data()` to make data cleaning functionality more
  robust for the BirdLife species' range dataset.
- Update built-in helper script for processing area of habitat data to
  include (i) mammal species with terrestrial and freshwater distributions and
  (ii) mammal species with terrestrial and marine distributions
  (see `inst/scripts/aoh-data.R`)
- New built-in helper script to download all species identifiers from the
  IUCN Red List (see `inst/scripts/iucn-species-list.R`)
jeffreyhanson authored Aug 20, 2024
1 parent f1b6970 commit 5910162
Showing 285 changed files with 25,165 additions and 5,637 deletions.
1 change: 1 addition & 0 deletions .Rbuildignore
@@ -51,6 +51,7 @@ frc-data-birds-part-5.Rout
frc-data-birds-part-6.Rout
frc-data-mammals.Rout
frc-data-reptiles.Rout
iucn-species-list.csv
^aoh.R$
^customization.R$
codecov.yml
3 changes: 3 additions & 0 deletions .github/workflows/documentation.yaml
@@ -93,6 +93,9 @@ jobs:
run: |
result <- urlchecker::url_check()
result <- result[!startsWith(result$URL, "https://doi.org/"), , drop = FALSE]
result <- result[!startsWith(result$URL, "https://land.copernicus.eu"), , drop = FALSE]
result <- result[!startsWith(result$URL, "https://www.iucnredlist.org"), , drop = FALSE]
result <- result[!startsWith(result$URL, "https://lpdaac.usgs.gov"), , drop = FALSE]
if (nrow(result) > 0) {
print(result)
stop("Invalid URLs detected")
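To illustrate the filtering step in this workflow hunk, here is a minimal sketch in base R. It assumes only that the check result is a data frame with a `URL` column, as the workflow code implies. The rows below are made up for illustration, and `urlchecker` itself is not called.

```r
# Mock of a URL-check result; only the `URL` column name comes from the
# workflow above, the rows themselves are invented for this sketch.
result <- data.frame(
  URL = c(
    "https://doi.org/10.0000/example",
    "https://www.iucnredlist.org/resources",
    "https://example.com/broken-link"
  ),
  stringsAsFactors = FALSE
)

# Prefixes of the hosts that the workflow excludes from the check
skip <- c(
  "https://doi.org/",
  "https://land.copernicus.eu",
  "https://www.iucnredlist.org",
  "https://lpdaac.usgs.gov"
)

# Drop every row whose URL starts with one of the prefixes;
# `drop = FALSE` keeps `result` a data frame even with a single column
for (prefix in skip) {
  result <- result[!startsWith(result$URL, prefix), , drop = FALSE]
}

print(result$URL)
```

Without `drop = FALSE`, subsetting a one-column data frame by rows would collapse it to a plain vector, and the later `nrow(result)` check would no longer behave as intended.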
1 change: 1 addition & 0 deletions .gitignore
@@ -50,6 +50,7 @@ frc-data-birds-part-5.Rout
frc-data-birds-part-6.Rout
frc-data-mammals.Rout
frc-data-reptiles.Rout
iucn-species-list.csv

# system files
.directory
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,6 +1,6 @@
Package: aoh
Type: Package
-Version: 0.0.2.13
+Version: 0.0.2.14
Title: Create Area of Habitat Data
Description: Create Area of Habitat data to characterize species distributions.
Data are produced following procedures outlined by Brooks et al. (2019)
@@ -73,7 +73,7 @@ SystemRequirements: GDAL (>= 3.0.2) (optional), PROJ (>= 7.2.0) (optional)
URL: https://prioritizr.github.io/aoh/, https://github.com/prioritizr/aoh
BugReports: https://github.com/prioritizr/aoh/issues
VignetteBuilder: knitr
-RoxygenNote: 7.3.1
+RoxygenNote: 7.3.2
Collate:
'internal.R'
'calc_spp_frc_data.R'
16 changes: 12 additions & 4 deletions Makefile
@@ -38,7 +38,9 @@ prep_lumbierres_habitat_data: inst/scripts/lumbierres-habitat-data.R
R CMD BATCH --no-restore --no-save inst/scripts/lumbierres-habitat-data.R

# process aoh data
-aoh_global_data: aoh_amphibians aoh_mammals aoh_reptiles aoh_birds
+aoh_global_data: aoh_amphibians aoh_mammals aoh_reptiles aoh_birds aoh_mammals

+aoh_mammals: aoh_mammals_land aoh_mammals_land_freshwater aoh_mammals_land_marine

aoh_amphibians:
R CMD BATCH --no-restore --no-save '--args amphibians' inst/scripts/aoh-data.R aoh-data-amphibians.Rout
@@ -51,8 +53,14 @@ aoh_birds:
R CMD BATCH --no-restore --no-save '--args birds-part-5' inst/scripts/aoh-data.R aoh-data-birds-part-5.Rout
R CMD BATCH --no-restore --no-save '--args birds-part-6' inst/scripts/aoh-data.R aoh-data-birds-part-6.Rout

-aoh_mammals:
-R CMD BATCH --no-restore --no-save '--args mammals' inst/scripts/aoh-data.R aoh-data-mammals.Rout
+aoh_mammals_land:
+R CMD BATCH --no-restore --no-save '--args mammals-land' inst/scripts/aoh-data.R aoh-data-mammals-land.Rout

+aoh_mammals_land_freshwater:
+R CMD BATCH --no-restore --no-save '--args mammals-land-freshwater' inst/scripts/aoh-data.R aoh-data-mammals-land-freshwater.Rout

+aoh_mammals_land_marine:
+R CMD BATCH --no-restore --no-save '--args mammals-land-marine' inst/scripts/aoh-data.R aoh-data-mammals-land-marine.Rout

aoh_reptiles:
R CMD BATCH --no-restore --no-save '--args reptiles' inst/scripts/aoh-data.R aoh-data-reptiles.Rout
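Each recipe above passes a dataset name to the script after `--args`. As a rough sketch of how a script could pick that value up, the base R snippet below reads it with `commandArgs(trailingOnly = TRUE)`. The helper name `parse_dataset_arg()` is hypothetical and the list of valid names only covers the values visible in this diff, so this is not necessarily how `inst/scripts/aoh-data.R` handles its argument.

```r
# Hypothetical helper (not part of the aoh package): validate the single
# argument that follows `--args` on the R CMD BATCH command line.
parse_dataset_arg <- function(args = commandArgs(trailingOnly = TRUE)) {
  # Dataset names that appear in the Makefile hunks shown in this diff
  valid <- c(
    "amphibians",
    "birds-part-5",
    "birds-part-6",
    "mammals-land",
    "mammals-land-freshwater",
    "mammals-land-marine",
    "reptiles"
  )
  if (length(args) != 1L) {
    stop("expected exactly one dataset argument after --args")
  }
  if (!args %in% valid) {
    stop("unknown dataset: ", args)
  }
  args
}

# Example call with an explicit value instead of the real command line
dataset <- parse_dataset_arg("mammals-land-marine")
print(dataset)
```

Exact matching is used on purpose: the old `mammals` value is a prefix of all three new names, so partial matching would be ambiguous.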
@@ -132,4 +140,4 @@ purl_vigns:
R --slave -e "lapply(dir('vignettes', '^.*\\\\.Rmd$$'), function(x) knitr::purl(file.path('vignettes', x), gsub('.Rmd', '.R', x, fixed = TRUE)))"
rm -f Rplots.pdf

-.PHONY: initc vigns clean data docs readme site test check checkwb build purl_vigns install man spellcheck examples prep_habitat_data prep_elevation_data aoh_reptiles aoh_mammals aoh_birds aoh_amphibians aoh_global_data frc_reptiles frc_mammals frc_birds frc_amphibians frc_global_data
+.PHONY: initc vigns clean data docs readme site test check checkwb build purl_vigns install man spellcheck examples prep_habitat_data prep_elevation_data aoh_reptiles aoh_mammals aoh_mammals_land aoh_mammals_land_freshwater aoh_mammals_land_marine aoh_birds aoh_amphibians aoh_global_data frc_reptiles frc_mammals frc_birds frc_amphibians frc_global_data
11 changes: 11 additions & 0 deletions NEWS.md
@@ -1,3 +1,14 @@
# aoh 0.0.2.14

- Update `create_spp_info_data()` to make data cleaning functionality more
robust for the BirdLife species' range dataset.
- Update built-in helper script for processing area of habitat data to
include (i) mammal species with terrestrial and freshwater distributions and
(ii) mammal species with terrestrial and marine distributions
(see `inst/scripts/aoh-data.R`)
- New built-in helper script to download all species identifiers from the
IUCN Red List (see `inst/scripts/iucn-species-list.R`)

# aoh 0.0.2.13

- Update `read_spp_range_data()` and `create_spp_info_data()` to fix
22 changes: 16 additions & 6 deletions R/clean_spp_range_data.R
@@ -402,28 +402,38 @@ clean_spp_range_data <- function(x,
# step 6: convert MULTISURFACE to MULTIPOLYGON
x <- sf::st_set_precision(x, geometry_precision)
idx <- which(vapply(sf::st_geometry(x), inherits, logical(1), "MULTISURFACE"))
-if (length(idx) > 0) { # nocov start
+# nocov start
+if (length(idx) > 0) {
g <- sf::st_geometry(x)
-g2 <- lapply(g[idx], sf::st_cast, "MULTIPOLYGON")
+g2 <- g[idx]
+g2 <- lapply(g2, sf::st_cast, "MULTIPOLYGON")
g2 <- lapply(g2, sf::st_buffer, 0)
g2 <- lapply(g2, sf::st_make_valid)
for (i in seq_along(idx)) {
g[[idx[[i]]]] <- g2[[i]]
}
x <- sf::st_set_geometry(x, g)
rm(g, g2)
-} # nocov end
+}
+# nocov end

# force construction of object, this seems to be needed for some reason
# that I do not understand, otherwise st_collection_extract() throws
# an error
x <- x[seq_len(nrow(x)), , drop = FALSE]
x <- suppressWarnings(sf::st_collection_extract(x, "POLYGON"))
invisible(gc())

# step 7: fix any potential geometry issues
x <- st_repair_geometry(x, geometry_precision)
invisible(gc())

# step 8: wrap geometries to dateline
x <- sf::st_set_precision(x, geometry_precision)
-x <- suppressWarnings(sf::st_wrap_dateline(x,
-options = c("WRAPDATELINE=YES", "DATELINEOFFSET=180"))
+x <- suppressWarnings(
+sf::st_wrap_dateline(
+x,
+options = c("WRAPDATELINE=YES", "DATELINEOFFSET=180")
+)
)
invisible(gc())
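As a small illustration of step 8, the sketch below applies `sf::st_wrap_dateline()` with the same options to a single line whose endpoints lie on either side of the antimeridian. It assumes the `sf` package is installed; the coordinates are illustrative and are not taken from the package data.

```r
library(sf)

# A line with one endpoint at 179 degrees west and the other at 179 degrees
# east (illustrative coordinates, longitude/latitude in EPSG:4326)
line <- st_sfc(st_linestring(rbind(c(-179, 0), c(179, 0))), crs = 4326)

# Same options as in the hunk above; warnings are suppressed in the same way
wrapped <- suppressWarnings(
  st_wrap_dateline(
    line,
    options = c("WRAPDATELINE=YES", "DATELINEOFFSET=180")
  )
)

# Compare the geometry type before and after wrapping
print(st_geometry_type(line))
print(st_geometry_type(wrapped))
```

Note that `st_wrap_dateline()` only accepts geographic (longitude/latitude) coordinates, which is why the sketch sets the coordinate reference system explicitly.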

2 changes: 1 addition & 1 deletion R/st_repair_geometry.R
@@ -153,7 +153,7 @@ st_repair_geometry <- function(x, geometry_precision = 1e5) {
requireNamespace("prepr", quietly = TRUE),
msg = paste(
"the \"prepr\" package needs to be installed, use: \n",
-"remotes::install_github(\"dickoa/prepr\")"
+"remotes::install_github(\"prioritizr/prepr\")"
)
)
### find geometries to repair
9 changes: 8 additions & 1 deletion README.Rmd
@@ -26,7 +26,14 @@ knitr::opts_chunk$set(
[![Coverage Status](https://img.shields.io/codecov/c/github/prioritizr/aoh?label=Coverage)](https://app.codecov.io/gh/prioritizr/aoh/branch/master)

```{r, include = FALSE}
# load developmental version of package
devtools::load_all()
# check if being prepared for website
## see https://github.com/r-lib/pkgdown/blob/main/R/pkgdown.R
in_pkgdown <- function() {
identical(Sys.getenv("IN_PKGDOWN"), "true")
}
```
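As an aside to this hunk, the following base R sketch shows how the `in_pkgdown()` helper can drive the figure width used later in the "map" chunk. The wrapper `figure_width()` is a hypothetical name introduced only for this illustration, and the environment variable is set by hand here to imitate a pkgdown build.

```r
# Helper from the hunk above: TRUE only when the IN_PKGDOWN environment
# variable is set to "true"
in_pkgdown <- function() {
  identical(Sys.getenv("IN_PKGDOWN"), "true")
}

# Hypothetical wrapper around the expression used for `out.width`
figure_width <- function() {
  ifelse(isTRUE(in_pkgdown()), "60%", "90%")
}

# Imitate a website build, then a plain README render
Sys.setenv(IN_PKGDOWN = "true")
width_site <- figure_width()
Sys.unsetenv("IN_PKGDOWN")
width_readme <- figure_width()

print(c(site = width_site, readme = width_readme))
```

This matches the change to `README.md` later in the diff, where the rendered image width switches from 60% to 90% because the README is knitted outside a pkgdown build.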

### Overview
@@ -180,7 +187,7 @@ print(spp_aoh_rasters)

Finally, let's create some maps to compare the range data with the Area of habitat data.

```{r "map", message = FALSE, warning = FALSE, results = "hide", dpi = 200, fig.width = 5.5, fig.height = 4, out.width = ifelse(isTRUE(knitr::is_html_output(excludes = c("markdown"))), "60%", "90%")}
```{r "map", message = FALSE, warning = FALSE, results = "hide", dpi = 200, fig.width = 5.5, fig.height = 4, out.width = ifelse(isTRUE(in_pkgdown()), "60%", "90%")}
# create maps
## N.B. you might need to install the ggmap package
map <-
4 changes: 2 additions & 2 deletions README.md
@@ -354,7 +354,7 @@ map <-
print(map)
```

-<img src="man/figures/README-map-1.png" width="60%" style="display: block; margin: auto;" />
+<img src="man/figures/README-map-1.png" width="90%" style="display: block; margin: auto;" />

### Citation

@@ -366,7 +366,7 @@ produce Area of Habitat data.
relevant data using:

Hanson JO (2024) aoh: Create Area of Habitat Data. R package version
-0.0.2.12. Available at https://github.com/prioritizr/aoh.
+0.0.2.14. Available at https://github.com/prioritizr/aoh.

IUCN [insert year] IUCN Red List of Threatened Species. Version
[insert version]. Available at www.iucnredlist.org.
3 changes: 3 additions & 0 deletions _pkgdown.yml
@@ -1,8 +1,11 @@
url: https://prioritizr.github.io/aoh

authors:
Jeffrey O Hanson:
href: http://jeffrey-hanson.com

template:
bootstrap: 5
params:
bootswatch: flatly

130 changes: 47 additions & 83 deletions docs/404.html
