-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add engine-choice as a standard functionality to all indicator calc functions #69
Comments
Here's a small reprex showing use of - ## load sample datasets
library(terra)
rc <- rast("../../Om/test/classified_2000.tif")
roi <- vect("../../Om/test/roi_grids.gpkg")
## area raster
crop <- terra::crop(rc, roi[1, ])
mask <- terra::mask(crop, roi[1, ])
area <- terra::cellSize(mask, unit = "km")
## area of a single aoi
polygon_area <- terra::expanse(roi[1, ], "km")
polygon_area
# [1] 4.988084 Note The values 0 and 1 in column max indicates whether the polygon is affected or not by hurricane & raster resolution is ~111 m. The result from ## zonal
patches_zonal <- terra::zonal(area, mask, "sum")
patches_zonal
# max area
# 1 0 1.189579
# 2 1 4.384729 In case of ## terra extract on area raster
patches_terra <- terra::extract(area, roi[1, ], sum)
patches_terra
# ID area
# 1 1 4.962497 The probable workaround to get area from ## terra extract on mask layer
mask_terra <- terra::extract(mask, roi[1, ])
mask_terra <- aggregate(. ~ max, data = mask_terra, FUN = sum)
mask_terra
# max ID
# 1 0 86
# 2 1 352
area_of_polygon = patches_terra[[2]]
ncells = mask_terra[[2]][1] + mask_terra[[2]][2]
area_per_cell <- area_of_polygon/ncells
area_values_0 <- area_per_cell * mask_terra[[2]][1]
area_values_0
# 0.9743715
area_values_1 <- area_per_cell * mask_terra[[2]][2]
area_values_1
# 3.988125
extract_area <- area_values_0 + area_values_1
extract_area
# 4.962497 Result (area comparison) zonal_area <- sum(patches_zonal$area)
result <- data.frame(polygon_area = polygon_area,
zonal_area = zonal_area,
diff_zonal = zonal_area - polygon_area,
extract_area = extract_area,
diff_extract = extract_area - polygon_area)
result
So, I think the use of engine choices also depends on the complexity of the codes/routines. PS: I am not sure if we have much easier way to compute areal stats using |
It would be more helpful if examples like above include data we all have access to. The current way how the different engines are implemented is shown with the example below. As we can see, there seems to be no difference in the area estimates between the two options using terra and only a slight difference using exactextractr. In my view it boils down to a performance question, that is which implementation is the fastest. Please also note that this is only an area estimation because the original data is unprojected. Again, this is a trade-off between accuracy and performance. library(mapme.biodiversity)
library(sf)
#> Linking to GEOS 3.10.1, GDAL 3.4.0, PROJ 8.2.0; sf_use_s2() is TRUE
library(terra)
#> terra 1.5.30
library(exactextractr)
library(tibble)
clay <- system.file("res", "soilgrids", "clay_0-5cm_mean.tif",
package = "mapme.biodiversity") |>
rast()
aoi <- system.file("extdata", "sierra_de_neiba_478140_2.gpkg",
package = "mapme.biodiversity")
# slightly different area estimates between sf and terra for unprojected geometries
aoi_sf <- read_sf(aoi)
st_area(aoi_sf)
#> 183491611 [m^2]
aoi_terra <- vect(aoi)
expanse(aoi_terra)
#> [1] 182921585
# retrieve areal statistics using terra::extract
clay_binary <- clay < 350
area <- cellSize(clay, unit = "m")
(result_extract <- extract(area * clay_binary, aoi_terra, sum))
#> ID area
#> 1 1 94329419
# retrive areal statistics via terra::zonal
aoi_zone <- rasterize(aoi_terra, clay)
(result_zonal <- zonal(area * clay_binary, aoi_zone, sum))
#> layer area
#> 1 1 94329419
# retrieve areal statistics via exactextractr:exact_extract
(result_exact <- exact_extract(area * clay_binary, aoi_sf, "sum"))
#> [1] 94455208
# calculate difference in percent
result_zonal$area / result_exact * 100
#> [1] 99.86683 Created on 2022-06-11 by the reprex package (v2.0.1) |
Thanks to both of you for your examples. @goergen95 : your example most probably does not show considerable differences because your AOI is rather big (183.492 sqkm) with a small raster as input (see also my issue about this). So it's not purely about performances when having edge cases (small AOIs or coarse raster resolutions). Not sure how exact extract does it though to get more proximate to the real area... |
Status of engine choices availability in current raster calc. functions:
|
Both the terra extract and terra zonal follow the same approach. |
@Ohm-Np masters thesis will show that choosing the right processing strategy for big geospatial data analysis in R depends amongst others on the spatial resolution of the data and the amount of features to be processed. Different packages, particularly
terra
andexactextract
perform differently in different settings, and it is desirable to have the flexibility to choose the best function for a specific task (in addition to being able to parallelize the processing, as our package currently already allows).So my suggestion is to create a standardized "engine chooser function" that allows people to choose e.g. between
terra zonal
,terrax extract
andexact extract
. Similarly, it could be possible to implement this for vector data, letting people choose betweenterra intersection
andsf intersection
. However, that would be probably more difficult since both work with different object classes. Maybe despite the knowledge that Om's thesis is creating around big data analysis, this could be a central contribution from his thesis to the package.fyi: @goergen95
The text was updated successfully, but these errors were encountered: