Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use variables as column names #45

Open
wants to merge 50 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
50 commits
Select commit Hold shift + click to select a range
71a7b52
Add arguments "type", "debug" to get_stations()
hsonne Sep 24, 2023
1db01fc
Make "type" the main (first) argument
hsonne Sep 24, 2023
7e00d07
Merge pull request #41 from KWB-R/improve-get-stations
hsonne Sep 24, 2023
0809baa
Add test to check column names!
hsonne Sep 24, 2023
c58db16
Add tests for three functions
hsonne Sep 24, 2023
e67b87e
Improve/fix error message of assert_date()
hsonne Sep 24, 2023
f660e09
Merge branch 'dev' into add-tests
Sep 24, 2023
563f326
Add test for clean_timestamp_columns()
hsonne Sep 24, 2023
16f83e8
Clean test script
hsonne Sep 24, 2023
d304fc4
Use new argument "type", do not use variable
hsonne Sep 24, 2023
a097212
Add arg "stations_list" to get_groundwater_data()
hsonne Sep 24, 2023
1de92a3
Use get_stations() with arg "type" if possible
hsonne Sep 24, 2023
808d4d3
Merge branch 'dev' into add-tests
Sep 24, 2023
548c93d
Set origin in as.Date()
hsonne Sep 25, 2023
44554c5
Add more tests
hsonne Sep 25, 2023
8ef8859
Merge pull request #42 from KWB-R/add-tests
hsonne Sep 25, 2023
b7b7298
Add more tests
hsonne Sep 26, 2023
6664148
Merge pull request #44 from KWB-R/add-tests
hsonne Oct 2, 2023
4e53e8f
Fix :bug: in tutorial.Rmd: correct element name
hsonne Oct 2, 2023
cc755a2
Load the pipe operator
hsonne Oct 2, 2023
a9e44f4
Separate failing test into three single tests
hsonne Oct 2, 2023
f3efeb9
Check for missing header row
hsonne Oct 2, 2023
b84a98c
Ignore measurement columns in test
hsonne Oct 2, 2023
a0fb799
Remove the failing test, use alternative below!
hsonne Oct 2, 2023
2d7fe90
Handle NA in get_non_external_station_ids()
hsonne Oct 10, 2023
1a1fa7f
Allow read_wasserportal_raw() to fail
hsonne Oct 10, 2023
59f52c0
Add variables as list names in merge_raw_results_single(), so they ca…
ma-z-am Mar 11, 2024
4e0a730
Fix GW level and quality
mrustl Apr 3, 2024
c67bce5
Fix test
mrustl Apr 3, 2024
4cac918
Add surface water quality download function
mrustl Apr 4, 2024
c71546d
Fix to get daily SW data working again!
mrustl Apr 4, 2024
8efabcf
Fix DWC logo
mrustl Apr 5, 2024
614760b
Fix example for new API
mrustl Apr 5, 2024
b40f9e9
Set default < 1900 as oldest GW level data range to 1869-10-14
mrustl Apr 5, 2024
84479d7
Add zip export for SW quality
mrustl Apr 5, 2024
1da0895
Use "shortcuts" to functions from kwb.utils
hsonne Apr 5, 2024
95797be
Fix Roxygen import directive
hsonne Apr 5, 2024
a9420a7
Improve get_station_variables()
hsonne Apr 5, 2024
0ad8d1c
Improve read_wasserportal_raw()
hsonne Apr 5, 2024
3eb7ea6
Improve names and errors in read_wasserportal()
hsonne Apr 5, 2024
4073f15
Add and improve test functions
hsonne Apr 5, 2024
e73f713
Use new function url_parameter_string()
hsonne Apr 5, 2024
29b6268
Add test for stop_if_not_all_in()
hsonne Apr 5, 2024
999bca2
Extract and reuse split_into_lines()
hsonne Apr 5, 2024
0c3f94d
Fix and test split_into_lines()
hsonne Apr 5, 2024
e40f0aa
Extract get_text_response_of_httr_post_request()
hsonne Apr 5, 2024
f2b02ae
Use "shortcut" to kwb.utils::catAndRun()
hsonne Apr 5, 2024
e2632a3
Use more "shortcuts" to kwb.utils' functions
hsonne Apr 5, 2024
6b6563f
Merge pull request #46 from KWB-R/clean
hsonne Apr 5, 2024
50b1df3
Merge branch 'dev' into parameters_as_rownames
hsonne Apr 5, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
60 changes: 26 additions & 34 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,29 +1,20 @@
Package: wasserportal
Title: R Package with Functions for Scraping Data of
Wasserportal Berlin
Version: 0.3.0
Authors@R:
c(person(given = "Hauke",
family = "Sonnenberg",
role = "aut",
email = "hauke.sonnenberg@kompetenz-wasser.de",
comment = c(ORCID = "0000-0001-9134-2871")),
person(given = "Michael",
family = "Rustler",
role = c("aut","cre"),
email = "michael.rustler@kompetenz-wasser.de",
comment = c(ORCID = "0000-0003-0647-7726")),
person(given = "DWC",
role = "fnd"),
person(given = "IMPETUS",
role = "fnd"),
person(given = "PROMISCES",
role = "fnd"),
person(given = "Kompetenzzentrum Wasser Berlin gGmbH (KWB)",
role = "cph"))
Description: R Package with Functions for Scraping Data of
Wasserportal Berlin (https://wasserportal.berlin.de), which contains
real-time data of surface water and groundwater monitoring stations.
Title: R Package with Functions for Scraping Data of Wasserportal Berlin
Version: 0.4.0
Authors@R: c(
person("Hauke", "Sonnenberg", , "hauke.sonnenberg@kompetenz-wasser.de", role = "aut",
comment = c(ORCID = "0000-0001-9134-2871")),
person("Michael", "Rustler", , "michael.rustler@kompetenz-wasser.de", role = c("aut", "cre"),
comment = c(ORCID = "0000-0003-0647-7726")),
person("AD4GD", role = "fnd"),
person("DWC", role = "fnd"),
person("IMPETUS", role = "fnd"),
person("PROMISCES", role = "fnd"),
person("Kompetenzzentrum Wasser Berlin gGmbH (KWB)", role = "cph")
)
Description: R Package with Functions for Scraping Data of Wasserportal
Berlin (https://wasserportal.berlin.de), which contains real-time data
of surface water and groundwater monitoring stations.
License: MIT + file LICENSE
URL: https://github.com/KWB-R/wasserportal
BugReports: https://github.com/KWB-R/wasserportal/issues
Expand All @@ -48,28 +39,29 @@ Suggests:
covr,
DT,
forcats,
htmlwidgets,
janitor,
jsonlite,
leaflet,
ggplot2,
gridExtra,
htmltools,
htmlwidgets,
janitor,
jsonlite,
knitr,
kwb.pkgbuild,
leaflet,
openxlsx,
plotly,
rmarkdown,
sf,
tidyselect,
testthat (>= 3.0.0)
testthat (>= 3.0.0),
tidyselect
VignetteBuilder:
knitr
Remotes:
github::kwb-r/kwb.datetime,
github::kwb-r/kwb.pkgbuild,
github::kwb-r/kwb.utils
Config/testthat/edition: 3
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.1
VignetteBuilder: knitr
Config/testthat/edition: 3
RoxygenNote: 7.3.1
2 changes: 1 addition & 1 deletion LICENSE
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2021-2022 Kompetenzzentrum Wasser Berlin gGmbH (KWB)
Copyright (c) 2021-2024 Kompetenzzentrum Wasser Berlin gGmbH (KWB)

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
Expand Down
2 changes: 1 addition & 1 deletion LICENSE.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# MIT License

Copyright (c) 2021-2022 Kompetenzzentrum Wasser Berlin gGmbH (KWB)
Copyright (c) 2021-2024 Kompetenzzentrum Wasser Berlin gGmbH (KWB)

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
Expand Down
3 changes: 3 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,8 @@ export(get_groundwater_options)
export(get_overview_options)
export(get_station_variables)
export(get_stations)
export(get_surfacewater_qualities)
export(get_surfacewater_quality)
export(get_surfacewater_variables)
export(get_wasserportal_master_data)
export(get_wasserportal_masters_data)
Expand Down Expand Up @@ -41,6 +43,7 @@ importFrom(dplyr,select_if)
importFrom(fs,dir_create)
importFrom(httr,POST)
importFrom(httr,content)
importFrom(httr,http_error)
importFrom(kwb.datetime,textToEuropeBerlinPosix)
importFrom(kwb.utils,catAndRun)
importFrom(kwb.utils,getAttribute)
Expand Down
9 changes: 9 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,12 @@
# [wasserportal 0.4.0](https://github.com/KWB-R/wasserportal/releases/tag/v0.4.0) <small>2024-04-05</small>

* New feature: add support for downloading all available surface water quality
data for one or multiple monitoring stations. For details see `get_surfacewater_qualities()`
* Bugfix for groundwater level and quality due to new Wasserportal API
* Add project [AD4GD](https://www.kompetenz-wasser.de/de/forschung/projekte/ad4gd)
as funder


# [wasserportal 0.3.0](https://github.com/KWB-R/wasserportal/releases/tag/v0.3.0) <small>2023-02-19</small>

* Fix errors in GitHub actions: use actions from branches `v2`, `v3`, not from
Expand Down
102 changes: 52 additions & 50 deletions R/.test-surface-water_download.R
Original file line number Diff line number Diff line change
@@ -1,51 +1,53 @@
if (FALSE) {
library(wasserportal)

stations <- wasserportal::get_stations()
stations_crosstable <- stations$crosstable

stations_crosstable_bb <- stations_crosstable %>%
dplyr::filter(stringr::str_detect(.data$Messstellennummer,
pattern = "^[A-Z]{2}_"))

stations_crosstable_berlin <- stations_crosstable %>%
dplyr::filter(stringr::str_detect(.data$Messstellennummer,
pattern = "^[A-Z]{2}_",
negate = TRUE))



station_crosstable_berlin <- stations_crosstable_berlin[1,]
stations_crosstable_berlin
from_date <- "1900-01-01"
sw_station_berlin_daily <- wasserportal::read_wasserportal_raw(
station = station_crosstable_berlin$Messstellennummer,
variable = get_station_variables(stations_crosstable_berlin)[1],
type = "daily",
from_date = from_date,
include_raw_time = TRUE,
stations_crosstable = stations_crosstable
)

str(sw_station_berlin_daily)



sw_stations_berlin_daily <- stats::setNames(lapply(stations_crosstable_berlin$Messstellennummer,
function(station) {
msg <- sprintf("Fetching data for station '%s'", station)
kwb.utils::catAndRun(msg, expr = {
wasserportal::read_wasserportal(
station = station,
type = "daily",
from_date = from_date,
include_raw_time = TRUE,
stations_crosstable = stations_crosstable
)})}
), nm = stations_crosstable$Messstellennummer)

str(sw_stations_daily)


if (FALSE)
{
`%>%` <- magrittr::`%>%`

stations_crosstable <- wasserportal::get_stations(type = "crosstable")

stations_crosstable_bb <- stations_crosstable %>%
dplyr::filter(stringr::str_detect(
.data$Messstellennummer,
pattern = "^[A-Z]{2}_"
))

stations_crosstable_berlin <- stations_crosstable %>%
dplyr::filter(stringr::str_detect(
.data$Messstellennummer,
pattern = "^[A-Z]{2}_",
negate = TRUE
))

stations_crosstable_berlin

from_date <- "1900-01-01"

sw_station_berlin_daily <- wasserportal::read_wasserportal_raw(
station = stations_crosstable_berlin[1L, ] %>%
kwb.utils::selectColumns("Messstellennummer"),
variable = wasserportal::get_station_variables(stations_crosstable_berlin)[1],
type = "daily",
from_date = from_date,
include_raw_time = TRUE,
stations_crosstable = stations_crosstable
)

str(sw_station_berlin_daily)

sw_stations_berlin_daily <- stations_crosstable_berlin %>%
kwb.utils::selectColumns("Messstellennummer") %>%
lapply(function(station) cat_and_run(
sprintf("Fetching data for station '%s'", station),
expr = wasserportal::read_wasserportal(
station = station,
type = "daily",
from_date = from_date,
include_raw_time = TRUE,
stations_crosstable = stations_crosstable
)
)) %>%
stats::setNames(
kwb.utils::selectColumns(stations_crosstable, "Messstellennummer")
)

str(sw_stations_daily)
}

19 changes: 9 additions & 10 deletions R/get_daily_surfacewater_data.R
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,6 @@
#' variables
#' sw_data_daily <- wasserportal::get_daily_surfacewater_data(stations, variables)
#' }
#' @importFrom kwb.utils catAndRun
#' @importFrom dplyr bind_rows filter pull
#' @importFrom stats setNames
get_daily_surfacewater_data <- function(
Expand All @@ -23,17 +22,17 @@ get_daily_surfacewater_data <- function(
)
{
#kwb.utils::assignPackageObjects("wasserportal")
overviews <- kwb.utils::selectElements(stations, "overview_list")
crosstable <- kwb.utils::selectElements(stations, "crosstable")
overviews <- select_elements(stations, "overview_list")
crosstable <- select_elements(stations, "crosstable")

data_frames <- lapply(names(variables), function(variable_name) {

#variable_name <- names(variables)[1L]

kwb.utils::catAndRun(sprintf("Importing '%s'", variable_name), expr = {
cat_and_run(sprintf("Importing '%s'", variable_name), expr = {

# data frame with stations at which <variable_name> is measured
station_data <- kwb.utils::selectElements(overviews, variable_name)
station_data <- select_elements(overviews, variable_name)

# Identifiers of non-external monitoring stations to loop through
station_ids <- get_non_external_station_ids(station_data)
Expand Down Expand Up @@ -87,17 +86,18 @@ get_daily_surfacewater_data <- function(
get_surfacewater_variables <- function()
{
variables <- unlist(get_overview_options())
variables[startsWith(names(variables), "surface")]
variables <- variables[startsWith(names(variables), "surface")]
variables[variables != "opq"]
}

# get_non_external_station_ids -------------------------------------------------
get_non_external_station_ids <- function(station_data)
{
# Function to safely select columns from station_data
pull <- kwb.utils::createAccessor(station_data)
pull <- create_accessor(station_data)

is_external <- is_external_link(pull("stammdaten_link"))
is_berlin <- pull("Betreiber") == "Land Berlin"
is_berlin <- default_if_na(pull("Betreiber"), "") == "Land Berlin"

# Identifiers of monitoring stations to loop through
as.character(pull("Messstellennummer")[is_berlin & !is_external])
Expand All @@ -115,7 +115,6 @@ get_non_external_station_ids <- function(station_data)
#' @importFrom stringr str_detect str_split_fixed
#' @importFrom tibble tibble
#' @importFrom dplyr bind_cols bind_rows
#' @importFrom kwb.utils getAttribute
sw_data_list_to_df <- function (sw_data_list)
{
# Helper function to split parameter string into parameter and unit
Expand All @@ -136,7 +135,7 @@ sw_data_list_to_df <- function (sw_data_list)

# Get its metadata
metadata <- if (!is.null(data)) {
kwb.utils::getAttribute(data, "metadata")
get_attribute(data, "metadata")
} else {
message(sprintf(
"Empty data frame when looping through '%s' in %s",
Expand Down
29 changes: 17 additions & 12 deletions R/get_groundwater_data.R
Original file line number Diff line number Diff line change
Expand Up @@ -2,15 +2,16 @@
#'
#' @description wrapper function to scrape all available raw data, i.e. groundwater
#' level and quality data and save in list
#' @param stations stations list as retrieved by \code{\link{get_stations}}
#' @param stations list as retrieved by \code{\link{get_stations}}.
#' Deprecated. Please use \code{stations_list} instead
#' @param groundwater_options as retrieved by \code{\link{get_groundwater_options}}
#' @param debug print debug messages (default: TRUE)
#'
#' @param stations_list list of station metadata as returned by
#' \code{\link{get_stations}(type = "list")}
#' @return list with elements "groundwater.level" and "groundwater.quality" data
#' frames
#' @export
#' @importFrom stats setNames
#' @importFrom kwb.utils catAndRun
#' @importFrom data.table rbindlist
#' @examples
#' \dontrun{
Expand All @@ -21,30 +22,35 @@
get_groundwater_data <- function(
stations,
groundwater_options = get_groundwater_options(),
debug = TRUE
debug = TRUE,
stations_list = NULL
)
{
#kwb.utils::assignPackageObjects("wasserportal")

if (is.null(stations_list)) {
stations_list <- select_elements(stations, "overview_list")
}

result <- lapply(
X = seq_along(groundwater_options),
FUN = function(i) {
option_key <- groundwater_options[i]
option_name <- names(option_key)
kwb.utils::catAndRun(
cat_and_run(
messageText = sprintf(
"Importing '%s' data (%d/%d)",
option_name, i, length(groundwater_options)
),
dbg = debug,
expr = {
ids <- stations %>%
kwb.utils::selectElements("overview_list") %>%
kwb.utils::selectElements(option_name) %>%
kwb.utils::selectColumns("Messstellennummer")
ids <- stations_list %>%
select_elements(option_name) %>%
select_columns("Messstellennummer")
lapply(
X = ids,
FUN = function(id) {
kwb.utils::catAndRun(
cat_and_run(
sprintf(
"Downloading Messstellennummer '%s' (%d/%d)",
id, which(id == ids), length(ids)
Expand Down Expand Up @@ -75,6 +81,5 @@ get_groundwater_options <- function ()

is_groundwater <- startsWith(names(overview_options), "groundwater")

overview_options[is_groundwater] %>%
gsub(pattern = "gws", replacement = "gwl")
overview_options[is_groundwater]
}
3 changes: 2 additions & 1 deletion R/get_overview_options.R
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,8 @@ get_overview_options <- function()
conductivity = "olf",
ph = "oph",
oxygen_concentration = "oog",
oxygen_saturation = "oos"
oxygen_saturation = "oos",
quality = "opq"
),
groundwater = list(
level = "gws",
Expand Down
Loading
Loading