Attribute datasets previously downloaded from internet, now included with package
Markus Kainu committed Mar 13, 2024
1 parent 58593a9 commit 18c8c7f
Showing 14 changed files with 172 additions and 21 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,8 +1,8 @@
Type: Package
Package: geofi
Title: Access Finnish Geospatial Data
Version: 1.0.14
Date: 2024-02-05
Version: 1.0.15
Date: 2024-03-13
Authors@R: c(
person("Markus", "Kainu", , "markus.kainu@kapsi.fi", role = c("aut", "cre"),
comment = c(ORCID = "0000-0003-1376-503X")),
4 changes: 4 additions & 0 deletions NEWS.md
@@ -1,5 +1,9 @@
# geofi 1.0.15

+ Attribute datasets previously downloaded from internet, now included with package

# geofi 1.0.14

+ 2024 regional classifications updated.
+ geofi_joining_attribute_data vignette fixed

37 changes: 37 additions & 0 deletions R/data.R
@@ -1192,3 +1192,40 @@
#' }
#' @source \url{https://data.stat.fi/api/classifications/v2/classifications}
"municipality_key_2013"


#' Municipality level population data from Sotkanet
#'
#' This dataset contains population data at municipality level pulled from THL (Sotkanet) from 2000 to 2022
#'
#' @format A data frame with 7107 rows and 3 variables:
#' \describe{
#' \item{municipality_code}{municipality_code}
#' \item{primary.value}{primary.value}
#' \item{year}{year}
#' }
"sotkadata_population"

#' Municipality level Swedish speaking population numbers from Sotkanet
#'
#' This dataset contains Swedish speaking population figures at municipality level pulled from THL (Sotkanet) from 2000 to 2022
#'
#' @format A data frame with 5761 rows and 3 variables:
#' \describe{
#' \item{municipality_code}{municipality_code}
#' \item{indicator.title.fi}{indicator.title.fi}
#' \item{primary.value}{primary.value}
#' }
"sotkadata_swedish_speaking_pop"

#' Zipcode level population data from Statistics Finland
#'
#' This dataset contains population for each zipcode in Finland. Data is downloaded from Statistics Finland
#'
#' @format A data frame with 3027 rows and 2 variables:
#' \describe{
#' \item{posti_alue}{posti_alue}
#' \item{X2022}{X2022}
#' }
"statfi_zipcode_population"

9 changes: 9 additions & 0 deletions _pkgdown.yml
@@ -20,6 +20,7 @@ reference:
contents:
- municipality_central_localities
- municipality_key
- municipality_key_2024
- municipality_key_2023
- municipality_key_2022
- municipality_key_2021
@@ -57,6 +58,13 @@ reference:
- grid_satakunta
- grid_uusimaa
- grid_varsinais_suomi
- title: Attribute data
desc: |
geofi comes with three statistical datasets that are used in vignettes
contents:
- sotkadata_population
- sotkadata_swedish_speaking_pop
- statfi_zipcode_population
- title: Miscellaneous
desc: |
Support functions for API functions and spatial data transformations
@@ -65,6 +73,7 @@ reference:
- check_api_access
- wfs_api
- to_sf

url: https://ropengov.github.io/geofi/
template:
package: rogtemplate
6 changes: 4 additions & 2 deletions cran-comments.md
@@ -1,16 +1,18 @@
## Test environments
* local ubuntu 22.04 install, R 4.3.2
* local ubuntu 22.04 install, R 4.3.3
* win-builder (devel)
* r-hub.io

## Submission note

+ Failing vignette fixed
+ Attribute datasets previously downloaded from internet, now included with package

## R CMD check results

0 errors | 0 warnings | 0 notes



## Downstream dependencies

There are currently no downstream dependencies for this package.
Binary file added data/sotkadata_population.rda
Binary file not shown.
Binary file added data/sotkadata_swedish_speaking_pop.rda
Binary file not shown.
Binary file added data/statfi_zipcode_population.rda
Binary file not shown.
45 changes: 45 additions & 0 deletions inst/extras/vignette_data.R
@@ -0,0 +1,45 @@
# Create datasets to be used in Vignettes

library(sotkanet)
library(dplyr)
sotkadata_population <- GetDataSotkanet(indicators = 127, years = 2000:2022) %>%
  filter(region.category == "KUNTA") %>%
  mutate(municipality_code = as.integer(region.code)) %>%
  select(municipality_code, primary.value, year)

save(sotkadata_population, file = "./data/sotkadata_population.rda",
compress = "bzip2")

karttasovellus::document_data(dat = sotkadata_population,
neim = "sotkadata_population",
description = "This dataset contains population data at municipality level pulled from THL (Sotkanet) from 2000 to 2022")


# ******************
sotkadata_swedish_speaking_pop <- GetDataSotkanet(indicators = 2433, years = 2000:2022) %>%
  filter(region.category == "KUNTA") %>%
  mutate(municipality_code = as.integer(region.code)) %>%
  select(municipality_code, indicator.title.fi, primary.value)

save(sotkadata_swedish_speaking_pop, file = "./data/sotkadata_swedish_speaking_pop.rda",
compress = "bzip2")

karttasovellus::document_data(dat = sotkadata_swedish_speaking_pop,
neim = "sotkadata_swedish_speaking_pop",
description = "This dataset contains Swedish speaking population figures at municipality level pulled from THL (Sotkanet) from 2000 to 2022")


# ******************
library(pxweb)
px_data <- read.csv("https://pxdata.stat.fi:443/PxWeb/sq/43d3d0aa-636e-4a4b-bbe1-decae45fc2b4",
header = TRUE, sep = ";", fileEncoding = "Latin1")
px_data$posti_alue <- sub(" .+$", "", px_data$Postinumeroalue)
statfi_zipcode_population <- px_data %>% select(posti_alue,X2022)

save(statfi_zipcode_population, file = "./data/statfi_zipcode_population.rda",
compress = "bzip2")

karttasovellus::document_data(dat = statfi_zipcode_population,
neim = "statfi_zipcode_population",
description = "This dataset contains population for each zipcode in Finland. Data is downloaded from Statistics Finland")
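
To see what the `save()` calls in this script produce, here is a small round-trip sketch. It is an illustration only: it writes to a temporary file rather than `./data/`, and `toy_population` is a made-up stand-in, not the real Sotkanet data.

```r
# Made-up stand-in with the same columns as sotkadata_population
toy_population <- data.frame(
  municipality_code = c(5L, 9L),
  primary.value     = c(9500, 2500),
  year              = c(2022L, 2022L)
)

# Save with the same compression setting as the script, but to a temp file
rda_path <- tempfile(fileext = ".rda")
save(toy_population, file = rda_path, compress = "bzip2")

# load() restores the object under its original name; a fresh environment
# keeps the restored copy separate from the object in the workspace
restored <- new.env()
loaded_names <- load(rda_path, envir = restored)
```

Because `load()` restores objects under the names they were saved with, the object name chosen in the script becomes the dataset name users see in the package.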

21 changes: 21 additions & 0 deletions man/sotkadata_population.Rd


21 changes: 21 additions & 0 deletions man/sotkadata_swedish_speaking_pop.Rd


20 changes: 20 additions & 0 deletions man/statfi_zipcode_population.Rd


9 changes: 3 additions & 6 deletions vignettes/geofi_datasets.Rmd
@@ -44,7 +44,7 @@ check_namespaces <- function(pkgs){
return(all(unlist(sapply(pkgs, requireNamespace,quietly = TRUE))))
}
apiacc <- geofi::check_api_access()
pkginst <- check_namespaces(c("sotkanet","geofacet","ggplot2","dplyr"))
pkginst <- check_namespaces(c("geofacet","ggplot2","dplyr"))
apiacc_pkginst <- all(apiacc,pkginst)
```
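
As a rough illustration of how the namespace guard in this setup chunk behaves, the sketch below repeats the `check_namespaces()` helper and calls it with package names chosen only for the example. `notARealPackage123` is a hypothetical name assumed not to be installed.

```r
# Same helper as in the setup chunk: TRUE only if every listed
# package namespace can be loaded
check_namespaces <- function(pkgs){
  return(all(unlist(sapply(pkgs, requireNamespace, quietly = TRUE))))
}

# "stats" and "utils" ship with R, so both namespaces are available
all_present <- check_namespaces(c("stats", "utils"))

# A single missing package makes the whole check FALSE
# ("notARealPackage123" is a hypothetical, presumably uninstalled name)
one_missing <- check_namespaces(c("stats", "notARealPackage123"))
```

With `sotkanet` dropped from the list, a missing `sotkanet` installation no longer switches off the guarded chunks.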

@@ -189,14 +189,11 @@ as_tibble(d$results) %>%
print(n = 100)
```

Here is an example where population data at municipality level is pulled from THL from 2000 to 2022, then aggregated at the levels of regions (`maakunta`) and then plotted with ggplot2 using grid `geofi::grid_maakunta`.
Here is an example where population data at municipality level is pulled from THL from 2000 to 2022, then aggregated to the level of regions (`maakunta`) and plotted with ggplot2 using the grid `geofi::grid_maakunta`. The population data is provided as part of the geofi package as `geofi::sotkadata_population`.
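
The join-and-aggregate step can be sketched with toy data. This is a minimal sketch, not the vignette's own code: both data frames and all their values are made up, the region labels are placeholders, the `maakunta_name_fi` column name is an assumption about the municipality key, and base R `merge()` and `aggregate()` stand in for the dplyr verbs used in the vignette.

```r
# Toy municipality key: maps municipality codes to regions (made-up values)
key <- data.frame(
  municipality_code = c(1L, 2L, 3L),
  maakunta_name_fi  = c("Region A", "Region A", "Region B")
)

# Toy population data in the same shape as geofi::sotkadata_population
pop <- data.frame(
  municipality_code = c(1L, 2L, 3L, 1L, 2L, 3L),
  primary.value     = c(100, 200, 300, 110, 190, 310),
  year              = c(2021L, 2021L, 2021L, 2022L, 2022L, 2022L)
)

# Left join: keep every row of the key and attach the population rows
joined <- merge(key, pop, by = "municipality_code", all.x = TRUE)

# Sum the municipality populations by region and year
regional <- aggregate(primary.value ~ maakunta_name_fi + year,
                      data = joined, FUN = sum)
```

The result has one row per region and year, which is the shape a faceted plot over a regional grid needs.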

```{r geofacet, fig.height = 8, fig.width = 10, eval = apiacc_pkginst}
# Let's pull population data from THL
library(sotkanet)
sotkadata <- GetDataSotkanet(indicators = 127, years = 2000:2022) %>%
filter(region.category == "KUNTA") %>%
mutate(municipality_code = as.integer(region.code))
sotkadata <- geofi::sotkadata_population
# Let's aggregate population data
dat <- left_join(geofi::municipality_key_2023 %>% select(-year),
17 changes: 6 additions & 11 deletions vignettes/geofi_joining_attribute_data.Rmd
@@ -45,7 +45,7 @@ check_namespaces <- function(pkgs){
return(all(unlist(sapply(pkgs, requireNamespace,quietly = TRUE))))
}
apiacc <- geofi::check_api_access()
pkginst <- check_namespaces(c("sotkanet","dplyr","tidyr","janitor","ggplot2"))
pkginst <- check_namespaces(c("dplyr","tidyr","janitor","ggplot2"))
apiacc_pkginst <- all(apiacc,pkginst)
```

@@ -56,17 +56,14 @@ Municipality data provided by `get_municipalities()`-function contains 77 indica

### Population data from Sotkanet

In this first example we join municipality level indicators of *Swedish-speaking population at year end* from Sotkanet [population data](https://sotkanet.fi/sotkanet/en/haku?g=219),
In this first example we join municipality level indicators of *Swedish-speaking population at year end* from Sotkanet [population data](https://sotkanet.fi/sotkanet/en/haku?g=219). The dataset is provided as part of the geofi package as `geofi::sotkadata_swedish_speaking_pop`.

```{r municipality_map, eval = apiacc_pkginst}
library(geofi)
muni <- get_municipalities(year = 2023)
library(sotkanet)
library(dplyr)
sotkadata_swedish_speaking_pop <- GetDataSotkanet(indicators = 2433, years = 2000:2022) %>%
filter(region.category == "KUNTA") %>%
mutate(municipality_code = as.integer(region.code))
sotkadata_swedish_speaking_pop <- geofi::sotkadata_swedish_speaking_pop
```

It may not be obvious, but the municipality names in Finnish are available among other regional breakdowns, which allows us to combine the data with spatial data using the `municipality_name_fi` variable.
@@ -92,12 +89,10 @@ map_data %>%

## Zipcode level

You can download data from [Paavo (Open data by postal code area)](https://pxdata.stat.fi/PXWeb/pxweb/en/Postinumeroalueittainen_avoin_tieto/) using [`pxweb`](https://ropengov.github.io/pxweb/)-package. In this example we will download preformatted population data in `csv` format directly from Statistics Finland and process it to match with spatial zipcode data.
You can download data from [Paavo (Open data by postal code area)](https://pxdata.stat.fi/PXWeb/pxweb/en/Postinumeroalueittainen_avoin_tieto/) using the [`pxweb`](https://ropengov.github.io/pxweb/) package. In this example we use a dataset that can be downloaded preformatted in `csv` format directly from Statistics Finland. The population data is provided as part of the geofi package as `geofi::statfi_zipcode_population`.

```{r zipcode_with_statistics_finland, eval = apiacc_pkginst}
px_data <- read.csv("https://pxdata.stat.fi:443/PxWeb/sq/43d3d0aa-636e-4a4b-bbe1-decae45fc2b4",
header = TRUE, sep = ";", fileEncoding = "Latin1")
px_data$posti_alue <- sub(" .+$", "", px_data$Postinumeroalue)
statfi_zipcode_population <- geofi::statfi_zipcode_population
```

Before we can join the data, we must extract the numerical postal code from `postal_code_area`-variable.
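
For illustration, the extraction amounts to dropping everything from the first space onward, which is what the `sub()` call used to prepare `statfi_zipcode_population` does. The label strings below are made-up examples of an assumed `<postal code> <area name>` layout, not values copied from the Statistics Finland file.

```r
# Hypothetical area labels in the assumed "<postal code> <area name>" layout
labels <- c("00100 Helsinki keskusta", "33100 Tampere Keskus")

# Remove the first space and everything after it, leaving only the code.
# The result stays a character vector, so leading zeros are preserved.
posti_alue <- sub(" .+$", "", labels)
```

Keeping the codes as character strings matters here, since converting them to numbers would drop the leading zeros.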
@@ -106,7 +101,7 @@ Before we can join the data, we must extract the numerical postal code from `pos
# Let's join with spatial data and plot the area of each zipcode
zipcodes19 <- get_zipcodes(year = 2019)
zipcodes_map <- left_join(zipcodes19,
px_data)
statfi_zipcode_population)
ggplot(zipcodes_map) +
geom_sf(aes(fill = X2022),
color = alpha("white", 1/3)) +
