Commit 7aec3d0

Merge branch 'main' of github.com:geocompx/geocompr

Author: jannes
2 parents ca6ce4c + 3ccdc03

2 files changed: 26 additions and 28 deletions

14-location.Rmd

Lines changed: 24 additions & 26 deletions
Original file line number | Diff line number | Diff line change
@@ -27,7 +27,7 @@ This chapter demonstrates how the skills learned in Parts I and II can be applie
2727
This is a broad field of research and commercial application.
2828
A typical example of geomarketing is where to locate a new shop.
2929
The aim here is to attract most visitors and, ultimately, make the most profit.
30-
There are also many non-commercial applications that can use the technique for public benefit, for example where to locate new health services [@tomintz_geography_2008].
30+
There are also many non-commercial applications that can use the technique for public benefit, for example, where to locate new health services [@tomintz_geography_2008].
3131

3232
People are fundamental to location analysis\index{location analysis}, in particular where they are likely to spend their time and other resources.
3333
Interestingly, ecological concepts and models are quite similar to those used for store location analysis.
@@ -44,7 +44,7 @@ Typical research questions include:
4444
- Do existing services over- or under-utilize the market potential?
4545
- What is the market share of a company in a specific area?
4646

47-
This chapter demonstrates how geocomputation can answer such questions based on a hypothetical case study based on real data.
47+
This chapter demonstrates how geocomputation can answer such questions based on a hypothetical case study and real data.
4848

4949
## Case study: bike shops in Germany {#case-study}
5050

@@ -86,7 +86,7 @@ data("census_de", package = "spDataLarge")
8686
The `census_de` object is a data frame containing 13 variables for more than 360,000 grid cells across Germany.
8787
For our work, we only need a subset of these: Easting (`x`) and Northing (`y`), number of inhabitants (population; `pop`), mean average age (`mean_age`), proportion of women (`women`) and average household size (`hh_size`).
8888
These variables are selected and renamed from German into English in the code chunk below and summarized in Table \@ref(tab:census-desc).
89-
Further, `mutate()` is used to convert values -1 and -9 (meaning "unknown") to `NA`.
89+
Further, `mutate()` is used to convert values `-1` and `-9` (meaning "unknown") to `NA`.
9090

9191
```{r 14-location-4}
9292
# pop = population, hh_size = household size
@@ -131,11 +131,10 @@ tab = dplyr::tribble(
131131
# summary(input_factor)
132132
cap = paste("Categories for each variable in census data from",
133133
"Datensatzbeschreibung...xlsx",
134-
"located in the downloaded file census.zip (see Figure",
135-
"\\@ref(fig:census-stack) for their spatial distribution).")
134+
"located in the downloaded file census.zip.")
136135
knitr::kable(tab,
137-
col.names = c("Class", "Population", "% female", "Mean age",
138-
"Household size"),
136+
col.names = c("Class", "Population", "% Female", "Mean Age",
137+
"Household Size"),
139138
caption = cap,
140139
caption.short = "Categories for each variable in census data.",
141140
align = "c", booktabs = TRUE)
@@ -156,12 +155,12 @@ input_ras
156155
```
157156

158157
```{block2 14-location-7, type='rmdnote'}
159-
Note that we are using an equal-area projection (EPSG:3035; Lambert Equal Area Europe), i.e., a projected CRS\index{CRS!projected} where each grid cell has the same area, here 1000 x 1000 square meters.
158+
Note that we are using an equal-area projection (EPSG:3035; Lambert Equal Area Europe), i.e., a projected CRS\index{CRS!projected} where each grid cell has the same area, here 1000 * 1000 square meters.
160159
Since we are using mainly densities such as the number of inhabitants or the portion of women per grid cell, it is of utmost importance that the area of each grid cell is the same to avoid 'comparing apples and oranges'.
161-
Be careful with geographic CRS\index{CRS!geographic} where grid cell areas constantly decrease in poleward directions (see also Section \@ref(crs-intro) and Chapter \@ref(reproj-geo-data)).
160+
Be careful with geographic CRS\index{CRS!geographic} where grid cell areas constantly decrease in poleward directions (see also Section \@ref(crs-intro) and Chapter \@ref(reproj-geo-data)).
162161
```
163162

164-
```{r census-stack, echo=FALSE, fig.cap="Gridded German census data of 2011 (see Table \\@ref(tab:census-desc) for a description of the classes).", fig.scap="Gridded German census data."}
163+
```{r census-stack, echo=FALSE, fig.cap="Gridded German census data of 2011 (see Table 14.1 for a description of the classes).", fig.scap="Gridded German census data."}
165164
knitr::include_graphics("images/14_census_stack.png")
166165
```
167166

@@ -172,9 +171,9 @@ A cell value of 8000 inhabitants was chosen for 'class 6' because these cells co
172171
Of course, these are approximations of the true population, not precise values.^[
173172
The potential error introduced during this reclassification stage will be explored in the exercises.
174173
]
175-
However, the level of detail is sufficient to delineate metropolitan areas (see next section).
174+
However, the level of detail is sufficient to delineate metropolitan areas (see Section \@ref(define-metropolitan-areas)).
176175

177-
In contrast to the `pop` variable, representing absolute estimates of the total population, the remaining variables were re-classified as weights corresponding with weights used in the survey.
176+
In contrast to the `pop` variable, representing absolute estimates of the total population, the remaining variables were reclassified as weights corresponding with weights used in the survey.
178177
Class 1 in the variable `women`, for instance, represents areas in which 0 to 40% of the population is female;
179178
these are reclassified with a comparatively high weight of 3 because the target demographic is predominantly male.
180179
Similarly, the classes containing the youngest people and highest proportion of single households are reclassified to have high weights.
@@ -216,7 +215,7 @@ reclass # full output not shown
216215

217216
We deliberately define metropolitan areas as pixels of 20 km^2^ inhabited by more than 500,000 people.
218217
Pixels at this coarse resolution can rapidly be created using `aggregate()`\index{aggregation}, as introduced in Section \@ref(aggregation-and-disaggregation).
219-
The command below uses the argument `fact = 20` to reduce the resolution of the result twenty-fold (recall the original raster resolution was 1 km^2^).
218+
The command below uses the argument `fact = 20` to reduce the resolution of the result 20-fold (recall the original raster resolution was 1 km^2^).
220219

221220
```{r 14-location-11, warning=FALSE, cache=TRUE, cache.lazy=FALSE}
222221
pop_agg = aggregate(reclass$pop, fact = 20, fun = sum, na.rm = TRUE)
@@ -266,21 +265,21 @@ To make sure that the reader uses the exact same results, we have put them into
266265

267266
```{r metro-names, echo=FALSE}
268267
data("metro_names", package = "spDataLarge")
269-
knitr::kable(select(metro_names, city, state),
268+
knitr::kable(select(metro_names, City = city, State = state),
270269
caption = "Result of the reverse geocoding.",
271270
caption.short = "Result of the reverse geocoding.",
272271
booktabs = TRUE)
273272
```
274273

275-
Overall, we are satisfied with the `city` column serving as metropolitan names (Table \@ref(tab:metro-names)) apart from one exception, namely Velbert which belongs to the greater region of Düsseldorf.
274+
Overall, we are satisfied with the `City` column serving as metropolitan names (Table \@ref(tab:metro-names)) apart from one exception, namely Velbert which belongs to the greater region of Düsseldorf.
276275
Hence, we replace Velbert with Düsseldorf (Figure \@ref(fig:metro-areas)).
277276
Umlauts like `ü` might lead to trouble further on, for example when determining the bounding box of a metropolitan area with `opq()` (see further below), which is why we avoid them.
278277

279278
```{r 14-location-19}
280279
metro_names = metro_names$city |>
281280
as.character() |>
282-
{\(x) ifelse(x == "Velbert", "Düsseldorf", x)}() |>
283-
{\(x) gsub("ü", "ue", x)}()
281+
(\(x) ifelse(x == "Velbert", "Düsseldorf", x))() |>
282+
gsub("ü", "ue", x = _)
284283
```
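
The two pipe idioms this hunk switches to can be tried in isolation. The following is only a sketch under stated assumptions: `cities` is a hypothetical stand-in for `metro_names$city`, the `\(x)` lambda needs R 4.1 or later, and the `_` placeholder needs R 4.2 or later and must be passed to a named argument.

```r
# Hypothetical input standing in for metro_names$city (not the real data)
cities = c("Velbert", "München", "Hamburg")

cleaned = cities |>
  as.character() |>
  # anonymous function: wrap it in parentheses, then call it
  (\(x) ifelse(x == "Velbert", "Düsseldorf", x))() |>
  # native pipe placeholder: `_` is only allowed as a named argument
  gsub("ü", "ue", x = _)

print(cleaned)
```

The placeholder form avoids the extra lambda whenever the piped value is not the first argument of the function being called, which is the case for `gsub()`.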
285284

286285
## Points of interest
@@ -296,9 +295,9 @@ The subsequent code chunk does this using a number of functions including:
296295
- `while()`\index{loop!while}, which tries two more times to download the data if the download failed the first time^[The OSM-download will sometimes fail at the first attempt.
297296
]
298297

299-
Before running this code: please consider it will download almost 2GB of data.
298+
Before running this code, please consider it will download almost two GB of data.
300299
To save time and resources, we have put the output named `shops` into **spDataLarge**.
301-
To make it available in your environment run `data("shops", package = "spDataLarge")`.
300+
To make it available in your environment, run `data("shops", package = "spDataLarge")`.
302301

303302
```{r 14-location-20, eval=FALSE, message=FALSE}
304303
shops = purrr::map(metro_names, function(x) {
@@ -332,7 +331,7 @@ if (any(ind)) {
332331
}
333332
```
334333

335-
To make sure that each list element (an `sf`\index{sf} data frame) comes with the same columns^[This is not a given since OSM contributors are not equally meticulous when collecting data.] we only keep the `osm_id` and the `shop` columns with the help of the `map_dfr` loop which additionally combines all shops into one large `sf`\index{sf} object.
334+
To make sure that each list element (an `sf`\index{sf} data frame) comes with the same columns^[This is not a given since OSM contributors are not equally meticulous when collecting data.], we only keep the `osm_id` and the `shop` columns with the help of the `map_dfr` loop which additionally combines all shops into one large `sf`\index{sf} object.
336335

337336
```{r 14-location-22, eval=FALSE}
338337
# select only specific columns
@@ -381,11 +380,10 @@ poi = classify(poi, rcl = rcl_poi, right = NA)
381380
names(poi) = "poi"
382381
```
383382

384-
## Identifying suitable locations
383+
## Identify suitable locations
385384

386385
The only steps that remain before combining all the layers are to add `poi` to the `reclass` raster stack and remove the population layer from it.
387-
The reasoning for the latter is twofold.
388-
First of all, we have already delineated metropolitan areas, that is areas where the population density is above average compared to the rest of Germany.
386+
The reasoning for the latter is: First of all, we have already delineated metropolitan areas, that is areas where the population density is above average compared to the rest of Germany.
389387
Second, though it is advantageous to have many potential customers within a specific catchment area\index{catchment area}, the sheer number alone might not actually represent the desired target group.
390388
For instance, residential tower blocks are areas with a high population density but not necessarily with a high purchasing power for expensive cycle components.
391389

@@ -430,14 +428,14 @@ if (knitr::is_latex_output()) {
430428

431429
The presented approach is a typical example of the normative usage of a GIS\index{GIS} [@longley_geographic_2015].
432430
We combined survey data with expert-based knowledge and assumptions (definition of metropolitan areas, defining class intervals, definition of a final score threshold).
433-
This approach is less suitable for scientific research than applied analysis that provides an evidence based indication of areas suitable for bike shops that should be compared with other sources of information.
431+
This approach is less suitable for scientific research than applied analysis that provides an evidence-based indication of areas suitable for bike shops that should be compared with other sources of information.
434432
A number of changes to the approach could improve the analysis:
435433

436434
- We used equal weights when calculating the final scores but other factors, such as the household size, could be as important as the portion of women or the mean age
437435
- We used all points of interest\index{point of interest} but only those related to bike shops, such as do-it-yourself, hardware, bicycle, fishing, hunting, motorcycles, outdoor and sports shops (see the range of shop values available on the [OSM Wiki](https://wiki.openstreetmap.org/wiki/Map_Features#Shop)) may have yielded more refined results
438-
- Data at a higher resolution may improve the output (see exercises)
436+
- Data at a higher resolution may improve the output (see Exercises)
439437
- We have used only a limited set of variables and data from other sources, such as the [INSPIRE geoportal](https://inspire-geoportal.ec.europa.eu/) or data on cycle paths from OpenStreetMap, may enrich the analysis (see also Section \@ref(retrieving-data))
440-
- Interactions remained unconsidered, such as a possible relationships between the portion of men and single households
438+
- Interactions remained unconsidered, such as a possible relationship between the portion of men and single households
441439

442440
In short, the analysis could be extended in multiple directions.
443441
Nevertheless, it should have given you a first impression and understanding of how to obtain and deal with spatial data in R\index{R} within a geomarketing\index{geomarketing} context.
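
As a rough illustration of the first suggestion in the list above (unequal weights), the sketch below compares an equal-weight score with a weighted one. Everything in it is an assumption made for illustration: the `layers` values are invented rather than taken from the census rasters, and the weights in `w` are arbitrary.

```r
# Hypothetical reclassified weights for four grid cells (not the census data)
layers = data.frame(
  women    = c(3, 2, 1, 3),
  mean_age = c(3, 3, 1, 0),
  hh_size  = c(3, 1, 2, 0),
  poi      = c(2, 3, 0, 1)
)

# Equal weights: plain sum across all layers
score_equal = rowSums(layers)

# Unequal weights (assumed values); they sum to the number of layers,
# so both scores stay on a comparable scale
w = c(women = 1, mean_age = 1, hh_size = 1.5, poi = 0.5)
score_weighted = as.vector(as.matrix(layers) %*% w[names(layers)])

print(data.frame(score_equal, score_weighted))
```

With raster layers the same idea would be a weighted sum of the layers instead of a plain sum, so any fixed score threshold would need to be revisited after changing the weights.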

_14-ex.Rmd

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -13,8 +13,8 @@ library(spDataLarge)
1313

1414
E1. Download the csv file containing inhabitant information for a 100 m cell resolution (https://www.zensus2011.de/SharedDocs/Downloads/DE/Pressemitteilung/DemografischeGrunddaten/csv_Bevoelkerung_100m_Gitter.zip?__blob=publicationFile&v=3).
1515
Please note that the unzipped file has a size of 1.23 GB.
16-
To read it into R you can use `readr::read_csv`.
17-
This takes 30 seconds on a machine with 16 GB RAM.
16+
To read it into R, you can use `readr::read_csv`.
17+
This takes 30 seconds on a machine with 16-GB RAM.
1818
`data.table::fread()` might be even faster, and returns an object of class `data.table()`.
1919
Use `dplyr::as_tibble()` to convert it into a tibble.
2020
Build an inhabitant raster, aggregate it to a cell resolution of 1 km, and compare the difference with the inhabitant raster (`inh`) we have created using class mean values.
