14-location.Rmd
Lines changed: 24 additions & 26 deletions
@@ -27,7 +27,7 @@ This chapter demonstrates how the skills learned in Parts I and II can be applie
27
27
This is a broad field of research and commercial application.
28
28
A typical example of geomarketing is where to locate a new shop.
29
29
The aim here is to attract most visitors and, ultimately, make the most profit.
30
-
There are also many non-commercial applications that can use the technique for public benefit, for example where to locate new health services [@tomintz_geography_2008].
30
+
There are also many non-commercial applications that can use the technique for public benefit, for example, where to locate new health services [@tomintz_geography_2008].
31
31
32
32
People are fundamental to location analysis\index{location analysis}, in particular where they are likely to spend their time and other resources.
33
33
Interestingly, ecological concepts and models are quite similar to those used for store location analysis.
@@ -44,7 +44,7 @@ Typical research questions include:
44
44
- Do existing services over- or under-utilize the market potential?
45
45
- What is the market share of a company in a specific area?
46
46
47
-
This chapter demonstrates how geocomputation can answer such questions based on a hypothetical case study based on real data.
47
+
This chapter demonstrates how geocomputation can answer such questions based on a hypothetical case study and real data.
48
48
49
49
## Case study: bike shops in Germany {#case-study}
The `census_de` object is a data frame containing 13 variables for more than 360,000 grid cells across Germany.
87
87
For our work, we only need a subset of these: Easting (`x`) and Northing (`y`), number of inhabitants (population; `pop`), mean age (`mean_age`), proportion of women (`women`) and average household size (`hh_size`).
88
88
These variables are selected and renamed from German into English in the code chunk below and summarized in Table \@ref(tab:census-desc).
89
-
Further, `mutate()` is used to convert values -1 and -9 (meaning "unknown") to `NA`.
89
+
Further, `mutate()` is used to convert values `-1` and `-9` (meaning "unknown") to `NA`.
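
As a dependency-free illustration of this recoding step, the sketch below applies the same rule in base R to a small made-up data frame. The object names and values are invented for the example; the chapter's own chunk performs the conversion with `mutate()`.

```r
# Toy stand-in for a few census columns (values are invented for illustration)
census_toy = data.frame(
  pop = c(3, -1, 5),
  women = c(2, 4, -9),
  mean_age = c(-9, 3, 1)
)
# Codes that mean "unknown" in the census data
unknown_codes = c(-1, -9)
# Recode the unknown codes to NA in every column
census_clean = as.data.frame(
  lapply(census_toy, function(col) replace(col, col %in% unknown_codes, NA))
)
census_clean
```

Recoding before any summary matters because `-1` and `-9` would otherwise be treated as valid numeric values and distort means and class counts.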
90
90
91
91
```{r 14-location-4}
92
92
# pop = population, hh_size = household size
@@ -131,11 +131,10 @@ tab = dplyr::tribble(
131
131
# summary(input_factor)
132
132
cap = paste("Categories for each variable in census data from",
133
133
"Datensatzbeschreibung...xlsx",
134
-
"located in the downloaded file census.zip (see Figure",
135
-
"\\@ref(fig:census-stack) for their spatial distribution).")
caption.short = "Categories for each variable in census data.",
141
140
align = "c", booktabs = TRUE)
@@ -156,12 +155,12 @@ input_ras
156
155
```
157
156
158
157
```{block2 14-location-7, type='rmdnote'}
159
-
Note that we are using an equal-area projection (EPSG:3035; Lambert Equal Area Europe), i.e., a projected CRS\index{CRS!projected} where each grid cell has the same area, here 1000 x 1000 square meters.
158
+
Note that we are using an equal-area projection (EPSG:3035; Lambert Equal Area Europe), i.e., a projected CRS\index{CRS!projected} where each grid cell has the same area, here 1000 * 1000 square meters.
160
159
Since we are using mainly densities such as the number of inhabitants or the portion of women per grid cell, it is of utmost importance that the area of each grid cell is the same to avoid 'comparing apples and oranges'.
161
-
Be careful with geographic CRS\index{CRS!geographic} where grid cell areas constantly decrease in poleward directions (see also Section \@ref(crs-intro) and Chapter \@ref(reproj-geo-data)).
160
+
Be careful with geographic CRS\index{CRS!geographic} where grid cell areas constantly decrease in poleward directions (see also Section \@ref(crs-intro) and Chapter \@ref(reproj-geo-data)).
162
161
```
163
162
164
-
```{r census-stack, echo=FALSE, fig.cap="Gridded German census data of 2011 (see Table \\@ref(tab:census-desc) for a description of the classes).", fig.scap="Gridded German census data."}
163
+
```{r census-stack, echo=FALSE, fig.cap="Gridded German census data of 2011 (see Table 14.1 for a description of the classes).", fig.scap="Gridded German census data."}
@@ -172,9 +171,9 @@ A cell value of 8000 inhabitants was chosen for 'class 6' because these cells co
172
171
Of course, these are approximations of the true population, not precise values.^[
173
172
The potential error introduced during this reclassification stage will be explored in the exercises.
174
173
]
175
-
However, the level of detail is sufficient to delineate metropolitan areas (see next section).
174
+
However, the level of detail is sufficient to delineate metropolitan areas (see Section \@ref(define-metropolitan-areas)).
176
175
177
-
In contrast to the `pop` variable, representing absolute estimates of the total population, the remaining variables were re-classified as weights corresponding with weights used in the survey.
176
+
In contrast to the `pop` variable, representing absolute estimates of the total population, the remaining variables were reclassified as weights corresponding with weights used in the survey.
178
177
Class 1 in the variable `women`, for instance, represents areas in which 0 to 40% of the population is female;
179
178
these are reclassified with a comparatively high weight of 3 because the target demographic is predominantly male.
180
179
Similarly, the classes containing the youngest people and highest proportion of single households are reclassified to have high weights.
@@ -216,7 +215,7 @@ reclass # full output not shown
216
215
217
216
We deliberately define metropolitan areas as pixels of 20 km by 20 km (400 km^2^) inhabited by more than 500,000 people.
218
217
Pixels at this coarse resolution can rapidly be created using `aggregate()`\index{aggregation}, as introduced in Section \@ref(aggregation-and-disaggregation).
219
-
The command below uses the argument `fact = 20` to reduce the resolution of the result twenty-fold (recall the original raster resolution was 1 km^2^).
218
+
The command below uses the argument `fact = 20` to reduce the resolution of the result 20-fold (recall the original raster resolution was 1 km^2^).
@@ -266,21 +265,21 @@ To make sure that the reader uses the exact same results, we have put them into
266
265
267
266
```{r metro-names, echo=FALSE}
268
267
data("metro_names", package = "spDataLarge")
269
-
knitr::kable(select(metro_names, city, state),
268
+
knitr::kable(select(metro_names, City = city, State = state),
270
269
caption = "Result of the reverse geocoding.",
271
270
caption.short = "Result of the reverse geocoding.",
272
271
booktabs = TRUE)
273
272
```
274
273
275
-
Overall, we are satisfied with the `city` column serving as metropolitan names (Table \@ref(tab:metro-names)) apart from one exception, namely Velbert which belongs to the greater region of Düsseldorf.
274
+
Overall, we are satisfied with the `City` column serving as metropolitan names (Table \@ref(tab:metro-names)) apart from one exception, namely Velbert which belongs to the greater region of Düsseldorf.
276
275
Hence, we replace Velbert with Düsseldorf (Figure \@ref(fig:metro-areas)).
277
276
Umlauts like `ü` might lead to trouble further on, for example when determining the bounding box of a metropolitan area with `opq()` (see further below), which is why we avoid them.
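
The renaming and umlaut handling can be sketched on a plain character vector. The vector below is a hypothetical stand-in for the `city` column, and replacing `ü` with `ue` is one common transliteration that we assume here, not necessarily the only sensible choice.

```r
# Hypothetical vector standing in for the `city` column of `metro_names`
city = c("Hamburg", "Velbert", "D\u00fcsseldorf", "M\u00fcnchen")
# Velbert belongs to the greater region of Duesseldorf, so rename it
city = ifelse(city == "Velbert", "D\u00fcsseldorf", city)
# Transliterate the umlaut so that later queries receive plain ASCII names
city_ascii = gsub("\u00fc", "ue", city, fixed = TRUE)
city_ascii
```

Writing the umlaut as the escape `\u00fc` keeps the script itself ASCII-only, which avoids encoding surprises when the file is moved between systems.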
@@ -296,9 +295,9 @@ The subsequent code chunk does this using a number of functions including:
296
295
- `while()`\index{loop!while}, which tries two more times to download the data if the download failed the first time^[The OSM-download will sometimes fail at the first attempt.
297
296
]
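
The retry logic described in the list above can be sketched in isolation. The function `flaky_download()` is a hypothetical stand-in that fails twice before succeeding; it is not part of any package, and the real chunk queries OSM instead.

```r
# Hypothetical stand-in for a download that fails on its first two calls
attempts_made = 0
flaky_download = function() {
  attempts_made <<- attempts_made + 1
  if (attempts_made < 3) stop("download failed")
  "shop data"
}

result = NULL
tries = 0
max_tries = 3  # the first attempt plus two retries
while (is.null(result) && tries < max_tries) {
  tries = tries + 1
  # try() returns an object of class "try-error" instead of aborting the loop
  attempt = try(flaky_download(), silent = TRUE)
  if (!inherits(attempt, "try-error")) result = attempt
}
```

Capping the number of attempts keeps the loop from running forever when the service is down; if `result` is still `NULL` afterwards, the download failed on every attempt.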
298
297
299
-
Before running this code: please consider it will download almost 2GB of data.
298
+
Before running this code, please consider it will download almost two GB of data.
300
299
To save time and resources, we have put the output named `shops` into **spDataLarge**.
301
-
To make it available in your environment run `data("shops", package = "spDataLarge")`.
300
+
To make it available in your environment, run `data("shops", package = "spDataLarge")`.
302
301
303
302
```{r 14-location-20, eval=FALSE, message=FALSE}
304
303
shops = purrr::map(metro_names, function(x) {
@@ -332,7 +331,7 @@ if (any(ind)) {
332
331
}
333
332
```
334
333
335
-
To make sure that each list element (an `sf`\index{sf} data frame) comes with the same columns^[This is not a given since OSM contributors are not equally meticulous when collecting data.] we only keep the `osm_id` and the `shop` columns with the help of the `map_dfr` loop which additionally combines all shops into one large `sf`\index{sf} object.
334
+
To make sure that each list element (an `sf`\index{sf} data frame) comes with the same columns^[This is not a given since OSM contributors are not equally meticulous when collecting data.], we only keep the `osm_id` and the `shop` columns with the help of the `map_dfr` loop which additionally combines all shops into one large `sf`\index{sf} object.
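
To show the idea without the spatial machinery, the sketch below harmonizes and combines plain data frames in base R. The two tables are invented, and `lapply()` plus `rbind()` only mimic what the `map_dfr` loop does for `sf` objects.

```r
# Two made-up tables standing in for the per-city OSM results;
# the second one carries an extra column, as OSM downloads sometimes do
shops_list = list(
  data.frame(osm_id = c("1", "2"), shop = c("bicycle", "bakery")),
  data.frame(osm_id = "3", shop = "sports", name = "Example")
)
keep = c("osm_id", "shop")
# Keep the shared columns in each element, then row-bind all elements
shops_all = do.call(rbind, lapply(shops_list, function(x) x[, keep]))
shops_all
```

Selecting the columns first is what makes the row-binding safe: `rbind()` on data frames stops with an error when the column sets differ.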
336
335
337
336
```{r 14-location-22, eval=FALSE}
338
337
# select only specific columns
@@ -381,11 +380,10 @@ poi = classify(poi, rcl = rcl_poi, right = NA)
381
380
names(poi) = "poi"
382
381
```
383
382
384
-
## Identifying suitable locations
383
+
## Identify suitable locations
385
384
386
385
The only steps that remain before combining all the layers are to add `poi` to the `reclass` raster stack and remove the population layer from it.
387
-
The reasoning for the latter is twofold.
388
-
First of all, we have already delineated metropolitan areas, that is areas where the population density is above average compared to the rest of Germany.
386
+
The reasoning for the latter is: First of all, we have already delineated metropolitan areas, that is areas where the population density is above average compared to the rest of Germany.
389
387
Second, though it is advantageous to have many potential customers within a specific catchment area\index{catchment area}, the sheer number alone might not actually represent the desired target group.
390
388
For instance, residential tower blocks are areas with a high population density but not necessarily with a high purchasing power for expensive cycle components.
391
389
@@ -430,14 +428,14 @@ if (knitr::is_latex_output()) {
430
428
431
429
The presented approach is a typical example of the normative usage of a GIS\index{GIS} [@longley_geographic_2015].
432
430
We combined survey data with expert-based knowledge and assumptions (definition of metropolitan areas, defining class intervals, definition of a final score threshold).
433
-
This approach is less suitable for scientific research than applied analysis that provides an evidencebased indication of areas suitable for bike shops that should be compared with other sources of information.
431
+
This approach is less suitable for scientific research than applied analysis that provides an evidence-based indication of areas suitable for bike shops that should be compared with other sources of information.
434
432
A number of changes to the approach could improve the analysis:
435
433
436
434
- We used equal weights when calculating the final scores but other factors, such as the household size, could be as important as the portion of women or the mean age
437
435
- We used all points of interest\index{point of interest} but only those related to bike shops, such as do-it-yourself, hardware, bicycle, fishing, hunting, motorcycles, outdoor and sports shops (see the range of shop values available on the [OSM Wiki](https://wiki.openstreetmap.org/wiki/Map_Features#Shop)) may have yielded more refined results
438
-
- Data at a higher resolution may improve the output (see exercises)
436
+
- Data at a higher resolution may improve the output (see Exercises)
439
437
- We have used only a limited set of variables and data from other sources, such as the [INSPIRE geoportal](https://inspire-geoportal.ec.europa.eu/) or data on cycle paths from OpenStreetMap, may enrich the analysis (see also Section \@ref(retrieving-data))
440
-
- Interactions remained unconsidered, such as a possible relationships between the portion of men and single households
438
+
- Interactions remained unconsidered, such as a possible relationship between the portion of men and single households
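
The first point in the list above, unequal weights, can be illustrated with small matrices standing in for raster layers. This is a sketch that assumes the final score is a cell-wise sum of the layers; all values and weights are hypothetical.

```r
# Invented 2 x 2 layers of reclassified scores (not the chapter's data)
women = matrix(c(3, 2, 1, 3), nrow = 2)
mean_age = matrix(c(1, 3, 2, 2), nrow = 2)
hh_size = matrix(c(2, 2, 3, 1), nrow = 2)

# Equal weights: a plain cell-wise sum
score_equal = women + mean_age + hh_size

# Hypothetical unequal weights that sum to 3, so both scores share a scale
w = c(women = 0.75, mean_age = 0.75, hh_size = 1.5)
score_weighted = w[["women"]] * women +
  w[["mean_age"]] * mean_age +
  w[["hh_size"]] * hh_size
```

Comparing `score_equal` with `score_weighted` cell by cell shows how the ranking of locations can shift once one factor is given more influence.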
441
439
442
440
In short, the analysis could be extended in multiple directions.
443
441
Nevertheless, it should have given you a first impression and understanding of how to obtain and deal with spatial data in R\index{R} within a geomarketing\index{geomarketing} context.
_14-ex.Rmd
Lines changed: 2 additions & 2 deletions
@@ -13,8 +13,8 @@ library(spDataLarge)
13
13
14
14
E1. Download the csv file containing inhabitant information for a 100 m cell resolution (https://www.zensus2011.de/SharedDocs/Downloads/DE/Pressemitteilung/DemografischeGrunddaten/csv_Bevoelkerung_100m_Gitter.zip?__blob=publicationFile&v=3).
15
15
Please note that the unzipped file has a size of 1.23 GB.
16
-
To read it into R you can use `readr::read_csv`.
17
-
This takes 30 seconds on a machine with 16GB RAM.
16
+
To read it into R, you can use `readr::read_csv`.
17
+
This takes 30 seconds on a machine with 16-GB RAM.
18
18
`data.table::fread()` might be even faster, and returns an object of class `data.table`.
19
19
Use `dplyr::as_tibble()` to convert it into a tibble.
20
20
Build an inhabitant raster, aggregate it to a cell resolution of 1 km, and compare the difference with the inhabitant raster (`inh`) we have created using class mean values.