From 9326ce554f4bef6da4e21fd34d07039b1227b9df Mon Sep 17 00:00:00 2001 From: robinlovelace Date: Sun, 29 Sep 2024 09:01:12 +0100 Subject: [PATCH 01/14] 10th -> tenth --- 13-transport.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/13-transport.Rmd b/13-transport.Rmd index cc70aea64..2c082cb2e 100644 --- a/13-transport.Rmd +++ b/13-transport.Rmd @@ -97,7 +97,7 @@ tm_shape(region_all[1, ], bbox = region_all) + knitr::include_graphics("images/13_bristol.png") ``` -Bristol is the 10^th^ largest city council in England, with a population of half a million people, although its travel catchment area\index{catchment area} is larger (see Section \@ref(transport-zones)). +Bristol is the tenth largest city council in England, with a population of half a million people, although its travel catchment area\index{catchment area} is larger (see Section \@ref(transport-zones)). It has a vibrant economy with aerospace, media, financial service and tourism companies, alongside two major universities. Bristol shows a high average income per person but also contains areas of severe deprivation [@bristol_city_council_deprivation_2015]. 
From 8f9fbc4a9ba57211330e847848d455eb28d61458 Mon Sep 17 00:00:00 2001 From: robinlovelace Date: Sun, 29 Sep 2024 09:08:20 +0100 Subject: [PATCH 02/14] c13 intro proofing --- 03-attribute-operations.Rmd | 2 +- 05-geometry-operations.Rmd | 2 +- 10-gis.Rmd | 2 +- 13-transport.Rmd | 17 +++++++++-------- code/02-sfheaders.R | 2 +- code/chapters/03-attribute-operations.R | 2 +- code/chapters/05-geometry-operations.R | 2 +- 7 files changed, 15 insertions(+), 14 deletions(-) diff --git a/03-attribute-operations.Rmd b/03-attribute-operations.Rmd index 788504e41..354070e4d 100644 --- a/03-attribute-operations.Rmd +++ b/03-attribute-operations.Rmd @@ -197,7 +197,7 @@ Base R functions are mature, stable and widely used, making them a rock solid ch Key functions for subsetting data frames (including `sf` data frames) with **dplyr** functions are demonstrated below. ```{r, echo=FALSE, eval=FALSE} -# Aim: benchmark base vs dplyr subsetting +# Aim: benchmark base vs. dplyr subsetting # Could move elsewhere? i = sample(nrow(world), size = 10) benchmark_subset = bench::mark( diff --git a/05-geometry-operations.Rmd b/05-geometry-operations.Rmd index c95e854ec..4c288021b 100644 --- a/05-geometry-operations.Rmd +++ b/05-geometry-operations.Rmd @@ -153,7 +153,7 @@ nz_pos = st_point_on_surface(nz) seine_pos = st_point_on_surface(seine) ``` -```{r centr, warning=FALSE, echo=FALSE, fig.cap="Centroids (black points) and 'points on surface' (red points) of New Zealand's regions (left) and the Seine (right) datasets.", fig.scap="Centroid vs point on surface operations."} +```{r centr, warning=FALSE, echo=FALSE, fig.cap="Centroids (black points) and 'points on surface' (red points) of New Zealand's regions (left) and the Seine (right) datasets.", fig.scap="Centroid vs. 
point on surface operations."} p_centr1 = tm_shape(nz) + tm_polygons(col = "gray80", fill = "gray90") + tm_shape(nz_centroid) + tm_symbols(shape = 1, col = "black", size = 0.5) + tm_shape(nz_pos) + tm_symbols(shape = 1, col = "red", size = 0.5) + diff --git a/10-gis.Rmd b/10-gis.Rmd index 57203bab3..0734e8b91 100644 --- a/10-gis.Rmd +++ b/10-gis.Rmd @@ -50,7 +50,7 @@ According to the creator of the popular QGIS software [@sherman_desktop_2008]: > With the advent of 'modern' GIS software, most people want to point and click their way through life. That’s good, but there is a tremendous amount of flexibility and power waiting for you with the command line. Many times you can do something on the command line in a fraction of the time you can do it with a GUI. -The 'CLI vs GUI' debate does not have to be adversarial: both ways of working have advantages, depending on a range of factors including the task (with drawing new features being well suited to GUIs), the level of reproducibility desired, and the user's skillset. +The 'CLI vs. GUI' debate does not have to be adversarial: both ways of working have advantages, depending on a range of factors including the task (with drawing new features being well suited to GUIs), the level of reproducibility desired, and the user's skillset. GRASS GIS is a good example of GIS software that is primarily based on a CLI but which also has a prominent GUI. Likewise, while R is focused on its CLI, IDEs such as RStudio provide a GUI for improving accessibility. Software cannot be neatly categorized into CLI or GUI-based. diff --git a/13-transport.Rmd b/13-transport.Rmd index 2c082cb2e..7e47f00bf 100644 --- a/13-transport.Rmd +++ b/13-transport.Rmd @@ -101,8 +101,8 @@ Bristol is the tenth largest city council in England, with a population of half It has a vibrant economy with aerospace, media, financial service and tourism companies, alongside two major universities. 
Bristol shows a high average income per person but also contains areas of severe deprivation [@bristol_city_council_deprivation_2015]. -In terms of transport, Bristol is well served by rail and road links, and has a relatively high level of active travel. -19% of its citizens cycle and 88% walk at least once per month according to the [Active People Survey](https://www.gov.uk/government/statistical-data-sets/how-often-and-time-spent-walking-and-cycling-at-local-authority-level-cw010#table-cw0103) (the national average is 15% and 81%, respectively). +In terms of transport, Bristol is well served by rail and road links, and it has a relatively high level of active travel. +According to the [Active People Survey](https://www.gov.uk/government/statistical-data-sets/how-often-and-time-spent-walking-and-cycling-at-local-authority-level-cw010#table-cw0103), 19% of its citizens cycle and 88% walk at least once per month (the national average is 15% and 81%, respectively). 8% of the population said they cycled to work in the 2011 census, compared with only 3% nationwide. ```{r 13-transport-3, eval=FALSE, echo=FALSE} @@ -118,7 +118,7 @@ View(cw0103) ``` Like many cities, Bristol has major congestion, air quality and physical inactivity problems. -Cycling can tackle all of these issues efficiently: it has a greater potential to replace car trips than walking, with typical [speeds](https://en.wikipedia.org/wiki/Bicycle_performance) of 15-20 km/h vs 4-6 km/h for walking. +Cycling can tackle all of these issues efficiently: it has a greater potential to replace car trips than walking, with typical [speeds](https://en.wikipedia.org/wiki/Bicycle_performance) of 15-20 km/h vs. 4-6 km/h for walking. For this reason, Bristol's [Transport Strategy](https://www.bristol.gov.uk/council-and-mayor/policies-plans-and-strategies/bristol-transport-strategy) has ambitious plans for cycling. 
To highlight the importance of policy considerations in transportation research, this chapter is guided by the need to provide evidence for people (transport planners, politicians and other stakeholders) tasked with getting people out of cars and onto more sustainable modes --- walking and cycling in particular. @@ -151,7 +151,7 @@ See the inner blue boundary in Figure \@ref(fig:bristol): there are a couple of - The first boundary returned by OSM may not be the official boundary used by local authorities - Even if OSM returns the official boundary, this may be inappropriate for transport research because they bear little relation to where people travel -Travel to Work Areas (TTWAs) address these issues by creating a zoning system analogous to hydrological watersheds. +Travel To Work Areas (TTWAs) address these issues by creating a zoning system analogous to hydrological watersheds. TTWAs were first defined as contiguous zones within which 75% of the population travels to work [@coombes_efficient_1986], and this is the definition used in this chapter. Because Bristol is a major employer attracting travel from surrounding towns, its TTWA is substantially larger than the city bounds (see Figure \@ref(fig:bristol)). The polygon representing this transport-orientated boundary is stored in the object `bristol_ttwa`, provided by the **spDataLarge** package loaded at the beginning of this chapter. @@ -160,7 +160,8 @@ The origin and destination zones used in this chapter are the same: officially d Each houses around 8,000 people. Such administrative zones can provide vital context to transport analysis, such as the type of people who might benefit most from particular interventions (e.g., @moreno-monroy_public_2017). 
-The geographic resolution of these zones is important: small zones with high geographic resolution are usually preferable but their high number in large regions can have consequences for processing (especially for origin-destination analysis in which the number of possibilities increases as a non-linear function of the number of zones) [@hollander_transport_2016]. +The geographic resolution of these zones is important: small zones with high geographic resolution are usually preferable, but their high number in large regions can have consequences for processing. +This is especially true for origin-destination (OD) analysis in which the number of possibilities increases as a non-linear function of the number of zones [@hollander_transport_2016]. ```{block 13-transport-4, type='rmdnote'} Another issue with small zones is related to anonymity rules. @@ -179,7 +180,7 @@ names(bristol_zones) To add travel data, we will perform an *attribute join*\index{attribute!join}, a common task described in Section \@ref(vector-attribute-joining). We will use travel data from the UK's 2011 census question on travel to work, data stored in `bristol_od`, which was provided by the [ons.gov.uk](https://www.ons.gov.uk/help/localstatistics) data portal. -`bristol_od` is an origin-destination (OD) dataset on travel to work between zones from the UK's 2011 Census (see Section \@ref(desire-lines)). +`bristol_od` is an OD dataset on travel to work between zones from the UK's 2011 Census (see Section \@ref(desire-lines)). The first column is the ID of the zone of origin and the second column is the zone of destination. 
`bristol_od` has more rows than `bristol_zones`, representing travel *between* zones rather than the zones themselves: @@ -188,7 +189,7 @@ nrow(bristol_od) nrow(bristol_zones) ``` -The results of the previous code chunk shows that there are more than 10 OD pairs for every zone, meaning we will need to aggregate the origin-destination data before it is joined with `bristol_zones`, as illustrated below (origin-destination data is described in Section \@ref(desire-lines)). +The results of the previous code chunk shows that there are more than 10 OD pairs for every zone, meaning we will need to aggregate the origin-destination data before it is joined with `bristol_zones`, as illustrated below (OD data is described in Section \@ref(desire-lines)). ```{r 13-transport-7} zones_attr = bristol_od |> @@ -266,7 +267,7 @@ Typically, desire lines are represented geographically as starting and ending in This is the type of desire line that we will create and use in this section, although it is worth being aware of 'jittering' techniques that enable multiple start and end points to increase the spatial coverage and accuracy of analyses building on OD data [@lovelace_jittering_2022b]. We have already loaded data representing desire lines in the dataset `bristol_od`. -This origin-destination (OD) data frame object represents the number of people traveling between the zone represented in `o` and `d`, as illustrated in Table \@ref(tab:od). +This data frame represents the number of people traveling between the zone represented in `o` and `d`, as illustrated in Table \@ref(tab:od). 
To arrange the OD data by all trips and then filter-out only the top 5, type (please refer to Chapter \@ref(attr) for a detailed description of non-spatial attribute operations): ```{r 13-transport-12} diff --git a/code/02-sfheaders.R b/code/02-sfheaders.R index a53368a32..95e333740 100644 --- a/code/02-sfheaders.R +++ b/code/02-sfheaders.R @@ -1,4 +1,4 @@ -# Aim: compare sf vs sfheaders in terms of speed +# Aim: compare sf vs. sfheaders in terms of speed library(spData) library(sf) diff --git a/code/chapters/03-attribute-operations.R b/code/chapters/03-attribute-operations.R index bfe01b73e..e77bf3845 100644 --- a/code/chapters/03-attribute-operations.R +++ b/code/chapters/03-attribute-operations.R @@ -95,7 +95,7 @@ small_countries = world[world$area_km2 < 10000, ] ## ---- echo=FALSE, eval=FALSE------------------------------------------------------------------------ -## # Aim: benchmark base vs dplyr subsetting +## # Aim: benchmark base vs. dplyr subsetting ## # Could move elsewhere? ## i = sample(nrow(world), size = 10) ## benchmark_subset = bench::mark( diff --git a/code/chapters/05-geometry-operations.R b/code/chapters/05-geometry-operations.R index 7e538950c..d67cb12d2 100644 --- a/code/chapters/05-geometry-operations.R +++ b/code/chapters/05-geometry-operations.R @@ -63,7 +63,7 @@ nz_pos = st_point_on_surface(nz) seine_pos = st_point_on_surface(seine) -## ----centr, warning=FALSE, echo=FALSE, fig.cap="Centroids (black points) and 'points on surface' (red points) of New Zealand's regions (left) and the Seine (right) datasets.", fig.scap="Centroid vs point on surface operations."---- +## ----centr, warning=FALSE, echo=FALSE, fig.cap="Centroids (black points) and 'points on surface' (red points) of New Zealand's regions (left) and the Seine (right) datasets.", fig.scap="Centroid vs. 
point on surface operations."---- p_centr1 = tm_shape(nz) + tm_borders() + tm_shape(nz_centroid) + tm_symbols(shape = 1, col = "black", size = 0.5) + tm_shape(nz_pos) + tm_symbols(shape = 1, col = "red", size = 0.5) From bf16455ea91fa7cb26520236ac52822b0d3a5959 Mon Sep 17 00:00:00 2001 From: robinlovelace Date: Sun, 29 Sep 2024 09:15:18 +0100 Subject: [PATCH 03/14] Proofing 13.6 --- 13-transport.Rmd | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/13-transport.Rmd b/13-transport.Rmd index 7e47f00bf..18027fc10 100644 --- a/13-transport.Rmd +++ b/13-transport.Rmd @@ -226,18 +226,18 @@ names(zones_joined) ``` The result is `zones_joined`, which contains new columns representing the total number of trips originating in each zone in the study area (almost 1/4 of a million) and their mode of travel (by bicycle, foot, car and train). -The geographic distribution of trip origins is illustrated in the left-hand map in Figure \@ref(fig:zones). +The geographic distribution of trip origins is illustrated in the left-hand panel in Figure \@ref(fig:zones). This shows that most zones have between 0 and 4,000 trips originating from them in the study area. More trips are made by people living near the center of Bristol and fewer on the outskirts. Why is this? Remember that we are only dealing with trips within the study region: low trip numbers in the outskirts of the region can be explained by the fact that many people in these peripheral zones will travel to other regions outside of the study area. -Trips outside the study region can be included in regional model by a special destination ID covering any trips that go to a zone not represented in the model [@hollander_transport_2016]. +Trips outside the study region can be included in a regional model by a special destination ID covering any trips that go to a zone not represented in the model [@hollander_transport_2016]. 
The data in `bristol_od`, however, simply ignores such trips: it is an 'intra-zonal' model. In the same way that OD datasets can be aggregated to the zone of origin, they can also be aggregated to provide information about destination zones. People tend to gravitate towards central places. This explains why the spatial distribution represented in the right panel in Figure \@ref(fig:zones) is relatively uneven, with the most common destination zones concentrated in Bristol city center. -The result is `zones_od`, which contains a new column reporting the number of trip destinations by any mode, is created as follows: +The result is `zones_od`, which contains a new column reporting the number of trip destinations by any mode, and it is created as follows: ```{r 13-transport-10} zones_destinations = bristol_od |> @@ -254,7 +254,7 @@ qtm(zones_od, c("all", "all_dest")) + tm_layout(panel.labels = c("Origin", "Destination")) ``` -```{r zones, echo=FALSE, fig.cap="Number of trips (commuters) living and working in the region. The left map shows zone of origin of commute trips; the right map shows zone of destination (generated by the script 13-zones.R).", message=FALSE, fig.scap="Number of trips (commuters) living and working in the region."} +```{r zones, echo=FALSE, fig.cap="Number of trips (commuters) living and working in the region. The left map shows zone of origin of commute trips; the right map shows zone of destination (generated by the script `13-zones.R`).", message=FALSE, fig.scap="Number of trips (commuters) living and working in the region."} # file.edit("code/13-zones.R") source("code/13-zones.R", print.eval = TRUE) ``` @@ -286,7 +286,7 @@ od_top5 |> ``` The resulting table provides a snapshot of Bristolian travel patterns in terms of commuting (travel to work). 
-It demonstrates that walking is the most popular mode of transport among the top 5 origin-destination pairs, that zone `E02003043` is a popular destination (Bristol city center, the destination of all the top 5 OD pairs), and that the *intrazonal* trips, from one part of zone `E02003043` to another (first row of Table \@ref(tab:od)), constitute the most traveled OD pair in the dataset. +It demonstrates that walking is the most popular mode of transport among the top 5 OD pairs, that zone `E02003043` is a popular destination (Bristol city center, the destination of all the top 5 OD pairs), and that the *intrazonal* trips, from one part of zone `E02003043` to another (first row of Table \@ref(tab:od)), constitute the most traveled OD pair in the dataset. But from a policy perspective, the raw data presented in Table \@ref(tab:od) is of limited use: aside from the fact that it contains only a tiny portion of the 2,910 OD pairs, it tells us little about *where* policy measures are needed, or *what proportion* of trips are made by walking and cycling. The following command calculates the percentage of each desire line that is made by these active modes: @@ -382,9 +382,9 @@ ncol(desire_rail) ``` As illustrated in Figure \@ref(fig:stations), the initial `desire_rail` lines now have three additional geometry list columns\index{list column} representing travel from home to the origin station, from there to the destination, and finally from the destination station to the destination. -In this case, the destination leg is very short (walking distance) but the origin legs may be sufficiently far to justify investment in cycling infrastructure to encourage people to cycle to the stations on the outward leg of peoples' journey to work in the residential areas surrounding the three origin stations in Figure \@ref(fig:stations). 
+In this case, the destination leg is very short (walking distance), but the origin legs may be sufficiently far to justify investment in cycling infrastructure to encourage people to cycle to the stations on the outward leg of peoples' journey to work in the residential areas surrounding the three origin stations in Figure \@ref(fig:stations). -```{r stations, echo=FALSE, message=FALSE, warning=FALSE, fig.cap="Station nodes (red dots) used as intermediary points that convert straight desire lines with high rail usage (thin green lines) into three legs: to the origin station (orange) via public transport (blue) and to the destination (pink, not visible because it is so short).", fig.scap="Station nodes."} +```{r stations, echo=FALSE, message=FALSE, warning=FALSE, fig.cap="Station nodes (red dots) used as intermediary points that convert straight desire lines with high rail usage (thin green lines) into three legs: to the origin station (orange) via public transport (blue) and to the destination (pink, not visible because it is short).", fig.scap="Station nodes."} # zone_cents = st_centroid(zones_od) zone_cents = st_centroid(zones_od) zone_cents_rail = zone_cents[desire_rail, ] @@ -430,7 +430,7 @@ From a geographical perspective, routes are desire lines\index{desire lines} tha The geometries of routes are typically (but not always) determined by the transport network. While desire lines contain only two vertices (their beginning and end points), routes can contain any number of vertices, representing points between A and B joined by straight lines: the definition of a linestring geometry. -Routes covering large distances or following intricate network can have many hundreds of vertices; routes on grid-based or simplified road networks tend to have fewer. +Routes covering large distances or following intricate networks can have thousands of vertices; routes on grid-based or simplified road networks tend to have fewer. 
Routes are generated from desire lines or, more commonly, matrices containing coordinate pairs representing desire lines. This routing process is done by a range of broadly-defined *routing engines*: software and web services that return geometries and attributes describing how to get from origins to destinations. @@ -443,13 +443,13 @@ Routing engines can be classified based on *where* they run relative to R: Before describing each, it is worth outlining other ways of categorizing routing engines. Routing engines can be multi-modal, meaning that they can calculate trips composed of more than one mode of transport, or not. Multi-modal routing engines can return results consisting of multiple *legs*, each one made by a different mode of transport. -The optimal route from a residential area to a commercial area could involve 1) walking to the nearest bus stop, 2) catching the bus to the nearest node to the destination, and 3) walking to the destination, given a set of input parameters. +The optimal route from a residential area to a commercial area could involve (1) walking to the nearest bus stop, (2) catching the bus to the nearest node to the destination, and (3) walking to the destination, given a set of input parameters. The transition points between these three legs are commonly referred to as 'ingress' and 'egress', meaning getting on/off a public transport vehicle. -Multi-modal routing engines such as R5 are more sophisticated and have larger input data requirements than 'uni-modal' routing engines such as OSRM (described in Section \@ref(localengine)). +Multi-modal routing engines such as R5 are more sophisticated and have larger input data requirements than 'uni-modal' routing engines such as the OpenStreetMap Routing Machine (OSRM), described in Section \@ref(localengine). -A major strength of multi-modal engines is their ability to represent 'transit' (public transport) trips by trains, buses etc. 
+A major strength of multi-modal engines is their ability to represent 'transit' (public transport) trips by trains, buses, etc. Multi-model routing engines require input datasets representing public transport networks, typically in General Transit Feed Specification ([GTFS](https://developers.google.com/transit/gtfs)) files, which can be processed with functions in the [**tidytransit**](https://r-transit.github.io/tidytransit/index.html) and [**gtfstools**](https://ipeagit.github.io/gtfstools/) packages (other packages and tools for working with GTFS files are available). -Single mode routing engines may be sufficient for projects focused on specific (non public) modes of transport. +Single mode routing engines may be sufficient for projects focused on specific (non-public) modes of transport. Another way of classifying routing engines (or settings) is by the geographic level of the outputs: routes, legs and segments. From f4bdc57dd4b46671a8db14f2af334a9de6a36346 Mon Sep 17 00:00:00 2001 From: robinlovelace Date: Sun, 29 Sep 2024 09:19:39 +0100 Subject: [PATCH 04/14] egment level -> egment-level --- 13-transport.Rmd | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/13-transport.Rmd b/13-transport.Rmd index 18027fc10..aa9850d16 100644 --- a/13-transport.Rmd +++ b/13-transport.Rmd @@ -458,15 +458,15 @@ Another way of classifying routing engines (or settings) is by the geographic le Routing engines can generate outputs at three geographic levels of routes, legs and segments: - **Route** level outputs contain a single feature (typically a multilinestring and associated row in the data frame representation) per origin-destination pair, meaning a single row of data per trip -- **Leg** level outputs contain a single feature and associated attributes each *mode* within each origin-destination pair, as described in Section \@ref(nodes). 
For trips only involving one mode (for example driving from home to work, ignoring the short walk to the car) the leg is the same as the route: the car journey. For trips involving public transport, legs provide key information. The **r5r** function `detailed_itineraries()` returns legs which, confusingly, are sometimes referred to as 'segments' -- Segment level outputs provide the most detailed information about routes, with records for each small section of the transport network. Typically segments are similar in length, or identical to, ways in OpenStreetMap. The **cyclestreets** function `journey()` returns data at the segment level which can be aggregated by grouping by origin and destination level data returned by the `route()` function in **stplanr** +- **Leg** level outputs contain a single feature and associated attributes each *mode* within each origin-destination (OD) pair, as described in Section \@ref(nodes). For trips only involving one mode (for example driving, from home to work, ignoring the short walk to the car), the leg is the same as the route: the car journey. For trips involving public transport, legs provide key information. The **r5r** function `detailed_itineraries()` returns legs which, confusingly, are sometimes referred to as 'segments' +- Segment-level outputs provide the most detailed information about routes, with records for each small section of the transport network. Typically segments are similar in length, or identical to, ways in OpenStreetMap. The **cyclestreets** function `journey()` returns data at the segment-level which can be aggregated by grouping by origin- and destination-level data returned by the `route()` function in **stplanr** Most routing engines return route level by default, although multi-modal engines generally provide outputs at the leg level (one feature per continuous movement by a single mode of transport). -Segment level outputs have the advantage of providing more detail. 
+Segment-level outputs have the advantage of providing more detail. The **cyclestreets** package returns multiple 'quietness' levels per route, enabling identification of the 'weakest link' in cycle networks. -Disadvantages of segment level outputs include increased file sizes and complexities associated with the extra detail. +Disadvantages of segment-level outputs include increased file sizes and complexities associated with the extra detail. -Route level results can be converted into segment level results using the function `stplanr::overline()` [@morgan_travel_2020]. +Route level results can be converted into segment-level results using the function `stplanr::overline()` [@morgan_travel_2020]. When working with segment or leg-level data, route-level statistics can be returned by grouping by columns representing trip start and end points and summarizing/aggregating columns containing segment-level data. ### In-memory routing with R {#memengine} @@ -586,7 +586,7 @@ Furthermore, the town is surrounded by large (cycling unfriendly) road structure There are many benefits of converting travel desire lines\index{desire lines} into routes. It is important to remember that we cannot be sure how many (if any) trips will follow the exact routes calculated by routing engines. -However, route and street/way/segment level results can be highly policy relevant. +However, route and street/way/segment-level results can be highly policy relevant. Route segment results can enable the prioritization of investment where it is most needed, according to available data [@lovelace_propensity_2017]. ## Route networks @@ -647,7 +647,7 @@ tm_shape(zones_od) + col = "red") ``` -Transport networks with records at the segment level, typically with attributes such as road type and width, constitute a common type of route network. +Transport networks with records at the segment-level, typically with attributes such as road type and width, constitute a common type of route network. 
Such route network datasets are available worldwide from OpenStreetMap, and can be downloaded with packages such as **osmdata**\index{osmdata (package)} and **osmextract**\index{osmextract (package)}. To save time downloading and preparing OSM\index{OpenStreetMap}, we will use the `bristol_ways` object from the **spDataLarge** package, an `sf` object with LINESTRING geometries and attributes representing a sample of the transport network in the case study region (see `?bristol_ways` for details), as shown in the output below: From 83ab92f23a82fe14fa24fe891859f8e2eb1a194d Mon Sep 17 00:00:00 2001 From: robinlovelace Date: Sun, 29 Sep 2024 19:59:37 +0100 Subject: [PATCH 05/14] Fixes for 13.6.2 --- 13-transport.Rmd | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/13-transport.Rmd b/13-transport.Rmd index aa9850d16..b313dfe63 100644 --- a/13-transport.Rmd +++ b/13-transport.Rmd @@ -461,13 +461,13 @@ Routing engines can generate outputs at three geographic levels of routes, legs - **Leg** level outputs contain a single feature and associated attributes each *mode* within each origin-destination (OD) pair, as described in Section \@ref(nodes). For trips only involving one mode (for example driving, from home to work, ignoring the short walk to the car), the leg is the same as the route: the car journey. For trips involving public transport, legs provide key information. The **r5r** function `detailed_itineraries()` returns legs which, confusingly, are sometimes referred to as 'segments' - Segment-level outputs provide the most detailed information about routes, with records for each small section of the transport network. Typically segments are similar in length, or identical to, ways in OpenStreetMap. 
The **cyclestreets** function `journey()` returns data at the segment-level which can be aggregated by grouping by origin- and destination-level data returned by the `route()` function in **stplanr** -Most routing engines return route level by default, although multi-modal engines generally provide outputs at the leg level (one feature per continuous movement by a single mode of transport). +Most routing engines return route-level by default, although multi-modal engines generally provide outputs at the leg level (one feature per continuous movement by a single mode of transport). Segment-level outputs have the advantage of providing more detail. The **cyclestreets** package returns multiple 'quietness' levels per route, enabling identification of the 'weakest link' in cycle networks. Disadvantages of segment-level outputs include increased file sizes and complexities associated with the extra detail. -Route level results can be converted into segment-level results using the function `stplanr::overline()` [@morgan_travel_2020]. -When working with segment or leg-level data, route-level statistics can be returned by grouping by columns representing trip start and end points and summarizing/aggregating columns containing segment-level data. +Route-level results can be converted into segment-level results using the function `stplanr::overline()` [@morgan_travel_2020]. +When working with segment- or leg-level data, route-level statistics can be returned by grouping by columns representing trip start and end points and summarizing/aggregating columns containing segment-level data. 
### In-memory routing with R {#memengine} @@ -601,7 +601,7 @@ Any transport research that involves route calculation requires a route network However, route networks are also important outputs in many transport research projects: summarizing data such as the potential number of trips made on particular segments and represented as a route network, can help prioritize investment where it is most needed. \index{network} -To demonstrate how to create route networks as an output derived from route level data, imagine a simple scenario of mode shift. +To demonstrate how to create route networks as an output derived from route-level data, imagine a simple scenario of mode shift. Imagine that 50% of car trips between 0 to 3 km in route distance are replaced by cycling, a percentage that drops by 10 percentage points for every additional km of route distance so that 20% of car trips of 6 km are replaced by cycling and no car trips that are 8 km or longer are replaced by cycling. This is of course an unrealistic scenario [@lovelace_propensity_2017], but is a useful starting point. In this case, we can model mode shift from cars to bikes as follows: From a9001afcd67a6cfa27a449480478957dc8265247 Mon Sep 17 00:00:00 2001 From: robinlovelace Date: Sun, 29 Sep 2024 20:08:29 +0100 Subject: [PATCH 06/14] Complete proofing 13.6 --- 13-transport.Rmd | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/13-transport.Rmd b/13-transport.Rmd index b313dfe63..701bdf0af 100644 --- a/13-transport.Rmd +++ b/13-transport.Rmd @@ -472,12 +472,12 @@ When working with segment- or leg-level data, route-level statistics can be retu ### In-memory routing with R {#memengine} Routing engines in R enable route networks stored as R objects *in memory* to be used as the basis of route calculation. 
-Options include [**sfnetworks**](https://luukvdmeer.github.io/sfnetworks/)\index{sfnetworks (package)}, [**dodgr**](https://urbananalyst.github.io/dodgr/) and [**cppRouting**](https://github.com/vlarmet/cppRouting) packages, each of which provide their own class system to represent route networks, the topic of the next section. +Options include [**sfnetworks**](https://luukvdmeer.github.io/sfnetworks/)\index{sfnetworks (package)}, [**dodgr**](https://urbananalyst.github.io/dodgr/) and [**cppRouting**](https://github.com/vlarmet/cppRouting) packages, each of which provide its own class system to represent route networks, the topic of the next section. While fast and flexible, native R routing options are generally harder to set up than dedicated routing engines for realistic route calculation. Routing is a hard problem and many hundreds of hours have been put into open source routing engines that can be downloaded and hosted locally. On the other hand, R-based routing engines may be well suited to model experiments and the statistical analysis of the impacts of changes on the network. -Changing route network characteristics (or weights associated with different route segment types), re-calculating routes, and analyzing results under many scenarios in a single language has benefits for research applications. +Changing route network characteristics (or weights associated with different route segment types), recalculating routes, and analyzing results under many scenarios in a single language have benefits for research applications. ### Locally hosted dedicated routing engines {#localengine} @@ -485,13 +485,13 @@ Changing route network characteristics (or weights associated with different rou These can be accessed from R with the packages **opentripplanner**, [**valhallr**](https://github.com/chris31415926535/valhallr), **r5r** and [**osrm**](https://github.com/riatelab/osrm) [@morgan_opentripplanner_2019; @pereira_r5r_2021]. 
Locally hosted routing engines run on the user's computer but in a process separate from R. They benefit from speed of execution and control over the weighting profile for different modes of transport. -Disadvantages include the difficulty of representing complex networks locally; temporal dynamics (primarily due to traffic); and the need for specialized external software. +Disadvantages include the difficulty of representing complex networks locally, lack of predefined routing profiles, temporal dynamics (e.g. traffic), and the need to install specialized software. ### Remotely hosted dedicated routing engines {#remoteengine} **Remotely hosted**\index{routing} routing engines use a web API\index{API} to send queries about origins and destinations and return results. Routing services based on open source routing engines, such as OSRM's publicly available service, work the same when called from R as locally hosted instances, simply requiring arguments specifying 'base URLs' to be updated. -However, the fact that external routing services are hosted on a dedicated machine (usually funded by commercial company with incentives to generate accurate routes) can give them advantages, including: +However, the fact that external routing services are hosted on a dedicated machine (usually funded by a commercial company with incentives to generate accurate routes) can give them advantages, including: - Provision of routing services worldwide (or usually at least over a large region) - Established routing services are usually updated regularly and can often respond to traffic levels @@ -505,7 +505,7 @@ While R users can access CycleStreets routes via the package [**cyclestreets**]( ### Contraction hierarchies and traffic assigment -Contraction hierarchies and traffic assignment are advanced but important topics in transport modeling worth being aware of, especially if you want your code to scale to large networks. 
+Contraction hierarchies and traffic assignment are advanced but important topics in transport modeling that are worth being aware of, especially if you want your code to scale to large networks. Calculating many routes is computationally resource intensive and can take hours, leading to the development of several algorithms to speed up routing calculations. **Contraction hierarchies** is a well-known algorithm that can lead to a substantial (1000x+ in some cases) speed up in routing tasks, depending on network size. Contraction hierarchies are used behind the scenes in the routing engines mentioned in the previous sections. @@ -518,9 +518,9 @@ This optimization problem can be solved by iterative algorithms which are implem ### Routing: A worked example -Instead of routing\index{routing} *all* desire lines generated in Section \@ref(desire-lines), we focus on a subset that is highly policy relevant. +Instead of routing\index{routing} *all* desire lines generated in Section \@ref(desire-lines), we focus on a subset that is highly policy-relevant. Running a computationally intensive operation on a subset before trying to process the whole dataset is often sensible, and this applies to routing. -Routing can be time and memory-consuming, resulting in large objects, due to the detailed geometries and extra attributes of route objects. +Routing can be time- and memory-consuming, resulting in large objects, due to the detailed geometries and extra attributes of route objects. We will therefore filter the desire lines before calculating routes in this section. Cycling is most beneficial when it replaces car trips. @@ -533,7 +533,7 @@ desire_lines_short = desire_lines |> filter(car_driver >= 100, distance_km <= 5, distance_km >= 2.5) ``` -In the code above `st_length()` calculated the length of each desire line, as described in Section \@ref(distance-relations). 
+In the code above, `st_length()` calculated the length of each desire line, as described in Section \@ref(distance-relations). The `filter()` function from **dplyr** filtered the `desire_lines` dataset based on the criteria outlined above\index{filter operation|see{attribute!subsetting}}, as described in Section \@ref(vector-attribute-subsetting). The next stage is to convert these desire lines into routes. This is done using the publicly available OSRM service with the **stplanr** functions `route()` and `route_osrm()`\index{stplanr (package)} in the code chunk below: @@ -551,7 +551,7 @@ Making the width of the routes proportional to the number of car journeys that c Figure \@ref(fig:routes) shows routes along which people drive short distances (see the github.com/geocompx for the source code).[^13-transport-8] [^13-transport-8]: Note that the red routes and black desire lines do not start at exactly the same points. - This is because zone centroids rarely lie on the route network: instead the routes originate from the transport network node nearest the centroid. + This is because zone centroids rarely lie on the route network; instead the routes originate from the transport network node nearest the centroid. Note also that routes are assumed to originate in the zone centroids, a simplifying assumption which is used in transport models to reduce the computational resources needed to calculate the shortest path between all combinations of possible origins and destinations [@hollander_transport_2016]. 
```{r routes, warning=FALSE, fig.cap="Routes along which many (100+) short (<5km Euclidean distance) car journeys are made (red) overlaying desire lines representing the same trips (black) and zone centroids (dots).", fig.scap="Routes along which many car journeys are made.", echo=FALSE} @@ -586,7 +586,7 @@ Furthermore, the town is surrounded by large (cycling unfriendly) road structure There are many benefits of converting travel desire lines\index{desire lines} into routes. It is important to remember that we cannot be sure how many (if any) trips will follow the exact routes calculated by routing engines. -However, route and street/way/segment-level results can be highly policy relevant. +However, route and street/way/segment-level results can be highly policy-relevant. Route segment results can enable the prioritization of investment where it is most needed, according to available data [@lovelace_propensity_2017]. ## Route networks @@ -726,7 +726,7 @@ It may also be worth considering how the work could adapt to larger networks: te ## Prioritizing new infrastructure -This section demonstrates how geocomputation can create policy relevant outcomes in the field of transport planning. +This section demonstrates how geocomputation can create policy-relevant outcomes in the field of transport planning. We will identify promising locations for investment in sustainable transport infrastructure, using a simple approach for educational purposes. An advantage of the data driven approach outlined in this chapter is its modularity: each aspect can be useful on its own, and feed into wider analyses. 
From 3fc1c1bdc2360ac1c2f5ca55e53d472f50dce265 Mon Sep 17 00:00:00 2001 From: robinlovelace Date: Sun, 29 Sep 2024 20:13:50 +0100 Subject: [PATCH 07/14] Tidy-up 13.7 --- 13-transport.Rmd | 14 +++++++------- code/chapters/13-transport.R | 2 +- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/13-transport.Rmd b/13-transport.Rmd index 701bdf0af..08bcb6e72 100644 --- a/13-transport.Rmd +++ b/13-transport.Rmd @@ -594,7 +594,7 @@ Route segment results can enable the prioritization of investment where it is mo \index{network} While routes generally contain data on travel *behavior*, at the same geographic level as desire lines and OD pairs, route network datasets usually represent the physical transport network. Each *segment* in a route network roughly corresponds to a continuous section of street between junctions and appears only once, although the average length of segments depends on the data source (segments in the OSM-derived `bristol_ways` dataset used in this section have an average length of just over 200 m, with a standard deviation of nearly 500 m). -Variability in segment lengths can be explained by the fact that in some rural locations junctions are far apart while in dense urban areas there are crossings and other segment breaks every few meters. +Variability in segment lengths can be explained by the fact that in some rural locations junctions are far apart, while in dense urban areas there are crossings and other segment breaks every few meters. Route networks can be an input into, or an output of, transport data analysis projects, or both. Any transport research that involves route calculation requires a route network dataset in the internal or external routing engines (in the latter case the route network data is not necessarily imported into R). 
@@ -602,7 +602,7 @@ However, route networks are also important outputs in many transport research pr \index{network} To demonstrate how to create route networks as an output derived from route-level data, imagine a simple scenario of mode shift. -Imagine that 50% of car trips between 0 to 3 km in route distance are replaced by cycling, a percentage that drops by 10 percentage points for every additional km of route distance so that 20% of car trips of 6 km are replaced by cycling and no car trips that are 8 km or longer are replaced by cycling. +Imagine that 50% of car trips between 0 to 3 km in route distance are replaced by cycling, a percentage that drops by 10 percentage points for every additional kilometer of route distance so that 20% of car trips of 6 km are replaced by cycling and no car trips that are 8 km or longer are replaced by cycling. This is of course an unrealistic scenario [@lovelace_propensity_2017], but is a useful starting point. In this case, we can model mode shift from cars to bikes as follows: @@ -623,7 +623,7 @@ sum(routes_short_scenario$bicycle) - sum(routes_short$bicycle) Having created a scenario in which approximately 4000 trips have switched from driving to cycling, we can now model where this updated modeled cycling activity will take place. For this, we will use the function `overline()` from the **stplanr** package. 
-The function breaks linestrings at junctions (were two or more linestring geometries meet), and calculates aggregate statistics for each unique route segment [@morgan_travel_2020], taking an object containing routes and the names of the attributes to summarize as the first and second argument: +The function breaks linestrings at junctions (where two or more linestring geometries meet), and calculates aggregate statistics for each unique route segment [@morgan_travel_2020], taking an object containing routes and the names of the attributes to summarize as the first and second argument: ```{r rnet1} route_network_scenario = overline(routes_short_scenario, attrib = "bicycle") @@ -631,7 +631,7 @@ route_network_scenario = overline(routes_short_scenario, attrib = "bicycle") The outputs of the two preceding code chunks are summarized in Figure \@ref(fig:rnetvis) below. -```{r rnetvis, out.width="49%", fig.show='hold', fig.cap="Illustration of the percentage of car trips switching to cycling as a function of distance (left) and route network level results of this function (right).", echo=FALSE, fig.height=9.5} +```{r rnetvis, out.width="49%", fig.show='hold', fig.cap="The percentage of car trips switching to cycling as a function of distance (left) and route network level results of this function (right).", echo=FALSE, fig.height=9.5} routes_short_scenario |> ggplot() + geom_line(aes(distance / 1000, uptake), color = "red", linewidth = 3) + @@ -655,12 +655,12 @@ To save time downloading and preparing OSM\index{OpenStreetMap}, we will use the summary(bristol_ways) ``` -The output shows that `bristol_ways` represents just over 6 thousand segments on the transport network\index{network}. +The output shows that `bristol_ways` represents just over 6,000 segments on the transport network\index{network}. This and other geographic networks can be represented as mathematical graphs\index{graph}, with nodes\index{node} on the network, connected by edges\index{edge}. 
A number of R packages have been developed for dealing with such graphs, notably **igraph**\index{igraph (package)}. You can manually convert a route network into an `igraph` object, but the geographic attributes will be lost. -To overcome this limitation of **igraph**, the **sfnetworks**\index{sfnetworks (package)} package [@R-sfnetworks], which to represent route networks simultaneously as graphs *and* geographic lines, was developed. -We will demonstrate **sfnetworks** functionality on the `bristol_ways` object. +To overcome this limitation of **igraph**, the **sfnetworks**\index{sfnetworks (package)} package was developed [@R-sfnetworks]. +It represents networks simultaneously as graphs *and* geographic lines and has a 'tidy' syntax, as demonstrated in the following example. ```{r 13-transport-23} bristol_ways$lengths = st_length(bristol_ways) diff --git a/code/chapters/13-transport.R b/code/chapters/13-transport.R index 3ca512130..cbfa1ba3b 100644 --- a/code/chapters/13-transport.R +++ b/code/chapters/13-transport.R @@ -247,7 +247,7 @@ sum(routes_short_scenario$bicycle) - sum(routes_short$bicycle) route_network_scenario = overline(routes_short_scenario, attrib = "bicycle") -## ----rnetvis, out.width="49%", fig.show='hold', fig.cap="Illustration of the % of car trips switching to cycling as a function of distance (left) and route network level results of this function (right).", echo=FALSE---- +## ----rnetvis, out.width="49%", fig.show='hold', fig.cap="The % of car trips switching to cycling as a function of distance (left) and route network level results of this function (right).", echo=FALSE---- routes_short_scenario |> ggplot() + geom_line(aes(distance / 1000, uptake)) + From bf27969d84811ad1e036614f1cfc0172e272e960 Mon Sep 17 00:00:00 2001 From: robinlovelace Date: Sun, 29 Sep 2024 20:16:24 +0100 Subject: [PATCH 08/14] 8 -> eight, not sure this is better --- 01-introduction.Rmd | 2 +- 13-transport.Rmd | 6 +++--- 2 files changed, 4 insertions(+), 4 
deletions(-) diff --git a/01-introduction.Rmd b/01-introduction.Rmd index b992cd6f3..b282cdb3e 100644 --- a/01-introduction.Rmd +++ b/01-introduction.Rmd @@ -66,7 +66,7 @@ Building on this early definition, *Geocomputation with R* goes beyond data anal Our approach differs from early definitions of geocomputation in one important way, however: in its emphasis on reproducibility\index{reproducibility} and collaboration. At the turn of the 21^st^ Century, it was unrealistic to expect readers to be able to reproduce code examples, due to barriers preventing access to the necessary hardware, software and data. Fast-forward to today and things have progressed rapidly. -Anyone with access to a laptop with sufficient RAM (at least 8 GB recommended) can install and run software for geocomputation, and reproduce the contents of this book. +Anyone with access to a laptop with sufficient RAM (at least eight GB recommended) can install and run software for geocomputation, and reproduce the contents of this book. Financial and hardware barriers to geocomputation that existed in 1990s and early 2000s, when high-performance computers were too expensive for most people, have been removed.^[ A suitable laptop can be acquired second-hand for $100 or less in most countries today from websites such as [Ebay](https://www.ebay.com/sch/i.html?_from=R40&_nkw=laptop&_sacat=0&_oaa=1&_udhi=100&rt=nc&RAM%2520Size=4%2520GB%7C16%2520GB%7C8%2520GB&_dcat=177). Guidance on installing R and a suitable code editor is provided in Chapter \@ref(spatial-class). diff --git a/13-transport.Rmd b/13-transport.Rmd index 08bcb6e72..3927d9a8a 100644 --- a/13-transport.Rmd +++ b/13-transport.Rmd @@ -602,7 +602,7 @@ However, route networks are also important outputs in many transport research pr \index{network} To demonstrate how to create route networks as an output derived from route-level data, imagine a simple scenario of mode shift. 
-Imagine that 50% of car trips between 0 to 3 km in route distance are replaced by cycling, a percentage that drops by 10 percentage points for every additional kilometer of route distance so that 20% of car trips of 6 km are replaced by cycling and no car trips that are 8 km or longer are replaced by cycling. +Imagine that 50% of car trips between 0 to 3 km in route distance are replaced by cycling, a percentage that drops by 10 percentage points for every additional kilometer of route distance so that 20% of car trips of 6 km are replaced by cycling and no car trips that are eight km or longer are replaced by cycling. This is of course an unrealistic scenario [@lovelace_propensity_2017], but is a useful starting point. In this case, we can model mode shift from cars to bikes as follows: @@ -680,10 +680,10 @@ ways_sfn #> # … ``` -The output of the previous code chunk (with the final output shortened to contain only the most important 8 lines due to space considerations) shows that `ways_sfn` is a composite object, containing both nodes and edges in graph and spatial form. +The output of the previous code chunk (with the final output shortened to contain only the most important eight lines due to space considerations) shows that `ways_sfn` is a composite object, containing both nodes and edges in graph and spatial form. `ways_sfn` is of class `sfnetwork`, which builds on the `igraph` class from the **igraph** package. In the example below, the 'edge betweenness'\index{edge}, meaning the number of shortest paths\index{shortest route} passing through each edge, is calculated (see `?igraph::betweenness` for further details). -The output of the edge betweenness calculation is shown Figure \@ref(fig:wayssln), which has the cycle route network dataset calculated with the `overline()` function as an overlay for comparison. 
+The output of the edge betweenness calculation is shown in Figure \@ref(fig:wayssln), which has the cycle route network dataset calculated with the `overline()` function as an overlay for comparison. The results demonstrate that each graph edge represents a segment: the segments near the center of the road network have the highest betweenness values, whereas segments closer to central Bristol have higher cycling potential, based on these simplistic datasets. ```{r wayssln-gen} From 00f9fd3f232e444fc9b2108b92fe244edd0cfdd4 Mon Sep 17 00:00:00 2001 From: robinlovelace Date: Sun, 29 Sep 2024 20:20:50 +0100 Subject: [PATCH 09/14] 13.7 done, start 13.8 --- 13-transport.Rmd | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/13-transport.Rmd b/13-transport.Rmd index 3927d9a8a..75388ec56 100644 --- a/13-transport.Rmd +++ b/13-transport.Rmd @@ -692,7 +692,7 @@ ways_centrality = ways_sfn |> mutate(betweenness = tidygraph::centrality_edge_betweenness(lengths)) ``` -```{r wayssln, fig.cap="Illustration of route network datasets. The grey lines represent a simplified road network, with segment thickness proportional to betweenness. The green lines represent potential cycling flows (one way) calculated with the code above.", fig.scap="A small route network.", echo=FALSE} +```{r wayssln, fig.cap="Route network datasets. The gray lines represent a simplified road network, with segment thickness proportional to betweenness. The green lines represent potential cycling flows (one way) calculated with the code above.", fig.scap="A small route network.", echo=FALSE} bb_wayssln = tmaptools::bb(route_network_scenario, xlim = c(0.1, 0.9), ylim = c(0.1, 0.6), relative = TRUE) tm_shape(zones_od) + tm_fill(fill_alpha = 0.2, lwd = 0.1) + @@ -719,8 +719,7 @@ tm_shape(zones_od) + One can also find the shortest route\index{shortest route} between origins and destinations using this graph representation of the route network with the **sfnetworks** package. 
The methods presented in this section are relatively simple compared with what is possible. -The dual graph/spatial capabilities that **sfnetworks** enable many new powerful techniques can cannot be fully covered in this section. -This section does, however, provide a strong starting point for further exploration and research into the area. +The possibilities opened-up by **sfnetworks** cannot be fully covered in this section, but it does provide a strong starting point for further exploration and research into the area. A final point is that the example dataset we used above is relatively small. It may also be worth considering how the work could adapt to larger networks: testing methods on a subset of the data, and ensuring you have enough RAM will help, although it's also worth exploring other tools that can do transport network analysis that are optimized for large networks, such as R5 [@alessandretti_multimodal_2022]. @@ -729,7 +728,7 @@ It may also be worth considering how the work could adapt to larger networks: te This section demonstrates how geocomputation can create policy-relevant outcomes in the field of transport planning. We will identify promising locations for investment in sustainable transport infrastructure, using a simple approach for educational purposes. -An advantage of the data driven approach outlined in this chapter is its modularity: each aspect can be useful on its own, and feed into wider analyses. +An advantage of the data-driven approach outlined in this chapter is its modularity: each aspect can be useful on its own, and feed into wider analyses. The steps that got us to this stage included identifying short but car-dependent commuting routes (generated from desire lines) in Section \@ref(routes) and analysis of route network characteristics with the **sfnetworks** package in Section \@ref(route-networks). 
The final code chunk of this chapter combines these strands of analysis, by overlaying estimates of cycling potential from the previous section on top of a new dataset representing areas within a short distance of cycling infrastructure. This new dataset is created in the code chunk below which: 1) filters out the cycleway entities from the `bristol_ways` object representing the transport network; 2) 'unions' the individual LINESTRING entities of the cycleways into a single multilinestring object (for speed of buffering); and 3) creates a 100 m buffer around them to create a polygon. From dcd48349a5e0062a40473558aedc8b89da1a350c Mon Sep 17 00:00:00 2001 From: robinlovelace Date: Sun, 29 Sep 2024 20:30:44 +0100 Subject: [PATCH 10/14] Complete my proofing edits to c13 and the book! --- 13-transport.Rmd | 14 +++++++------- _13-ex.Rmd | 10 +++++----- code/chapters/13-transport.R | 2 +- 3 files changed, 13 insertions(+), 13 deletions(-) diff --git a/13-transport.Rmd b/13-transport.Rmd index 75388ec56..aaffb05ee 100644 --- a/13-transport.Rmd +++ b/13-transport.Rmd @@ -581,7 +581,7 @@ tm_shape(zones_od) + ``` Visualizing the results in an interactive map shows that many short car trips take place in and around Bradley Stoke, around 10 km North of central Bristol. -It is easy to find explanations for the area's high level of car dependency: according to [Wikipedia](https://en.wikipedia.org/wiki/Bradley_Stoke), Bradley Stoke is "Europe's largest new town built with private investment", suggesting limited public transport provision. +It is easy to find explanations for the area's high level of car-dependency: according to [Wikipedia](https://en.wikipedia.org/wiki/Bradley_Stoke), Bradley Stoke is "Europe's largest new town built with private investment", suggesting limited public transport provision. Furthermore, the town is surrounded by large (cycling unfriendly) road structures, including the M4 and M5 motorways [@tallon_bristol_2007]. 
There are many benefits of converting travel desire lines\index{desire lines} into routes. @@ -731,7 +731,7 @@ We will identify promising locations for investment in sustainable transport inf An advantage of the data-driven approach outlined in this chapter is its modularity: each aspect can be useful on its own, and feed into wider analyses. The steps that got us to this stage included identifying short but car-dependent commuting routes (generated from desire lines) in Section \@ref(routes) and analysis of route network characteristics with the **sfnetworks** package in Section \@ref(route-networks). The final code chunk of this chapter combines these strands of analysis, by overlaying estimates of cycling potential from the previous section on top of a new dataset representing areas within a short distance of cycling infrastructure. -This new dataset is created in the code chunk below which: 1) filters out the cycleway entities from the `bristol_ways` object representing the transport network; 2) 'unions' the individual LINESTRING entities of the cycleways into a single multilinestring object (for speed of buffering); and 3) creates a 100 m buffer around them to create a polygon. +This new dataset is created in the code chunk below which: (1) filters out the cycleway entities from the `bristol_ways` object representing the transport network,(2) 'unions' the individual LINESTRING entities of the cycleways into a single multilinestring object (for speed of buffering),and (3) creates a 100 m buffer around them to create a polygon. ```{r 13-transport-25} existing_cycleways_buffer = bristol_ways |> @@ -767,7 +767,7 @@ route_network_no_infra = st_difference( ) ``` -The results of the preceding code chunks are shown in Figure \@ref(fig:cycleways), which shows routes with high levels of car dependency and high cycling potential but no cycleways. 
+The results of the preceding code chunks are shown in Figure \@ref(fig:cycleways), which shows routes with high levels of car-dependency and high cycling potential but no cycleways. ```{r 13-transport-28, eval=FALSE} tmap_mode("view") @@ -778,7 +778,7 @@ qtm(route_network_no_infra, basemaps = leaflet::providers$Esri.WorldTopoMap, -```{r cycleways, echo=FALSE, message=FALSE, fig.cap="Potential routes along which to prioritise cycle infrastructure in Bristol to reduce car dependency. The static map provides an overview of the overlay between existing infrastructure and routes with high car-bike switching potential (left). The screenshot the interactive map generated from the `qtm()` function highlights Whiteladies Road as somewhere that would benefit from a new cycleway (right).", out.width="50%", fig.show='hold', fig.scap="Routes along which to prioritise cycle infrastructure.", fig.height=9} +```{r cycleways, echo=FALSE, message=FALSE, fig.cap="Potential routes along which to prioritize cycle infrastructure in Bristol to reduce car-dependency. The static map provides an overview of the overlay between existing infrastructure and routes with high car-bike switching potential (left). The screenshot the interactive map generated from the `qtm()` function highlights Whiteladies Road as somewhere that would benefit from a new cycleway (right).", out.width="50%", fig.show='hold', fig.scap="Routes along which to prioritize cycle infrastructure.", fig.height=9} # Previous verson: # source("code/13-cycleways.R") tm_shape(existing_cycleways_buffer, bbox = bristol_region) + @@ -797,15 +797,15 @@ The analysis would need to be substantially expanded --- including with larger i ## Future directions of travel -This chapter provided a taste of the possibilities of using geocomputation for transport research, and explored some key geographic elements that make-up a city's transport system with open data and reproducible code. 
+This chapter provided a taste of the possibilities of using geocomputation for transport research, and it explored some key geographic elements that make-up a city's transport system with open data and reproducible code. The results could help plan where investment is needed. Transport systems operate at multiple interacting levels, meaning that geocomputational methods have great potential to generate insights into how they work, and the likely impacts of different interventions. There is much more that could be done in this area: it would be possible to build on the foundations presented in this chapter in many directions. -Transport is the fastest growing source of greenhouse gas emissions in many countries, and is set to become "the largest GHG emitting sector, especially in developed countries" (see [EURACTIV.com](https://www.euractiv.com/section/agriculture-food/opinion/transport-needs-to-do-a-lot-more-to-fight-climate-change/)). +Transport is the fastest growing source of greenhouse gas emissions in many countries, and it is set to become "the largest GHG emitting sector, especially in developed countries" (see [EURACTIV.com](https://www.euractiv.com/section/agriculture-food/opinion/transport-needs-to-do-a-lot-more-to-fight-climate-change/)). Transport-related emissions are unequally distributed across society but (unlike food and heating) are not essential for well-being. There is great potential for the sector to rapidly decarbonize through demand reduction, electrification of the vehicle fleet and the uptake of active travel modes such as walking and cycling. -New technologies can reduce car dependency by enabling more car sharing. +New technologies can reduce car-dependency by enabling more car sharing. 
'Micro-mobility' systems such as dockless bike and e-scooter schemes are also emerging, creating valuable datasets in the General Bikeshare Feed Specification (GBFS) format, which can be imported and processed with the [**gbfs**](https://github.com/simonpcouch/gbfs) package. These and other changes will have large impacts on accessibility, the ability of people to reach employment and service locations that they need, something that can be quantified currently and under scenarios of change with packages such as [**accessibility**](https://ipeagit.github.io/accessibility/) packages. Further exploration of such 'transport futures' at local, regional and national levels could yield important new insights. diff --git a/_13-ex.Rmd b/_13-ex.Rmd index 7ca75f247..719e3399d 100644 --- a/_13-ex.Rmd +++ b/_13-ex.Rmd @@ -3,13 +3,13 @@ library(sf) library(spDataLarge) ``` -E1. In much of the analysis presented in the chapter we focused on active modes, but what about driving trips? +E1. In much of the analysis presented in the chapter, we focused on active modes, but what about driving trips? - What proportion of trips in the `desire_lines` object are made by driving? - What proportion of `desire_lines` have a straight line length of 5 km or more in distance? - What proportion of trips in desire lines that are longer than 5 km in length are made by driving? - Plot the desire lines that are both less than 5 km in length and along which more than 50% of trips are made by car. - - What do you notice about the location of these car dependent yet short desire lines? + - What do you notice about the location of these car-dependent yet short desire lines? ```{r 13-e1, eval=FALSE, echo=FALSE} sum(desire_lines$car_driver) / sum(desire_lines$all) @@ -35,7 +35,7 @@ desire_lines_5km_less_50_pct_driving |> tm_lines("Proportion driving") ``` -E2. 
What additional length of cycleways would result if all the routes presented in the last Figure, on sections beyond 100 m from existing cycleways, were constructed? +E2. What additional length of cycleways would be built if all the sections beyond 100 m from existing cycleways in Figure 13.8, were constructed? ```{r 13-transport-29, eval=FALSE, echo=FALSE} sum(st_length(route_network_no_infra)) @@ -63,10 +63,10 @@ If you were doing this for real, in government or for a transport consultancy, w # Include a higher proportion of trips in the analysis ``` -E5. Clearly, the routes identified in the last Figure only provide part of the picture. +E5. Clearly, the routes identified in Figure 13.8 only provide part of the picture. How would you extend the analysis? E6. Imagine that you want to extend the scenario by creating key *areas* (not routes) for investment in place-based cycling policies such as car-free zones, cycle parking points and reduced car parking strategy. How could raster\index{raster} datasets assist with this work? - - Bonus: develop a raster layer that divides the Bristol region into 100 cells (10 by 10) and estimate the average speed limit of roads in each, from the `bristol_ways` dataset (see Chapter \@ref(location)). + - Bonus: develop a raster layer that divides the Bristol region into 100 cells (10 x 10) and estimate the average speed limit of roads in each, from the `bristol_ways` dataset (see Chapter \@ref(location)). diff --git a/code/chapters/13-transport.R b/code/chapters/13-transport.R index cbfa1ba3b..971ed0841 100644 --- a/code/chapters/13-transport.R +++ b/code/chapters/13-transport.R @@ -335,7 +335,7 @@ route_network_no_infra = st_difference( ## lines.lwd = 5) -## ----cycleways, echo=FALSE, message=FALSE, fig.cap="Potential routes along which to prioritise cycle infrastructure in Bristol to reduce car dependency. 
The static map provides an overview of the overlay between existing infrastructure and routes with high car-bike switching potential (left). The screenshot the interactive map generated from the `qtm()` function highlights Whiteladies Road as somewhere that would benefit from a new cycleway (right).", out.width="50%", fig.show='hold', fig.scap="Routes along which to prioritise cycle infrastructure."---- +## ----cycleways, echo=FALSE, message=FALSE, fig.cap="Potential routes along which to prioritize cycle infrastructure in Bristol to reduce car-dependency. The static map provides an overview of the overlay between existing infrastructure and routes with high car-bike switching potential (left). The screenshot the interactive map generated from the `qtm()` function highlights Whiteladies Road as somewhere that would benefit from a new cycleway (right).", out.width="50%", fig.show='hold', fig.scap="Routes along which to prioritize cycle infrastructure."---- # Previous verson: # source("https://github.com/Robinlovelace/geocompr/raw/main/code/13-cycleways.R") # m_leaflet From 0834e6cac290c39a15453a240de4a7c41b68227b Mon Sep 17 00:00:00 2001 From: robinlovelace Date: Sun, 29 Sep 2024 23:13:27 +0100 Subject: [PATCH 11/14] Remove package references 1 --- 01-introduction.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/01-introduction.Rmd b/01-introduction.Rmd index b282cdb3e..c36d63d8c 100644 --- a/01-introduction.Rmd +++ b/01-introduction.Rmd @@ -298,7 +298,7 @@ R's spatial capabilities originated in early spatial packages in the S language The 1990s saw the development of numerous S scripts and a handful of packages for spatial statistics\index{statistics}. By the year 2000, there were R packages for various spatial methods, including "point pattern analysis, geostatistics, exploratory spatial data analysis and spatial econometrics" [@bivand_open_2000]. 
Some of these, notably **spatial**, **sgeostat** and **splancs** are still available on CRAN\index{CRAN} [@rowlingson_splancs_1993; @rowlingson_splancs_2017;@venables_modern_2002; @majure_sgeostat_2016]. -Key spatial packages were described in @ripley_spatial_2001, which outlined R packages for spatial smoothing and interpolation [@akima_akima_2016; @jr_geor_2016] and point pattern analysis [@rowlingson_splancs_2017; @baddeley_spatial_2015]. +Key spatial packages were described in @ripley_spatial_2001, which outlined R packages for spatial smoothing and interpolation and point pattern analysis. One of these (**spatstat**) is still being actively maintained, more than 20 years after its first release. A following commentary outlined the future prospects of spatial statistics [@bivand_more_2001], setting the stage for the development of the popular **spdep** package [@bivand_spdep_2017]. From 0d0a60060a351153ae2d5441e023e1b05eb8221e Mon Sep 17 00:00:00 2001 From: robinlovelace Date: Sun, 29 Sep 2024 23:20:35 +0100 Subject: [PATCH 12/14] Add new mlr3 reference --- 12-spatial-cv.Rmd | 8 ++++---- 15-eco.Rmd | 2 +- code/chapters/12-spatial-cv.R | 2 +- geocompr.bib | 13 +++++++++++++ 4 files changed, 19 insertions(+), 6 deletions(-) diff --git a/12-spatial-cv.Rmd b/12-spatial-cv.Rmd index 5b7dce5aa..e17bf8d4a 100644 --- a/12-spatial-cv.Rmd +++ b/12-spatial-cv.Rmd @@ -270,11 +270,11 @@ There are dozens of packages for statistical learning\index{statistical learning Getting acquainted with each of these packages, including how to undertake cross-validation and hyperparameter\index{hyperparameter} tuning, can be a time-consuming process. Comparing model results from different packages can be even more laborious. The **mlr3** package and ecosystem was developed to address these issues. 
-It acts as a 'meta-package', providing a unified interface to popular supervised and unsupervised statistical learning techniques including classification, regression\index{regression}, survival analysis and clustering\index{clustering} [@lang_mlr3_2019; @becker_mlr3_2022]. +It acts as a 'meta-package', providing a unified interface to popular supervised and unsupervised statistical learning techniques including classification, regression\index{regression}, survival analysis and clustering\index{clustering} [@lang_mlr3_2019; @bischl_applied_2024]. The standardized **mlr3** interface is based on eight 'building blocks'. As illustrated in Figure \@ref(fig:building-blocks), these have a clear order. -(ref:building-blocks) Basic building blocks of the mlr3 package. Source: @becker_mlr3_2022. (Permission to reuse this figure was kindly granted.) +(ref:building-blocks) Basic building blocks of the mlr3 package. Source: @bischl_applied_2024. (Permission to reuse this figure was kindly granted.) ```{r building-blocks, echo=FALSE, fig.height=4, fig.width=4, fig.cap="(ref:building-blocks)", fig.scap="Basic building blocks of the mlr3 package."} knitr::include_graphics("images/12_ml_abstraction_crop.png") @@ -635,7 +635,7 @@ round(mean(score_spcv_svm$classif.auc), 2) It appears that the GLM\index{GLM} (aggregated AUROC\index{AUROC} was `r score[resampling_id == "repeated_spcv_coords" & learner_id == "classif.log_reg", round(mean(classif.auc), 2)]`) is slightly better than the SVM\index{SVM} in this specific case. To guarantee an absolute fair comparison, one should also make sure that the two models use the exact same partitions -- something we have not shown here but have silently used in the background (see `code/12_cv.R` in the book's GitHub repository for more information). 
-To do so, **mlr3** offers the functions `benchmark_grid()` and `benchmark()` [see also https://mlr3book.mlr-org.com/chapters/chapter3/evaluation_and_benchmarking.html#sec-benchmarking, @becker_mlr3_2022]. +To do so, **mlr3** offers the functions `benchmark_grid()` and `benchmark()` [see also https://mlr3book.mlr-org.com/chapters/chapter3/evaluation_and_benchmarking.html#sec-benchmarking, @bischl_applied_2024]. We will explore these functions in more detail in the Exercises. Please note also that using more than 50 iterations in the random search of the SVM would probably yield hyperparameters\index{hyperparameter} that result in models with a better AUROC [@schratz_hyperparameter_2019]. On the other hand, increasing the number of random search iterations would also increase the total number of models and thus runtime. @@ -658,7 +658,7 @@ Machine learning algorithms often require hyperparameter\index{hyperparameter} i Machine learning overall, and its use to understand spatial data, is a large field and this chapter has provided the basics, but there is more to learn. 
We recommend the following resources in this direction: -- The **mlr3 book** (@becker_mlr3_2022; https://mlr3book.mlr-org.com/) and especially the [chapter on the handling of spatiotemporal data](https://mlr3book.mlr-org.com/chapters/chapter13/beyond_regression_and_classification.html#sec-spatiotemporal) +- The **mlr3 book** (@bischl_applied_2024; https://mlr3book.mlr-org.com/) and especially the [chapter on the handling of spatiotemporal data](https://mlr3book.mlr-org.com/chapters/chapter13/beyond_regression_and_classification.html#sec-spatiotemporal) - An academic paper on hyperparameter\index{hyperparameter} tuning [@schratz_hyperparameter_2019] - An academic paper on how to use **mlr3spatiotempcv** [@schratz_mlr3spatiotempcv_2021] - In case of spatiotemporal data, one should account for spatial\index{autocorrelation!spatial} and temporal\index{autocorrelation!temporal} autocorrelation when doing CV\index{cross-validation} [@meyer_improving_2018] diff --git a/15-eco.Rmd b/15-eco.Rmd index 4e285c078..7daf63b06 100644 --- a/15-eco.Rmd +++ b/15-eco.Rmd @@ -489,7 +489,7 @@ search_space = paradox::ps( Having defined the search space, we are all set for specifying our tuning via the `AutoTuner()` function. Since we deal with geographic data, we will again make use of spatial cross-validation to tune the hyperparameters\index{hyperparameter} (see Sections \@ref(intro-cv) and \@ref(spatial-cv-with-mlr3)). Specifically, we will use a five-fold spatial partitioning with only one repetition (`rsmp()`). -In each of these spatial partitions, we run 50 models (`trm()`) while using randomly selected hyperparameter configurations (`tnr()`) within predefined limits (`seach_space`) to find the optimal hyperparameter\index{hyperparameter} combination [see also Section \@ref(svm) and https://mlr3book.mlr-org.com/chapters/chapter4/hyperparameter_optimization.html#sec-autotuner, @becker_mlr3_2022]. 
+In each of these spatial partitions, we run 50 models (`trm()`) while using randomly selected hyperparameter configurations (`tnr()`) within predefined limits (`seach_space`) to find the optimal hyperparameter\index{hyperparameter} combination [see also Section \@ref(svm) and https://mlr3book.mlr-org.com/chapters/chapter4/hyperparameter_optimization.html#sec-autotuner, @bischl_applied_2024]. The performance measure is the root mean squared error (RMSE\index{RMSE}). ```{r 15-eco-23} diff --git a/code/chapters/12-spatial-cv.R b/code/chapters/12-spatial-cv.R index a6ab86993..a1908a3c6 100644 --- a/code/chapters/12-spatial-cv.R +++ b/code/chapters/12-spatial-cv.R @@ -115,7 +115,7 @@ knitr::include_graphics("images/lsl-susc-1.png") knitr::include_graphics("images/13_partitioning.png") -## ----building-blocks, echo=FALSE, fig.height=4, fig.width=4, fig.cap="Basic building blocks of the mlr3 package. Source: @becker_mlr3_2022. (Permission to reuse this figure was kindly granted.)", fig.scap="Basic building blocks of the mlr3 package."---- +## ----building-blocks, echo=FALSE, fig.height=4, fig.width=4, fig.cap="Basic building blocks of the mlr3 package. Source: @bischl_applied_2024. 
(Permission to reuse this figure was kindly granted.)", fig.scap="Basic building blocks of the mlr3 package."---- knitr::include_graphics("images/13_ml_abstraction_crop.png") diff --git a/geocompr.bib b/geocompr.bib index 76f79a76a..af73233b9 100644 --- a/geocompr.bib +++ b/geocompr.bib @@ -2450,3 +2450,16 @@ @book{zuur_mixed_2009 langid = {english}, keywords = {nosource} } +@book{bischl_applied_2024, + title = {Applied {{Machine Learning Using}} Mlr3 in {{R}}}, + author = {Bischl, Bernd and Sonabend, Raphael and Kotthoff, Lars and Lang, Michel}, + date = {2024-01-18}, + eprint = {5wrsEAAAQBAJ}, + eprinttype = {googlebooks}, + publisher = {CRC Press}, + abstract = {mlr3 is an award-winning ecosystem of R packages that have been developed to enable state-of-the-art machine learning capabilities in R. Applied Machine Learning Using mlr3 in R gives an overview of flexible and robust machine learning methods, with an emphasis on how to implement them using mlr3 in R. It covers various key topics, including basic machine learning tasks, such as building and evaluating a predictive model; hyperparameter tuning of machine learning approaches to obtain peak performance; building machine learning pipelines that perform complex operations such as pre-processing followed by modelling followed by aggregation of predictions; and extending the mlr3 ecosystem with custom learners, measures, or pipeline components.Features: In-depth coverage of the mlr3 ecosystem for users and developers Explanation and illustration of basic and advanced machine learning concepts Ready to use code samples that can be adapted by the user for their application Convenient and expressive machine learning pipelining enabling advanced modelling Coverage of topics that are often ignored in other machine learning books The book is primarily aimed at researchers, practitioners, and graduate students who use machine learning or who are interested in using it. 
It can be used as a textbook for an introductory or advanced machine learning class that uses R, as a reference for people who work with machine learning methods, and in industry for exploratory experiments in machine learning.}, + isbn = {978-1-00-383057-3}, + langid = {english}, + pagetotal = {356}, + keywords = {Computers / Artificial Intelligence / General,Computers / Data Science / Machine Learning,Computers / Mathematical & Statistical Software,Mathematics / Probability & Statistics / General,Technology & Engineering / Automation,Technology & Engineering / Environmental / General} +} From 31807ba30dc3a05eace3fdca155fe52a684fb762 Mon Sep 17 00:00:00 2001 From: robinlovelace Date: Mon, 30 Sep 2024 15:20:12 +0100 Subject: [PATCH 13/14] Fix multi- typos --- 08-read-write-plot.Rmd | 6 +++--- 15-eco.Rmd | 6 +++--- _15-ex.Rmd | 2 +- code/chapters/15-eco.R | 2 +- code/chapters/_15-ex.R | 2 +- 5 files changed, 9 insertions(+), 9 deletions(-) diff --git a/08-read-write-plot.Rmd b/08-read-write-plot.Rmd index bfdddc8d6..7dee03217 100644 --- a/08-read-write-plot.Rmd +++ b/08-read-write-plot.Rmd @@ -271,7 +271,7 @@ It is fast and flexible, but it may be worth looking at other packages such as * ### Raster data {#raster-data-read} \index{raster!data input} -Similar to vector data, raster data comes in many file formats with some supporting multi-layerfiles. +Similar to vector data, raster data comes in many file formats with some supporting multi-layer files. **terra**'s `rast()` command reads in a single layer when a file with just one layer is provided. ```{r 07-read-write-plot-24, message=FALSE} @@ -279,7 +279,7 @@ raster_filepath = system.file("raster/srtm.tif", package = "spDataLarge") single_layer = rast(raster_filepath) ``` -It also works in case you want to read a multi-layerfile. +It also works in case you want to read a multi-layer file. 
```{r 07-read-write-plot-25} multilayer_filepath = system.file("raster/landsat.tif", package = "spDataLarge") @@ -519,7 +519,7 @@ usa_sf = ne_countries(country = "United States of America", returnclass = "sf") Country borders can be also accessed with other packages, such as **geodata**, **giscoR**, or **rgeoboundaries**. A second example downloads a series of rasters containing global monthly precipitation sums with spatial resolution of 10 minutes (~18.5 km at the equator) using the **geodata** package [@R-geodata]. -The result is a multi-layerobject of class `SpatRaster`. +The result is a multi-layer object of class `SpatRaster`. ```{r 07-read-write-plot-5, eval=FALSE} library(geodata) diff --git a/15-eco.Rmd b/15-eco.Rmd index 7daf63b06..46f339b23 100644 --- a/15-eco.Rmd +++ b/15-eco.Rmd @@ -181,7 +181,7 @@ ep = qgisprocess::qgis_run_algorithm( ``` This returns a list named `ep` containing the paths to the computed output rasters. -Let's read in catchment area as well as catchment slope into a multi-layer`SpatRaster` object (see Section \@ref(raster-classes)). +Let's read in catchment area as well as catchment slope into a multi-layer `SpatRaster` object (see Section \@ref(raster-classes)). Additionally, we will add two more raster objects to it, namely `dem` and `ndvi`. ```{r 15-eco-7, eval=FALSE} @@ -191,7 +191,7 @@ ep = ep[c("AREA", "SLOPE")] |> rast() names(ep) = c("carea", "cslope") # assign better names origin(ep) = origin(dem) # make sure rasters have the same origin -ep = c(dem, ndvi, ep) # add dem and ndvi to the multi-layerSpatRaster object +ep = c(dem, ndvi, ep) # add dem and ndvi to the multi-layer SpatRaster object ``` Additionally, the catchment area\index{catchment area} values are highly skewed to the right (`hist(ep$carea)`). @@ -534,7 +534,7 @@ autotuner_rf$predict(task) ``` The `predict` method will apply the model to all observations used in the modeling. 
-Given a multi-layer`SpatRaster` containing rasters named as the predictors used in the modeling, `terra::predict()` will also make spatial distribution maps, i.e., predict to new data. +Given a multi-layer `SpatRaster` containing rasters named as the predictors used in the modeling, `terra::predict()` will also make spatial distribution maps, i.e., predict to new data. ```{r 15-eco-28, cache=TRUE, cache.lazy=FALSE, eval=FALSE} pred = terra::predict(ep, model = autotuner_rf, fun = predict) diff --git a/_15-ex.Rmd b/_15-ex.Rmd index e7f708ebc..f036fb5e0 100644 --- a/_15-ex.Rmd +++ b/_15-ex.Rmd @@ -90,7 +90,7 @@ ep = ep[c("AREA", "SLOPE")] |> names(ep) = c("carea", "cslope") # make sure all rasters share the same origin origin(ep) = origin(dem) -# add dem and ndvi to the multi-layerSpatRaster object +# add dem and ndvi to the multi-layer SpatRaster object ep = c(dem, ndvi, ep) ep$carea = log10(ep$carea) diff --git a/code/chapters/15-eco.R b/code/chapters/15-eco.R index 376ef21f0..d81295d90 100644 --- a/code/chapters/15-eco.R +++ b/code/chapters/15-eco.R @@ -118,7 +118,7 @@ knitr::include_graphics("images/15_sa_mongon_sampling.png") ## terra::rast() ## names(ep) = c("carea", "cslope") # assign proper names ## terra::origin(ep) = terra::origin(dem) # make sure rasters have the same origin -## ep = c(dem, ndvi, ep) # add dem and ndvi to the multi-layerSpatRaster object +## ep = c(dem, ndvi, ep) # add dem and ndvi to the multi-layer SpatRaster object ## ----15-eco-8, eval=FALSE--------------------------------------------------------------------------- diff --git a/code/chapters/_15-ex.R b/code/chapters/_15-ex.R index 830382a46..29391cea9 100644 --- a/code/chapters/_15-ex.R +++ b/code/chapters/_15-ex.R @@ -84,7 +84,7 @@ ep = ep[c("AREA", "SLOPE")] |> names(ep) = c("carea", "cslope") # make sure all rasters share the same origin origin(ep) = origin(dem) -# add dem and ndvi to the multi-layerSpatRaster object +# add dem and ndvi to the multi-layer SpatRaster object ep = 
c(dem, ndvi, ep) ep$carea = log10(ep$carea) From b8201acb0c57fc6dad9154a8ad117607b08d538c Mon Sep 17 00:00:00 2001 From: Jakub Date: Mon, 30 Sep 2024 16:22:39 +0200 Subject: [PATCH 14/14] Update 13-transport.Rmd --- 13-transport.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/13-transport.Rmd b/13-transport.Rmd index aaffb05ee..a20f3db3c 100644 --- a/13-transport.Rmd +++ b/13-transport.Rmd @@ -237,7 +237,7 @@ The data in `bristol_od`, however, simply ignores such trips: it is an 'intra-zo In the same way that OD datasets can be aggregated to the zone of origin, they can also be aggregated to provide information about destination zones. People tend to gravitate towards central places. This explains why the spatial distribution represented in the right panel in Figure \@ref(fig:zones) is relatively uneven, with the most common destination zones concentrated in Bristol city center. -The result is `zones_od`, which contains a new column reporting the number of trip destinations by any mode, and it iscreated as follows: +The result is `zones_od`, which contains a new column reporting the number of trip destinations by any mode, and it is created as follows: ```{r 13-transport-10} zones_destinations = bristol_od |>