Commit a14ebbd

additional first-to-third person pronoun changes (#352)
* me-to-us
* change "I'd" to "we'd"
* rephrase "I'm" to "we"
* rephrase to avoid first-person "I'm not a fan"
* change "I'm" to "we'll"
* change "I'll" to "we'll"
* a few more "I'll" to "we'll" changes
* pluralise in the commented out bit just in case
* rephrase to avoid awkward first person "I regret"
1 parent 2dbb8eb commit a14ebbd

8 files changed: +12 -12 lines changed

collective-geoms.Rmd

Lines changed: 1 addition & 1 deletion
@@ -178,7 +178,7 @@ In the plot on the right, the "shaded bars" for each `class` have been construct

 1. Install the babynames package. It contains data about the popularity of
 baby names in the US. Run the following code and fix the resulting graph.
-Why does this graph make me unhappy?
+Why does this graph make us unhappy?

 ```{r, eval = FALSE}
 library(babynames)

ext-springs.Rmd

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ This gives us two parameters for our spring:

 - The `tension`, how fast we move along x.

-While I'm pretty sure this is not a physically correct parameterisation of a spring, it is good enough for us.
+Although we can be pretty sure this is not a physically correct parameterisation of a spring, it is good enough for us.

 At this point, it's worthwhile to spend a little time thinking about how we might turn this into a geom.
 How will we specify the diameter?
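
The parameterisation discussed in this hunk can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the function name `spring_path()` and its arguments `n_turns` and `n` are hypothetical, and treating `tension` as the advance along x per radian is one reading of "how fast we move along x", not necessarily the book's implementation.

```r
# Hypothetical helper: trace a circle of the given diameter while
# advancing along x at a rate set by `tension`.
spring_path <- function(diameter = 1, tension = 0.75, n_turns = 5, n = 50) {
  # Angle swept over all turns, with n points per turn
  t <- seq(0, n_turns * 2 * pi, length.out = n_turns * n)
  data.frame(
    x = (diameter / 2) * cos(t) + tension * t,  # circle plus drift along x
    y = (diameter / 2) * sin(t)                 # height of the circle, unchanged
  )
}

path <- spring_path(diameter = 2, tension = 0.5)
range(path$y)  # stays within +/- diameter / 2
```

With `tension = 0` the path collapses back onto a circle, which is one way to see why a second parameter is needed to stretch it into a spring.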

getting-started.Rmd

Lines changed: 1 addition & 1 deletion
@@ -341,7 +341,7 @@ ggplot(mpg, aes(hwy)) +
 geom_freqpoly(binwidth = 1)
 ```

-An alternative to the frequency polygon is the density plot, `geom_density()`. I'm not a fan of density plots because they are harder to interpret since the underlying computations are more complex. They also make assumptions that are not true for all data, namely that the underlying distribution is continuous, unbounded, and smooth.
+An alternative to the frequency polygon is the density plot, `geom_density()`. A little care is required if you're using density plots: compared to frequency polygons they are harder to interpret since the underlying computations are more complex. They also make assumptions that are not true for all data, namely that the underlying distribution is continuous, unbounded, and smooth.

 To compare the distributions of different subgroups, you can map a categorical variable to either fill (for `geom_histogram()`) or colour (for `geom_freqpoly()`). It's easier to compare distributions using the frequency polygon because the underlying perceptual task is easier. You can also use faceting: this makes comparisons a little harder, but it's easier to see the distribution of each group.
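
The "unbounded" assumption named in the revised sentence can be seen with base R's `density()`, used here as a stand-in; this sketch makes no claim about how `geom_density()` chooses its evaluation range. The simulated values are made-up data for illustration only.

```r
set.seed(1)
# Made-up, strictly positive data (for example, waiting times)
x <- rexp(200)

# Kernel density estimate with the default Gaussian kernel
d <- density(x)

min(x)    # the smallest observation is above zero
min(d$x)  # but the estimate is evaluated, and non-zero, below zero
```

The estimate assigns some density to negative values even though none can occur, which is the kind of mismatch the paragraph warns about for bounded data.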

introduction.Rmd

Lines changed: 1 addition & 1 deletion
@@ -154,7 +154,7 @@ There you will learn about how to control the theming system of ggplot2 and how
 Before we continue, make sure you have all the software you need for this book:

 - **R**: If you don't have R installed already, you may be reading the wrong book; we assume a basic familiarity with R throughout this book.
-If you'd like to learn how to use R, I'd recommend my [*R for Data Science*](https://r4ds.had.co.nz/) which is designed to get you up and running with R with a minimum of fuss.
+If you'd like to learn how to use R, we'd recommend [*R for Data Science*](https://r4ds.had.co.nz/) which is designed to get you up and running with R with a minimum of fuss.

 - **RStudio**: RStudio is a free and open source integrated development environment (IDE) for R.
 While you can write and use R code with any R environment (including R GUI and [ESS](http://ess.r-project.org)), RStudio has some nice features specifically for authoring and debugging your code.

layers.Rmd

Lines changed: 1 addition & 1 deletion
@@ -96,7 +96,7 @@ observations in the rows. This is a strong restriction, but there are good reaso
 * It enforces a clean separation of concerns: ggplot2 turns data frames into
 visualisations. Other packages can make data frames in the right format.

-The data on each layer doesn't need to be the same, and it's often useful to combine multiple datasets in a single plot. To illustrate that idea I'm going to generate two new datasets related to the mpg dataset. First we'll fit a loess model and generate predictions from it. (This is what `geom_smooth()` does behind the scenes)
+The data on each layer doesn't need to be the same, and it's often useful to combine multiple datasets in a single plot. To illustrate that idea we'll generate two new datasets related to the mpg dataset. First we'll fit a loess model and generate predictions from it. (This is what `geom_smooth()` does behind the scenes)

 ```{r loess-pred}
 mod <- loess(hwy ~ displ, data = mpg)

maps.Rmd

Lines changed: 5 additions & 5 deletions
@@ -77,10 +77,10 @@ The `coord_sf()` function governs the map projection, discussed in Section \@ref

 ### Layered maps

-In some instances you may want to overlay one map on top of another. The ggplot2 package supports this by allowing you to add multiple `geom_sf()` layers to a plot. As an example, I'll use the `oz_states` data to draw the Australian states in different colours, and will overlay this plot with the boundaries of Australian electoral regions. To do this, there are two preprocessing steps to perform. First, we'll use `dplyr::filter()` to remove the "Other Territories" from the state boundaries.
+In some instances you may want to overlay one map on top of another. The ggplot2 package supports this by allowing you to add multiple `geom_sf()` layers to a plot. As an example, we'll use the `oz_states` data to draw the Australian states in different colours, and will overlay this plot with the boundaries of Australian electoral regions. To do this, there are two preprocessing steps to perform. First, we'll use `dplyr::filter()` to remove the "Other Territories" from the state boundaries.


-The code below draws a plot with two map layers: the first uses `oz_states` to fill the states in different colours, and the second uses `oz_votes` to draw the electoral boundaries. Second, I'll extract the electoral boundaries in a simplified form using the `ms_simplify()` function from the rmapshaper package [@rmapshaper]. This is generally a good idea if the original data set (in this case `ozmaps::abs_ced`) is stored at a higher resolution than your plot requires, in order to reduce the time taken to render the plot.
+The code below draws a plot with two map layers: the first uses `oz_states` to fill the states in different colours, and the second uses `oz_votes` to draw the electoral boundaries. Second, we'll extract the electoral boundaries in a simplified form using the `ms_simplify()` function from the rmapshaper package [@rmapshaper]. This is generally a good idea if the original data set (in this case `ozmaps::abs_ced`) is stored at a higher resolution than your plot requires, in order to reduce the time taken to render the plot.

 `r columns(n = 1, aspect_ratio = 1)`
 ```{r}
@@ -223,7 +223,7 @@ p + coord_sf(xlim = c(147.75, 150.25), ylim = c(-37.5, -34.5))
 p + coord_sf(xlim = c(150, 150.25), ylim = c(-36.3, -36))
 ```

-As this illustrates, Eden-Monaro is defined in terms of two distinct polygons, a large one on the Australian mainland and a small island. However, the large region has a hole in the middle (the hole exists because the Australian Capital Territory is a distinct political unit that is wholly contained within Eden-Monaro, and as illustrated above, electoral boundaries in Australia do not cross state lines). In sf terminology this is an example of a `MULTIPOLYGON` geometry. In this section I'll talk about the structure of these objects and how to work with them.
+As this illustrates, Eden-Monaro is defined in terms of two distinct polygons, a large one on the Australian mainland and a small island. However, the large region has a hole in the middle (the hole exists because the Australian Capital Territory is a distinct political unit that is wholly contained within Eden-Monaro, and as illustrated above, electoral boundaries in Australia do not cross state lines). In sf terminology this is an example of a `MULTIPOLYGON` geometry. In this section we'll talk about the structure of these objects and how to work with them.

 First, let's use dplyr to grab only the geometry object:

@@ -283,7 +283,7 @@ ggplot(dawson[-69]) +

 A second way to supply geospatial information for mapping is to rely on **raster data**. Unlike the simple features format, in which geographical entities are specified in terms of a set of lines, points and polygons, rasters take the form of images. In the simplest case raster data might be nothing more than a bitmap file, but there are many different image formats out there. In the geospatial context specifically, there are image formats that include metadata (e.g., geodetic datum, coordinate reference system) that can be used to map the image information to the surface of the Earth. For example, one common format is GeoTIFF, which is a regular TIFF file with additional metadata supplied. Happily, most formats can be easily read into R with the assistance of GDAL (the Geospatial Data Abstraction Library, https://gdal.org/). For example the sf package contains a function `sf::gdal_read()` that provides access to the GDAL raster drivers from R. However, you rarely need to call this function directly, as there are other high level functions that take care of this for you.

-As an illustration, suppose we wish to plot satellite images made publicly available by the Australian Bureau of Meterorology (BOM) on their FTP server. The bomrang package [@bomrang] provides a convenient interface to the server, including a `get_available_imagery()` function that returns a vector of filenames and a `get_satellite_imagery()` function that downloads a file and imports it directly into R. For expository purposes, however, I'll use a more flexible method that could be adapted to any FTP server, and use the `download.file()` function:
+As an illustration, suppose we wish to plot satellite images made publicly available by the Australian Bureau of Meterorology (BOM) on their FTP server. The bomrang package [@bomrang] provides a convenient interface to the server, including a `get_available_imagery()` function that returns a vector of filenames and a `get_satellite_imagery()` function that downloads a file and imports it directly into R. For expository purposes, however, we'll use a more flexible method that could be adapted to any FTP server, and use the `download.file()` function:

 ```{r eval=FALSE}
 # list of all file names with time stamp 2020-01-07 21:00 GMT
@@ -314,7 +314,7 @@ img_vis <- file.path("raster", "IDE00422.202001072100.tif")
 img_inf <- file.path("raster", "IDE00421.202001072100.tif")
 ```

-To import the data in the img_visible file into R, I'll use the stars package [@stars] to import the data as stars objects:
+To import the data in the img_visible file into R, we'll use the stars package [@stars] to import the data as stars objects:

 ```{r}
 library(stars)

scales-guides.Rmd

Lines changed: 1 addition & 1 deletion
@@ -255,7 +255,7 @@ p1 <- base + scale_x_binned(breaks = seq(-50,50,10), limits = c(-50, 50))
 p2 <- base + scale_x_binned(breaks = seq(-50,50,10), limits = c(-50, 50), trans = "reverse")
 ```

-Binned scales can be transformed, much like continuous scales, but some care is required because the bins are constructed in the transformed space. In some cases this can produce undesirable outcomes. In the code below, I take a uniformly distributed variable and use `scale_x_binned()` and `geom_bar()` to construct a histogram of the logarithmically transformed data.
+Binned scales can be transformed, much like continuous scales, but some care is required because the bins are constructed in the transformed space. In some cases this can produce undesirable outcomes. In the code below, we take a uniformly distributed variable and use `scale_x_binned()` and `geom_bar()` to construct a histogram of the logarithmically transformed data.

 `r columns(1, 1/2, 1)`
 ```{r}

scales-position.Rmd

Lines changed: 1 addition & 1 deletion
@@ -99,7 +99,7 @@ base + coord_cartesian(ylim = c(10, 35)) # works as expected
 base + ylim(10, 35) # distorts the boxplot
 ```

-The only difference between the left and middle plots is that the latter is zoomed in. Some of the outlier points are not shown due to the restriction of the range, but the boxplots themselves remain identical. In contrast, in the plot on the right one of the boxplots has changed. When modifying the scale limits, all observations with highway mileage greater than 35 are converted to `NA` before the stat (in this case the boxplot) is computed. Because these "out of bounds" values are no longer available, the end result is that the sample median is shifted downward, which is almost never desirable behaviour. In hindsight, I regret this design choice as it is a common source of confusion for users. Unfortunately it would be very hard to change this default without breaking a lot of existing code.
+The only difference between the left and middle plots is that the latter is zoomed in. Some of the outlier points are not shown due to the restriction of the range, but the boxplots themselves remain identical. In contrast, in the plot on the right one of the boxplots has changed. When modifying the scale limits, all observations with highway mileage greater than 35 are converted to `NA` before the stat (in this case the boxplot) is computed. Because these "out of bounds" values are no longer available, the end result is that the sample median is shifted downward, which is almost never desirable behaviour. With the benefit of hindsight it's clear this wasn't a good design choice, because it is a common source of confusion for users. Unfortunately it would be very hard to change this default without breaking a lot of existing code.

 You can learn more about coordinate systems in Section \@ref(cartesian). To learn more about how "out of bounds" values are handled for continuous and binned scales, see Section \@ref(oob).
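
The effect described in this paragraph (out-of-range values become `NA` before the statistic is computed) can be mimicked in base R. The `hwy` vector below is a small made-up sample rather than the `mpg` data, and the `ifelse()` call only imitates the censoring step; it is not ggplot2's internal code.

```r
# Made-up highway mileage values, not the mpg dataset
hwy <- c(20, 24, 26, 28, 30, 36, 41, 44)
limits <- c(10, 35)

# Zooming (as coord_cartesian() does) leaves the data untouched,
# so the summary statistic is computed on every observation
median_zoomed <- median(hwy)

# Setting scale limits (as ylim() does) first turns out-of-bounds
# values into NA, and the statistic is computed on what remains
censored <- ifelse(hwy < limits[1] | hwy > limits[2], NA, hwy)
median_limited <- median(censored, na.rm = TRUE)

c(zoomed = median_zoomed, limited = median_limited)
```

In this toy sample the three values above 35 are dropped before the median is taken, so the limited median is lower than the zoomed one, mirroring the downward shift the paragraph describes.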
