# Reprojecting geographic data {#reproj-geo-data}
```{r, include=FALSE}
source("code/before_script.R")
```
## Prerequisites {-}
- This chapter requires the following packages:
```{r 06-reproj-1, message=FALSE, warning=FALSE}
library(sf)
library(terra)
library(dplyr)
library(spData)
library(spDataLarge)
```
## Introduction {#reproj-intro}
Section \@ref(crs-intro) introduced coordinate reference systems (CRSs), with a focus on the two major types: *geographic* ('lon/lat', with units in degrees longitude and latitude) and *projected* (typically with units of meters from a datum) coordinate systems.
This chapter builds on that knowledge and goes further.
It demonstrates how to set CRSs and how to *transform* geographic data from one CRS to another.
It also highlights specific issues that can arise from ignoring CRSs, especially if your data is stored with lon/lat coordinates.
\index{CRS!geographic}
\index{CRS!projected}
In many projects there is no need to worry about, let alone convert between, different CRSs.
Nonetheless, it is important to know if your data is in a projected or geographic CRS, and the consequences of this for geometry operations.
If you know this information, CRSs should *just work* behind the scenes: people often suddenly need to learn about CRSs when things go wrong.
Having a clearly defined CRS that all project data is in, and understanding how and why to use different CRSs, can ensure that things don't go wrong.
Furthermore, learning about coordinate systems will deepen your knowledge of geographic datasets and how to use them effectively.
This chapter teaches the fundamentals of CRSs, demonstrates the consequences of using different CRSs (including what can go wrong), and shows how to 'reproject' datasets from one coordinate system to another.
In the next section, we introduce CRSs in R, followed by Section \@ref(crs-setting) which shows how to get and set CRSs associated with spatial objects.
Section \@ref(geom-proj) demonstrates the importance of knowing what CRS your data is in with reference to a worked example of creating buffers.
We tackle questions of when to reproject and which CRS to use in Section \@ref(whenproject) and Section \@ref(which-crs), respectively.
Finally, we cover reprojecting vector and raster objects in Sections \@ref(reproj-vec-geom) and \@ref(reproj-ras) and modifying map projections in Section \@ref(mapproj).
## Coordinate reference systems {#crs-in-r}
\index{CRS!EPSG}
\index{CRS!WKT}
\index{CRS!proj-string}
Most modern geographic tools that require CRS conversions, including core R-spatial packages and desktop GIS software such as QGIS, interface with [PROJ](https://proj.org), an open source C++ library that "transforms coordinates from one coordinate reference system (CRS) to another".
CRSs can be described in many ways, including the following:
1. Simple yet potentially ambiguous statements such as "it's in lon/lat coordinates"
2. Formalized yet now outdated 'proj4 strings' (also known as 'proj-string') such as `+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs`
3. With an identifying 'authority:code' text string such as `EPSG:4326`
Each refers to the same thing: the 'WGS84' coordinate system that forms the basis of Global Positioning System (GPS) coordinates and many other datasets.
But which one is correct?
\index{CRS!EPSG}
The short answer is that the third way to identify CRSs is preferable: `EPSG:4326` is understood by **sf** (and by extension **stars**) and **terra** packages covered in this book, plus many other software projects for working with geographic data including [QGIS](https://docs.qgis.org/3.16/en/docs/user_manual/working_with_projections/working_with_projections.html) and [PROJ](https://proj.org/development/quickstart.html).
`EPSG:4326` is future-proof.
Furthermore, although it is machine readable, "EPSG:4326" is short, easy to remember and highly 'findable' online (searching for EPSG:4326 yields a dedicated page on the website [epsg.io](https://epsg.io/4326), for example).
The more concise identifier `4326` is understood by **sf**, but **we recommend the more explicit `AUTHORITY:CODE` representation to prevent ambiguity and to provide context**.
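As a quick check of the point above, the following minimal sketch compares the bare numeric identifier with the explicit `AUTHORITY:CODE` form.
It assumes only that **sf** is installed; the object names are illustrative and not used elsewhere in the chapter.

```r
library(sf)

# Two ways of naming the same CRS: a bare integer and an explicit AUTHORITY:CODE string
crs_from_number = st_crs(4326)
crs_from_string = st_crs("EPSG:4326")

# Both look-ups resolve to the same CRS definition
crs_from_number == crs_from_string
```

The comparison should return `TRUE`, but the string form makes the authority explicit to anyone reading the code.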
\index{CRS!WKT}
The longer answer is that none of the three descriptions are sufficient, and more detail is needed for unambiguous CRS handling and transformations: due to the complexity of CRSs, it is not possible to capture all relevant information about them in such short text strings.
For this reason, the Open Geospatial Consortium (OGC, which also developed the simple features specification that the **sf** package implements) developed an open standard format for describing CRSs that is called WKT (Well-Known Text).
This is detailed in a [100+ page document](https://portal.opengeospatial.org/files/18-010r7) that "defines the structure and content of a text string implementation of the abstract model for coordinate reference systems described in ISO 19111:2019" [@opengeospatialconsortium_wellknown_2019].
The WKT representation of the WGS84 CRS, which has the **identifier** `EPSG:4326` is as follows:
```{r}
st_crs("EPSG:4326")
```
\index{CRS!SRID}
The output of the command shows how the CRS identifier (also known as a Spatial Reference Identifier or [SRID](https://postgis.net/workshops/postgis-intro/projection.html)) works: it is simply a look-up, providing a unique identifier associated with a more complete WKT representation of the CRS.
This raises the question: what happens if there is a mismatch between the identifier and the longer WKT representation of a CRS?
On this point @opengeospatialconsortium_wellknown_2019 is clear: the verbose WKT representation takes precedence over the [identifier](https://docs.ogc.org/is/18-010r7/18-010r7.html#37):
> Should any attributes or values given in the cited identifier be in conflict with attributes or values given explicitly in the WKT description, the WKT values shall prevail.
\index{CRS!SRID}
The convention of referring to CRS identifiers in the form `AUTHORITY:CODE`, which is also used by geographic software written in other [languages](https://jorisvandenbossche.github.io/blog/2020/02/11/geopandas-pyproj-crs/), allows a wide range of formally defined coordinate systems to be referred to.^[
Several other ways of referring to unique CRSs can be used, with five identifier types (EPSG code, PostGIS SRID, INTERNAL SRID, proj-string, and WKT strings) accepted by [QGIS](https://docs.qgis.org/3.16/en/docs/pyqgis_developer_cookbook/crs.html?highlight=srid) and other identifier types such as a more verbose variant of the `EPSG:4326` identifier, `urn:ogc:def:crs:EPSG::4326` [@opengeospatialconsortium_wellknown_2019].
]
The most commonly used authority in CRS identifiers is *EPSG*\index{CRS!EPSG}, an acronym for the European Petroleum Survey Group which published a standardized list of CRSs (the EPSG was [taken over](http://wiki.gis.com/wiki/index.php/European_Petroleum_Survey_Group) by the [Geomatics Committee of the International Association of Oil & Gas Producers](https://epsg.org/home.html) in 2005).
Other authorities can be used in CRS identifiers.
`ESRI:54030`, for example, refers to ESRI's implementation of the Robinson projection, which has the following WKT string (only first eight lines shown):
```{r, out.lines=8}
st_crs("ESRI:54030")
```
```{r, eval=FALSE, echo=FALSE}
sf::st_crs("urn:ogc:def:crs:EPSG::4326")
```
\index{CRS!WKT}
WKT strings are exhaustive, detailed, and precise, allowing for unambiguous CRS storage and transformations.
They contain all relevant information about any given CRS, including its datum and ellipsoid, prime meridian, projection, and units.^[
Before the emergence of WKT CRS definitions, proj-string was the standard way to specify coordinate operations and store CRSs.
These string representations, built on a key=value form (e.g., `+proj=longlat +datum=WGS84 +no_defs`), have already been, or should in the future be, superseded by WKT representations in most cases.
]
\index{CRS!proj-string}
Recent PROJ versions (6+) still allow use of proj-strings to define coordinate operations, but some proj-string keys (`+nadgrids`, `+towgs84`, `+k`, `+init=epsg:`) are either no longer supported or are discouraged.
Additionally, only three datums (i.e., WGS84, NAD83, and NAD27) can be directly set in proj-string.
Longer explanations of the evolution of CRS definitions and the PROJ library can be found in @bivand_progress_2021, chapter 2 of @pebesma_spatial_2023, and a [blog post by Floris Vanderhaeghe, available at inbo.github.io/tutorials/tutorials/spatial_crs_coding/](https://inbo.github.io/tutorials/tutorials/spatial_crs_coding/).
Also, as outlined in the [PROJ documentation](https://proj.org/development/reference/cpp/cpp_general.html) there are different versions of the WKT CRS format including WKT1 and two variants of WKT2, the latter of which (WKT2, 2018 specification) corresponds to the ISO 19111:2019 [@opengeospatialconsortium_wellknown_2019].
## Querying and setting coordinate systems {#crs-setting}
\index{vector!CRS}
Let's look at how CRSs are stored in R spatial objects and how they can be queried and set.
First, we will look at getting and setting CRSs in **vector** geographic data objects, starting with the following example:
```{r 02-spatial-data-52, message=FALSE, results='hide'}
vector_filepath = system.file("shapes/world.gpkg", package = "spData")
new_vector = read_sf(vector_filepath)
```
Our new object, `new_vector`, is a data frame of class `sf` that represents countries worldwide (see the help page `?spData::world` for details).
The CRS can be retrieved with the **sf** function `st_crs()`.
```{r 02-spatial-data-53, eval=FALSE}
st_crs(new_vector) # get CRS
#> Coordinate Reference System:
#> User input: WGS 84
#> wkt:
#> ...
```
```{r, echo=FALSE, eval=FALSE}
# Aim: capture crs for comparison with updated CRS
new_vector_crs = st_crs(new_vector)
```
\index{vector!CRS}
The output is a list containing two main components:
1. `User input` (in this case `WGS 84`, a synonym for `EPSG:4326` which in this case was taken from the input file), corresponding to CRS identifiers described above
2. `wkt`, containing the full WKT string with all relevant information about the CRS
The `input` element is flexible, and depending on the input file or user input, can contain the `AUTHORITY:CODE` representation (e.g., `EPSG:4326`), the CRS's name (e.g., `WGS 84`), or even the proj-string definition.
The `wkt` element stores the WKT representation, which is used when saving the object to a file or doing any coordinate operations.
Above, we can see that the `new_vector` object has the WGS84 ellipsoid, uses the Greenwich prime meridian, and has the latitude and longitude axis order.
In this case, we also have some additional elements, such as `USAGE` explaining the area suitable for the use of this CRS, and `ID` pointing to the CRS's identifier: `EPSG:4326`.
\index{vector!CRS}
The `st_crs()` function has another helpful feature: it can retrieve additional information about the CRS in use.
For example, try to run:
- `st_crs(new_vector)$IsGeographic` to check if the CRS is geographic or not
- `st_crs(new_vector)$units_gdal` to find out the CRS units
- `st_crs(new_vector)$srid` to extract its 'SRID' identifier (when available)
- `st_crs(new_vector)$proj4string` to extract the proj-string representation
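The accessors listed above can be tried without loading any data, as in the minimal sketch below.
It queries CRS objects created directly from identifiers (the names `crs_wgs84` and `crs_bng` are illustrative); the same `$` accessors work on `st_crs(new_vector)`.

```r
library(sf)

# A geographic CRS created from its identifier
crs_wgs84 = st_crs("EPSG:4326")
crs_wgs84$IsGeographic # is the CRS geographic?
crs_wgs84$units_gdal   # units of the coordinates
crs_wgs84$srid         # identifier, when available
crs_wgs84$proj4string  # legacy proj-string representation

# For comparison, a projected CRS (the British National Grid)
crs_bng = st_crs("EPSG:27700")
crs_bng$IsGeographic
crs_bng$units_gdal
```

Comparing the two sets of results is a quick way to see whether a dataset is stored in degrees or in linear units before running distance-based operations.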
In cases when a CRS is missing or the wrong CRS is set, the `st_set_crs()` function can be used (in this case the WKT string remains unchanged because the CRS was already set correctly when the file was read-in):
```{r 02-spatial-data-54}
new_vector = st_set_crs(new_vector, "EPSG:4326") # set CRS
```
```{r, echo=FALSE, eval=FALSE}
waldo::compare(new_vector_crs, st_crs(new_vector))
# `old$input`: "WGS 84"
# `new$input`: "EPSG:4326"
```
\index{raster!CRS}
Getting and setting CRSs works in a similar way for raster geographic data objects.
The `crs()` function in the **terra** package accesses CRS information from a `SpatRaster` object (note the use of the `cat()` function to print it nicely).
```{r 02-spatial-data-55, out.lines=6}
raster_filepath = system.file("raster/srtm.tif", package = "spDataLarge")
my_rast = rast(raster_filepath)
cat(crs(my_rast)) # get CRS
```
The output is the WKT string representation of the CRS.
The same function, `crs()`, can be also used to set a CRS for raster objects.
```{r 02-spatial-data-56}
crs(my_rast) = "EPSG:26912" # set CRS
```
Here, we can use either the identifier (recommended in most cases) or complete WKT string representation.
Alternative methods to set `crs` include proj-strings or CRSs extracted from other existing objects with `crs()`, although these approaches may be less future-proof.
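When the full WKT string is more than you need, `crs()` can also return shorter summaries.
The sketch below is a minimal illustration using a small in-memory raster (hypothetical example data, not one of the book's datasets) rather than the `srtm.tif` file.

```r
library(terra)

# A small raster created in memory, with its CRS set from an identifier
r = rast(nrows = 10, ncols = 10, xmin = 0, xmax = 10, ymin = 40, ymax = 50,
         crs = "EPSG:4326")

crs(r, describe = TRUE) # short summary: name, authority, code, ...
crs(r, proj = TRUE)     # legacy proj-string form
is.lonlat(r)            # is the CRS geographic?
```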
Importantly, the `st_crs()` and `crs()` functions do not alter coordinates' values or geometries.
Their role is only to set metadata about the object's CRS.
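The difference between *setting* and *transforming* a CRS can be seen in the minimal sketch below.
It uses a single illustrative point (the names `pt`, `pt_geo` and `pt_bng` are not used elsewhere in the chapter) and `st_transform()`, which is covered in detail later in this chapter.

```r
library(sf)

# A point with no CRS, using the lon/lat values of central London
pt = st_sfc(st_point(c(-0.1, 51.5)))

# Setting the CRS only attaches metadata: the coordinates stay the same
pt_geo = st_set_crs(pt, "EPSG:4326")
all.equal(st_coordinates(pt), st_coordinates(pt_geo))

# Transforming recalculates the coordinates in the new CRS
pt_bng = st_transform(pt_geo, "EPSG:27700")
st_coordinates(pt_bng)
```

After the transformation, the coordinates are eastings and northings in meters rather than degrees.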
In some cases the CRS of a geographic object is unknown, as is the case in the `london` dataset created in the code chunk below, building on the example of London introduced in Section \@ref(vector-data):
```{r 06-reproj-2}
london = data.frame(lon = -0.1, lat = 51.5) |>
st_as_sf(coords = c("lon", "lat"))
st_is_longlat(london)
```
The output `NA` shows that **sf** does not know what the CRS is and is unwilling to guess (`NA` literally means 'not available').
Unless a CRS is manually specified or is loaded from a source that has CRS metadata, **sf** does not make any explicit assumptions about which coordinate system is in use, other than to say "I don't know".
This behavior makes sense given the diversity of available CRSs but differs from some approaches, such as the GeoJSON file format specification, which makes the simplifying assumption that all coordinates have a lon/lat CRS: `EPSG:4326`.
Datasets without a specified CRS can cause problems: all geographic coordinates have a coordinate reference system, and software can only make good decisions around plotting and geometry operations if it knows what type of CRS it is working with.
Thus, again, it is important to always check the CRS of a dataset and to set it if it is missing.
```{r 06-reproj-3}
london_geo = st_set_crs(london, "EPSG:4326")
st_is_longlat(london_geo)
```
## Geometry operations on projected and unprojected data {#geom-proj}
Since **sf** version 1.0.0, R's ability to work with geographic vector datasets that have lon/lat CRSs has improved substantially, thanks to its integration with the S2 *spherical geometry engine* introduced in Section \@ref(s2).
As shown in Figure \@ref(fig:s2geos), **sf** uses either GEOS\index{GEOS} or S2\index{S2} depending on the type of CRS and whether S2 has been disabled (it is enabled by default).^[The `st_area()` function is an exception, as it uses **lwgeom**'s `st_geod_area()` function to calculate areas for data with geographic CRSs when `sf_use_s2()` is disabled.]
GEOS is always used for projected data and data with no CRS; for geographic data, S2 is used by default but can be disabled with `sf::sf_use_s2(FALSE)`.
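Because `sf_use_s2()` changes a session-wide setting, it can be worth restoring the previous state after switching it off.
The following sketch shows one way to do that; the variable name `old_setting` is illustrative.

```r
library(sf)

sf_use_s2()                    # query the current setting
old_setting = sf_use_s2(FALSE) # switch S2 off; the previous setting is returned invisibly
sf_use_s2()                    # now FALSE: GEOS will be used for lon/lat data
sf_use_s2(old_setting)         # restore the previous setting
```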
```{r s2geos, fig.cap="Behavior of the geometry operations in the sf package depending on the input data's CRS.", echo=FALSE}
'digraph G3 {
layout=dot
rankdir=TB
node [shape = rectangle];
rec1 [label = "Spatial data" shape = oval];
rec2 [label = "Geographic CRS\n " shape = cds];
rec3 [label = "Projected CRS\nor CRS is missing" shape = cds]
rec4 [label = "S2 enabled\n(default)" shape = diamond]
rec5 [label = "S2 disabled\n " shape = diamond]
rec6 [label = "sf uses s2library for \ngeometry operations" center = true];
rec7 [label = "sf uses GEOS for \ngeometry operations" center = true];
rec8 [label = "Result" shape = oval weight=100];
rec9 [label = "Result" shape = oval weight=100];
rec1 -> rec2;
rec1 -> rec3;
rec2 -> rec4;
rec2 -> rec5;
rec3 -> rec7;
rec4 -> rec6;
rec5 -> rec7;
rec6 -> rec8;
rec7 -> rec9;
}' -> s2geos
# # exported manually; the code below returns a low res version of png
# tmp = DiagrammeR::grViz(s2geos)
# htmlwidgets::saveWidget(widget = tmp, file = "images/07-s2geos.html")
# # tmp
# tmp = DiagrammeRsvg::export_svg(tmp)
# library(htmltools)
# html_print(HTML(tmp))
# tmp = charToRaw(tmp)
# # rsvg::rsvg_png(tmp, "images/07-s2geos.png")
# webshot::webshot(url = "images/07-s2geos.html", file = "images/07-s2geos.png", vwidth = 800, vheight = 600)
# download.file(
# "https://user-images.githubusercontent.com/1825120/188572856-7946ae32-98de-444c-9f48-b1d7afcf9345.png",
# destfile = "images/07-s2geos.png"
# )
# browseURL("images/07-s2geos.png")
knitr::include_graphics("images/07-s2geos.png")
```
To demonstrate the importance of CRSs, we will create a buffer of 100 km around the `london` object from the previous section.
We will also create a deliberately faulty buffer with a 'distance' of 1 degree, which is roughly equivalent to 100 km (1 degree is about 111 km at the equator).
Before diving into the code, it may be worth skipping briefly ahead to peek at Figure \@ref(fig:crs-buf) to get a visual handle on the outputs that you should be able to reproduce by following the code chunks below.
The first stage is to create three buffers around the `london` and `london_geo` objects created above with boundary distances of 1 degree and 100 km (or 100,000 m, which can be expressed as `1e5` in scientific notation) from central London:
```{r 06-reproj-4-1}
london_buff_no_crs = st_buffer(london, dist = 1) # incorrect: no CRS
london_buff_s2 = st_buffer(london_geo, dist = 100000) # silent use of s2
london_buff_s2_100_cells = st_buffer(london_geo, dist = 100000, max_cells = 100)
```
In the first line above, **sf** assumes that the input is projected and generates a result that has a buffer in units of degrees, which is problematic, as we will see.
In the second line, **sf** silently uses the spherical geometry engine S2, introduced in Chapter \@ref(spatial-class), to calculate the extent of the buffer using the default value of `max_cells = 1000` (set to `100` in line three), the consequences of which will become apparent shortly.
To highlight the impact of **sf**'s use of the S2\index{S2} geometry engine for unprojected (geographic) coordinate systems, we will temporarily disable it with the command `sf_use_s2()` (which is on, `TRUE`, by default), in the code chunk below.
Like `london_buff_no_crs`, the new `london_buff_lonlat` object is a geographic abomination: it has units of degrees, which makes no sense in the vast majority of cases:
```{r 06-reproj-4-2}
sf::sf_use_s2(FALSE)
london_buff_lonlat = st_buffer(london_geo, dist = 1) # incorrect result
sf::sf_use_s2(TRUE)
```
The warning message above hints at issues with performing planar geometry operations on lon/lat data.
When spherical geometry operations are turned off, with the command `sf::sf_use_s2(FALSE)`, buffers (and other geometric operations) may result in worthless outputs because they use units of latitude and longitude, a poor substitute for proper units of distances such as meters.
```{block2 06-reproj-5, type="rmdnote"}
The distance between two lines of longitude, called meridians\index{meridians}, is around 111 km at the equator (execute `geosphere::distGeo(c(0, 0), c(1, 0))` to find the precise distance).
This shrinks to zero at the poles.
At the latitude of London, for example, meridians are less than 70 km apart (challenge: execute code that verifies this).
<!-- `geosphere::distGeo(c(0, 51.5), c(1, 51.5))` -->
Lines of latitude, by contrast, are equidistant from each other irrespective of latitude: they are always around 111 km apart, including at the equator and near the poles (see Figures \@ref(fig:crs-buf) to \@ref(fig:wintriproj)).
```
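One way to check the figures in the note above is sketched below with **sf** instead of **geosphere**.
It assumes that S2 is enabled (the default), in which case `st_distance()` returns great-circle distances in meters; the object names are illustrative.

```r
library(sf)

# Pairs of points one degree of longitude apart, at the equator and at London's latitude
equator = st_sfc(st_point(c(0, 0)), st_point(c(1, 0)), crs = "EPSG:4326")
london_lat = st_sfc(st_point(c(0, 51.5)), st_point(c(1, 51.5)), crs = "EPSG:4326")

# Distances between the first and second point of each pair
d_equator = st_distance(equator[1], equator[2])
d_london = st_distance(london_lat[1], london_lat[2])

as.numeric(d_equator) / 1000 # km, around 111
as.numeric(d_london) / 1000  # km, below 70
```

The second value is close to the first multiplied by the cosine of the latitude, which is why the spacing between meridians shrinks towards the poles.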
Do not interpret the warning about the geographic (`longitude/latitude`) CRS as "the CRS should not be set": it almost always should be!
It is better understood as a suggestion to *reproject* the data onto a projected CRS.
This suggestion does not always need to be heeded: performing spatial and geometric operations makes little or no difference in some cases (e.g., spatial subsetting).
But for operations involving distances such as buffering, the only way to ensure a good result (without using spherical geometry engines) is to create a projected copy of the data and run the operation on that.
This is done in the code chunk below.
```{r 06-reproj-6}
london_proj = data.frame(x = 530000, y = 180000) |>
st_as_sf(coords = c("x", "y"), crs = "EPSG:27700")
```
The result is a new object that represents the same location as `london`, but created using a suitable CRS (the British National Grid, which has the EPSG code 27700) that has units of meters.
We can verify that the CRS has changed using `st_crs()` as follows (some of the output has been replaced by `...,`):
```{r 06-reproj-7, out.lines=8}
st_crs(london_proj)
```
Notable components of this CRS description include the EPSG code (`EPSG: 27700`) and the detailed `wkt` string (only the first five lines of which are shown).^[
For a short description of the most relevant projection parameters and related concepts, see the fourth lecture by Jochen Albrecht hosted at
http://www.geography.hunter.cuny.edu/~jochen/GTECH361/lectures/ and information at https://proj.org/usage/projections.html.
]
The fact that the units of the CRS, described in the LENGTHUNIT field, are meters (rather than degrees) tells us that this is a projected CRS: `st_is_longlat(london_proj)` now returns `FALSE` and geometry operations on `london_proj` will work without a warning.
Buffer operations on the `london_proj` will use GEOS, and results will be returned with proper units of distance.
The following line of code creates a buffer around *projected* data of exactly 100 km:
```{r 06-reproj-8}
london_buff_projected = st_buffer(london_proj, 100000)
```
The geometries of the three `london_buff*` objects created in the preceding code that *have* a specified CRS (`london_buff_s2`, `london_buff_lonlat` and `london_buff_projected`) are illustrated in Figure \@ref(fig:crs-buf).
```{r crs-buf-old, include=FALSE, eval=FALSE}
#| message: FALSE
#| warning: FALSE
#| results: hide
uk = rnaturalearth::ne_countries(scale = 50, returnclass = "sf") |>
filter(grepl(pattern = "United Kingdom|Ire", x = name_long))
plot(london_buff_s2, graticule = st_crs(4326), axes = TRUE, reset = FALSE, lwd = 2)
plot(london_buff_s2_100_cells, lwd = 9, add = TRUE)
plot(st_geometry(uk), add = TRUE, border = "gray", lwd = 3)
uk_proj = uk |>
st_transform("EPSG:27700")
plot(london_buff_projected, graticule = st_crs("EPSG:27700"), axes = TRUE, reset = FALSE, lwd = 2)
plot(london_proj, add = TRUE)
plot(st_geometry(uk_proj), add = TRUE, border = "gray", lwd = 3)
plot(london_buff_lonlat, graticule = st_crs("EPSG:27700"), axes = TRUE, reset = FALSE, lwd = 2)
plot(london_proj, add = TRUE)
plot(st_geometry(uk), add = TRUE, border = "gray", lwd = 3)
```
```{r crs-buf, fig.cap="Buffers around London showing results created with the S2 spherical geometry engine on lon/lat data (left), projected data (middle) and lon/lat data without using spherical geometry (right). The left plot illustrates the result of buffering unprojected data with sf, which calls Google's S2 spherical geometry engine by default with max cells set to 1000 (thin line). The thick, blocky line illustrates the result of the same operation with max cells set to 100.", fig.scap="Buffers around London with a geographic and projected CRS.", echo=FALSE, fig.asp=0.39, fig.width = 8, warning = FALSE, message=FALSE}
uk = rnaturalearth::ne_countries(scale = 50, returnclass = "sf") |>
filter(grepl(pattern = "United Kingdom|Ire", x = name_long))
library(tmap)
tm1 = tm_shape(london_buff_s2, bbox = st_bbox(london_buff_s2_100_cells)) +
tm_graticules(lwd = 0.2) +
tm_borders(col = "black", lwd = 0.5) +
tm_shape(london_buff_s2_100_cells) +
tm_borders(col = "black", lwd = 1.5) +
tm_shape(uk) +
tm_polygons(lty = 3, fill_alpha = 0.2, fill = "#567D46") +
tm_shape(london_proj) +
tm_symbols()
tm2 = tm_shape(london_buff_projected, bbox = st_bbox(london_buff_s2_100_cells)) +
tm_grid(lwd = 0.2) +
tm_borders(col = "black", lwd = 0.5) +
tm_shape(uk) +
tm_polygons(lty = 3, fill_alpha = 0.2, fill = "#567D46") +
tm_shape(london_proj) +
tm_symbols()
tm3 = tm_shape(london_buff_lonlat, bbox = st_bbox(london_buff_s2_100_cells)) +
tm_graticules(lwd = 0.2) +
tm_borders(col = "black", lwd = 0.5) +
tm_shape(uk) +
tm_polygons(lty = 3, fill_alpha = 0.2, fill = "#567D46") +
tm_shape(london_proj) +
tm_symbols()
tmap_arrange(tm1, tm2, tm3, nrow = 1)
```
It is clear from Figure \@ref(fig:crs-buf) that buffers based on `s2` and properly projected CRSs are not 'squashed', meaning that every part of the buffer boundary is equidistant from London.
The results that are generated from lon/lat CRSs when `s2` is *not* used, either because the input lacks a CRS or because `sf_use_s2()` is turned off, are heavily distorted, with the result elongated in the north-south axis, highlighting the dangers of using algorithms that assume projected data on lon/lat inputs (as GEOS does).
The results generated using S2\index{S2} are also distorted, however, although less dramatically.
Both buffer boundaries in Figure \@ref(fig:crs-buf) (left) are jagged, although this may only be apparent or relevant for the thick boundary representing a buffer created with the `s2` argument `max_cells` set to 100.
The lesson is that results obtained from lon/lat data via S2 will be different from results obtained from using projected data.
The difference between S2\index{S2} derived buffers and GEOS\index{GEOS} derived buffers on projected data decreases as the value of `max_cells` increases: the 'right' value for this argument may depend on many factors, and the default value of 1000 is often reasonable.
When choosing `max_cells` values, speed of computation should be balanced against resolution of results.
In situations where smooth curved boundaries are advantageous, transforming to a projected CRS before buffering (or performing other geometry operations) may be appropriate.
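To get a rough numeric feel for these differences, the sketch below compares the areas of buffers created with the different approaches against the area of an ideal circle with a 100 km radius.
It recreates the London points as standalone `sfc` objects with illustrative names, so it can be run independently of the earlier chunks; the exact S2 results may vary between package versions.

```r
library(sf)

# Central London in lon/lat and in the British National Grid
pt_geo = st_sfc(st_point(c(-0.1, 51.5)), crs = "EPSG:4326")
pt_proj = st_sfc(st_point(c(530000, 180000)), crs = "EPSG:27700")

# Buffers of 100 km created with S2 (two resolutions) and with GEOS on projected data
buff_s2_100 = st_buffer(pt_geo, dist = 100000, max_cells = 100)
buff_s2_1000 = st_buffer(pt_geo, dist = 100000, max_cells = 1000)
buff_proj = st_buffer(pt_proj, dist = 100000)

# Areas in square kilometers, alongside the area of an ideal circle
areas_km2 = c(
  s2_100_cells = as.numeric(st_area(buff_s2_100)),
  s2_1000_cells = as.numeric(st_area(buff_s2_1000)),
  projected = as.numeric(st_area(buff_proj)),
  ideal_circle = pi * 100000^2
) / 1e6
round(areas_km2)
```

Counting vertices, for example with `nrow(st_coordinates(buff_s2_100))`, is another simple way to compare the resolution of the results.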
The importance of CRSs (primarily whether they are projected or geographic) and the impacts of **sf**'s default setting to use S2 for buffers on lon/lat data are clear from the example above.
The subsequent sections go into more depth, exploring which CRS to use when projected CRSs *are* needed and the details of reprojecting vector and raster objects.
## When to reproject? {#whenproject}
\index{CRS!reprojection}
The previous section showed how to set the CRS manually, with `st_set_crs(london, "EPSG:4326")`.
In real-world applications, however, CRSs are usually set automatically when data is read-in.
In many projects the main CRS-related task is to *transform* objects, from one CRS into another.
But when should data be transformed?
And into which CRS?
There are no clear-cut answers to these questions and CRS selection always involves trade-offs [@maling_coordinate_1992].
However, there are some general principles provided in this section that can help you decide.
First it's worth considering *when to transform*.
In some cases transformation to a geographic CRS is essential, such as when publishing data online with the **leaflet** package.
Another case is when two objects with different CRSs must be compared or combined, as shown when we try to find the distance between two `sf` objects with different CRSs:
```{r 06-reproj-9, eval=FALSE}
st_distance(london_geo, london_proj)
# > Error: st_crs(x) == st_crs(y) is not TRUE
```
To make the `london_geo` and `london_proj` objects geographically comparable, one of them must be transformed into the CRS of the other.
But which CRS to use?
The answer depends on context: many projects, especially those involving web mapping, require outputs in EPSG:4326, in which case it is worth transforming the projected object.
If, however, the project requires planar geometry operations rather than spherical geometry operations (e.g., to create buffers with smooth edges), it may be worth transforming data with a geographic CRS into an equivalent object with a projected CRS, such as the British National Grid (EPSG:27700).
That is the subject of Section \@ref(reproj-vec-geom).
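As a brief preview, the two options can be sketched as follows.
The points are recreated here as standalone `sfc` objects with illustrative names (`pt_geo` and `pt_proj`), mirroring the coordinates of `london_geo` and `london_proj`.

```r
library(sf)

pt_geo = st_sfc(st_point(c(-0.1, 51.5)), crs = "EPSG:4326")
pt_proj = st_sfc(st_point(c(530000, 180000)), crs = "EPSG:27700")

# Option 1: transform the lon/lat object into the projected CRS
pt_geo_bng = st_transform(pt_geo, st_crs(pt_proj))
dist_projected = st_distance(pt_geo_bng, pt_proj)

# Option 2: transform the projected object into lon/lat
pt_proj_geo = st_transform(pt_proj, "EPSG:4326")
dist_geographic = st_distance(pt_geo, pt_proj_geo)

dist_projected
dist_geographic
```

Both calls now run without error and return a distance of a few kilometers, because the two points represent slightly different locations in central London.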
## Which CRS to use? {#which-crs}
\index{CRS!reprojection}
\index{projection!World Geodetic System}
The question of *which CRS to use* is tricky, and there is rarely a 'right' answer:
"There exist no all-purpose projections, all involve distortion when far from the center of the specified frame" [@bivand_applied_2013].
Additionally, you should not be attached just to one projection for every task.
It is possible to use one projection for some part of the analysis, another projection for a different part, and even some other for visualization.
Always try to pick the CRS that serves your goal best!
When selecting **geographic CRSs**\index{CRS!geographic}, the answer is often [WGS84](https://en.wikipedia.org/wiki/World_Geodetic_System#A_new_World_Geodetic_System:_WGS_84).
It is used not only for web mapping, but also because GPS datasets and thousands of raster and vector datasets are provided in this CRS by default.
WGS84 is the most common CRS in the world, so it is worth knowing its EPSG code: 4326.^[
Instead of `"EPSG:4326"`, you may also use `"OGC:CRS84"`. The former assumes that latitude is always ordered before longitude, while the latter is the standard representation used by GeoJSON, with coordinates ordered longitude before latitude.]
This 'magic number' can be used to convert objects with unusual projected CRSs into something that is widely understood.
What about when a **projected CRS**\index{CRS!projected} is required?
In some cases, it is not something that we are free to decide:
"often the choice of projection is made by a public mapping agency" [@bivand_applied_2013].
This means that when working with local data sources, it is likely preferable to work with the CRS in which the data was provided, to ensure compatibility, even if the official CRS is not the most accurate.
The example of London was easy to answer because (a) the British National Grid (with its associated EPSG code 27700) is well known and (b) the original dataset (`london`) already had that CRS.
\index{UTM}
A commonly used default is Universal Transverse Mercator ([UTM](https://en.wikipedia.org/wiki/Universal_Transverse_Mercator_coordinate_system)), a set of CRSs that divides the Earth into 60 longitudinal wedges and 20 latitudinal segments.
Almost every place on Earth has a UTM code, such as "60H" which refers to northern New Zealand where R was invented.
UTM EPSG codes run sequentially from 32601 to 32660 for northern hemisphere locations and from 32701 to 32760 for southern hemisphere locations.
To show how the system works, let's create a function, `lonlat2UTM()`, to calculate the EPSG code associated with any point on the planet as [follows](https://stackoverflow.com/a/9188972/):
```{r 06-reproj-13}
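lonlat2UTM = function(lonlat) {
  # UTM zones are 6 degrees wide and numbered 1 to 60, eastwards from 180 degrees west
  utm = (floor((lonlat[1] + 180) / 6) %% 60) + 1
  # EPSG codes: 326xx for the northern hemisphere, 327xx for the southern hemisphere
  if (lonlat[2] > 0) {
    utm + 32600
  } else {
    utm + 32700
  }
}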
```
The following command uses this function to identify the UTM zone and associated EPSG code for Auckland and London:
```{r 06-reproj-14, echo=FALSE, eval=FALSE}
stplanr::geo_code("Auckland")
```
```{r 06-reproj-15}
lonlat2UTM(c(174.7, -36.9))
lonlat2UTM(st_coordinates(london))
```
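The arithmetic behind these results can be followed by hand.
The sketch below is a vectorized variant of the same calculation, written in base R only; `lonlat2utm_vec()` is a hypothetical name introduced here for illustration.

```{r, eval=FALSE}
# Vectorized variant of lonlat2UTM(); the function name is hypothetical
lonlat2utm_vec = function(lon, lat) {
  zone = (floor((lon + 180) / 6) %% 60) + 1 # zone number, 1 to 60
  ifelse(lat > 0, 32600, 32700) + zone      # 326xx = north, 327xx = south
}

# Auckland: floor((174.7 + 180) / 6) = 59, so zone 60 in the southern hemisphere
# London:   floor((-0.1 + 180) / 6) = 29, so zone 30 in the northern hemisphere
lonlat2utm_vec(lon = c(174.7, -0.1), lat = c(-36.9, 51.5))
#> [1] 32760 32630
```

Like `lonlat2UTM()`, this simple rule ignores the irregular zone boundaries used around Norway and Svalbard.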
The transverse Mercator projection used by UTM CRSs is conformal but distorts areas and distances with increasing severity with distance from the center of the UTM zone.
Documentation from the GIS software Manifold therefore suggests restricting the longitudinal extent of projects using UTM zones to 6 degrees from the central meridian ([manifold.net](https://manifold.net/doc/mfd9/universal_transverse_mercator_projection.htm)).
Therefore, we recommend using UTM only when your focus is on preserving angles for a relatively small area!
Tools are also available to help select a suitable CRS, including the **crsuggest** package [@R-crsuggest].
The main function in this package, `suggest_crs()`, takes a spatial object with geographic CRS and returns a list of possible projected CRSs that could be used for the given area.^[The package can also help to guess the true CRS of data that has no CRS information attached.]
Another helpful tool is the webpage https://jjimenezshaw.github.io/crs-explorer/ that lists CRSs based on selected location and type.
Important note: while these tools are helpful in many situations, you need to be aware of the properties of the recommended CRS before you apply it.
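A minimal sketch of how `suggest_crs()` might be called is shown below.
It assumes that the **crsuggest** package is installed, uses a made-up bounding box around London as the study area, and assumes that the `limit` argument caps the number of candidates returned.

```{r, eval=FALSE}
library(sf)
library(crsuggest) # assumed to be installed

# Made-up study area: a bounding box around London in a geographic CRS
bb = st_bbox(c(xmin = -0.5, ymin = 51.3, xmax = 0.3, ymax = 51.7),
             crs = st_crs("EPSG:4326"))
study_area = st_sf(geometry = st_as_sfc(bb))

# Ask for a handful of candidate projected CRSs for this area
candidates = suggest_crs(study_area, limit = 5)
candidates
```

The candidates are suggestions only: each should be checked against the properties that matter for the analysis before it is passed to `st_transform()`.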
\index{CRS!custom}
In cases where an appropriate CRS is not immediately clear, the choice of CRS should depend on the properties that are most important to preserve in the subsequent maps and analysis.
CRSs are either equal-area, equidistant, conformal (with shapes remaining unchanged), or some compromise between these properties (Section \@ref(projected-coordinate-reference-systems)).
Custom CRSs with local parameters can be created for a region of interest and multiple CRSs can be used in projects when no single CRS suits all tasks.
'Geodesic calculations' can provide a fall-back if no CRSs are appropriate (see [proj.org/geodesic.html](https://proj.org/geodesic.html)).
Regardless of the projected CRS used, the results may not be accurate for geometries covering hundreds of kilometers.
\index{CRS!custom}
When deciding on a custom CRS, we recommend the following:^[
Many thanks to an anonymous reviewer whose comments formed the basis of this advice.
]
\index{projection!Lambert azimuthal equal-area}
\index{projection!Azimuthal equidistant}
\index{projection!Lambert conformal conic}
\index{projection!Stereographic}
\index{projection!Universal Transverse Mercator}
- A Lambert azimuthal equal-area ([LAEA](https://en.wikipedia.org/wiki/Lambert_azimuthal_equal-area_projection)) projection for a custom local projection (set latitude and longitude of origin to the center of the study area), which is an equal-area projection at all locations but distorts shapes beyond thousands of kilometers
- Azimuthal equidistant ([AEQD](https://en.wikipedia.org/wiki/Azimuthal_equidistant_projection)) projections for accurate straight-line distances between any point and the center point of the local projection
- Lambert conformal conic ([LCC](https://en.wikipedia.org/wiki/Lambert_conformal_conic_projection)) projections for regions covering thousands of kilometers, with the cone set to keep distance and area properties reasonable between the secant lines
- Stereographic ([STERE](https://en.wikipedia.org/wiki/Stereographic_projection)) projections for polar regions, but taking care not to rely on area and distance calculations thousands of kilometers from the center
One possible approach to automatically select a projected CRS specific to a local dataset is to create an [AEQD](https://en.wikipedia.org/wiki/Azimuthal_equidistant_projection) projection for the center-point of the study area.
This involves creating a custom CRS (with no EPSG code) with units of meters based on the center point of a dataset.
Note that this approach should be used with caution: no other datasets will be compatible with the custom CRS created, and results may not be accurate when used on extensive datasets covering hundreds of kilometers.
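The approach just described can be sketched as follows.
This is an illustration under stated assumptions rather than a recommended recipe: the three points are made up, the center is taken to be the middle of the bounding box (a centroid would work equally well), and the CRS is written as a PROJ string for brevity.

```{r, eval=FALSE}
library(sf)

# Made-up points in a geographic CRS
pts = st_sfc(
  st_point(c(-113.2, 37.1)),
  st_point(c(-112.9, 37.5)),
  st_point(c(-113.0, 37.3)),
  crs = "EPSG:4326"
)

# Center of the bounding box, in degrees
bb = st_bbox(pts)
lon_0 = (bb[["xmin"]] + bb[["xmax"]]) / 2
lat_0 = (bb[["ymin"]] + bb[["ymax"]]) / 2

# Custom AEQD definition (no EPSG code) with units of meters
aeqd = sprintf("+proj=aeqd +lat_0=%.4f +lon_0=%.4f +datum=WGS84 +units=m",
               lat_0, lon_0)
pts_aeqd = st_transform(pts, aeqd)

# The center of the study area maps to the origin of the new CRS
center = st_sfc(st_point(c(lon_0, lat_0)), crs = "EPSG:4326")
center_xy = st_coordinates(st_transform(center, aeqd))
center_xy
```

Because the definition is generated from the data, every dataset processed this way ends up in its own CRS, which is the compatibility cost mentioned above.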
The principles outlined in this section apply equally to vector and raster datasets.
Some features of CRS transformation, however, are unique to each geographic data model.
We will cover the particularities of vector data transformation in Section \@ref(reproj-vec-geom) and those of raster transformation in Section \@ref(reproj-ras).
Finally, Section \@ref(mapproj) shows how to create custom map projections.
## Reprojecting vector geometries {#reproj-vec-geom}
\index{CRS!reprojection}
\index{vector!reprojection}
Chapter \@ref(spatial-class) demonstrated how vector geometries are made up of points, and how points form the basis of more complex objects such as lines and polygons.
Reprojecting vectors thus consists of transforming the coordinates of these points, which form the vertices of lines and polygons.
Section \@ref(whenproject) contains an example in which at least one `sf` object must be transformed into an equivalent object with a different CRS to calculate the distance between two objects.
```{r 06-reproj-10}
london2 = st_transform(london_geo, "EPSG:27700")
```
Now that a transformed version of `london` has been created, using the **sf** function `st_transform()`, the distance between the two representations of London can be found.^[
An alternative to `st_transform()` is `st_transform_proj()` from the **lwgeom** package, which bypasses GDAL and can support projections not supported by GDAL.
However, at the time of writing (2024) we could not find any projections supported by `st_transform_proj()` but not supported by `st_transform()`.
]
It may come as a surprise that `london2` and `london_proj` are over 2 km apart!^[
The difference in location between the two points is not due to imperfections in the transforming operation (which is in fact very accurate) but the low precision of the manually-created coordinates that created `london` and `london_proj`.
Also surprising may be that the result is provided in a matrix with units of meters.
This is because `st_distance()` can provide distances between many features and because the CRS has units of meters.
Use `as.numeric()` to coerce the result into a regular number.
]
```{r 06-reproj-11}
st_distance(london2, london_proj)
```
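The structure of this result, and the effect of `as.numeric()`, can be inspected with a short self-contained sketch.
The two points are rebuilt here from the coordinates used earlier, so the object names mirror, but do not depend on, those in the text.

```{r, eval=FALSE}
library(sf)

london_geo = st_sfc(st_point(c(-0.1, 51.5)), crs = "EPSG:4326")
london_proj = st_sfc(st_point(c(530000, 180000)), crs = "EPSG:27700")

london2 = st_transform(london_geo, "EPSG:27700")
d = st_distance(london2, london_proj)

class(d)      # a 'units' object: numbers with a unit of measurement attached
dim(d)        # one row and one column, as each input contains a single feature
as.numeric(d) # plain number of meters, with the unit dropped
```

With several features in each input, the same call returns a full matrix of pairwise distances.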
Functions for querying and reprojecting CRSs are demonstrated below with reference to `cycle_hire_osm`, an `sf` object from **spData** that represents 'docking stations' where you can hire bicycles in London.
The CRS of `sf` objects can be queried, and as we learned in Section \@ref(reproj-intro), set with the function `st_crs()`.
The output is printed as multiple lines of text containing information about the coordinate system:
```{r, out.lines=6}
st_crs(cycle_hire_osm)
```
As we saw in Section \@ref(crs-setting), the main CRS components, `User input` and `wkt`, are printed as a single entity. The output of `st_crs()` is in fact a named list of class `crs` with two elements, single character strings named `input` and `wkt`, as shown in the output of the following code chunk:
```{r 06-reproj-16}
crs_lnd = st_crs(london_geo)
class(crs_lnd)
names(crs_lnd)
```
Additional elements can be retrieved with the `$` operator, including `Name`, `proj4string` and `epsg` (see [`?st_crs`](https://r-spatial.github.io/sf/reference/st_crs.html) and the CRS and transformation tutorial on the GDAL [website](https://gdal.org/tutorials/osr_api_tut.html#querying-coordinate-reference-system) for details):
```{r}
crs_lnd$Name
crs_lnd$proj4string
crs_lnd$epsg
```
As mentioned in Section \@ref(crs-in-r), the WKT representation, stored in the `$wkt` element of the `crs_lnd` object, is the ultimate source of truth.
This means that the outputs of the previous code chunk are queries from the `wkt` representation provided by PROJ, rather than inherent attributes of the object and its CRS.
Both the `wkt` and `User input` elements of the CRS are changed when the object's CRS is transformed.
In the code chunk below, we create a new version of `cycle_hire_osm` with a projected CRS (only the first 4 lines of the CRS output are shown for brevity).
```{r 06-reproj-18, eval=FALSE}
cycle_hire_osm_projected = st_transform(cycle_hire_osm, "EPSG:27700")
st_crs(cycle_hire_osm_projected)
#> Coordinate Reference System:
#> User input: EPSG:27700
#> wkt:
#> PROJCRS["OSGB36 / British National Grid",
#> ...
```
The resulting object has a new CRS with an EPSG code 27700.
But how do we find out more details about this EPSG code, or any code?
One option is to search for it online, another is to look at the properties of the CRS object:
```{r 06-reproj-19, linewidth=80}
crs_lnd_new = st_crs("EPSG:27700")
crs_lnd_new$Name
crs_lnd_new$proj4string
crs_lnd_new$epsg
```
The result shows that the EPSG code 27700 represents the British National Grid, which could have been found by searching online for "[EPSG 27700](https://www.google.com/search?q=CRS+27700)".
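Other properties can be queried in the same way.
The sketch below contrasts a projected and a geographic CRS using the `IsGeographic` and `units_gdal` elements documented in `?st_crs`; the object names are introduced here for illustration.

```{r, eval=FALSE}
library(sf)

crs_bng = st_crs("EPSG:27700")
crs_wgs = st_crs("EPSG:4326")

# Is the CRS geographic (longitude/latitude) or projected?
crs_bng$IsGeographic
crs_wgs$IsGeographic

# Units of the coordinates, as reported by GDAL
crs_bng$units_gdal
crs_wgs$units_gdal
```

Checks like these are useful before running operations that assume planar coordinates in meters.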
```{block2 06-reproj-21, type='rmdnote'}
Printing a spatial object in the console automatically returns its coordinate reference system.
To access and modify it explicitly, use the `st_crs` function, for example, `st_crs(cycle_hire_osm)`.
```
## Reprojecting raster geometries {#reproj-ras}
\index{raster!reprojection}
\index{raster!warping}
\index{raster!transformation}
\index{raster!resampling}
The projection concepts described in the previous section apply to rasters.
However, there are important differences in reprojection of vectors and rasters:
transforming a vector object involves changing the coordinates of every vertex, but this does not apply to raster data.
Rasters are composed of rectangular cells of the same size (expressed by map units, such as degrees or meters), so it is usually impracticable to transform coordinates of pixels separately.
Thus, raster reprojection involves creating a new raster object, often with a different number of columns and rows than the original.
The attributes must subsequently be re-estimated, allowing the new pixels to be 'filled' with appropriate values.
In other words, raster reprojection can be thought of as two separate spatial operations: a vector reprojection of the raster extent to another CRS (Section \@ref(reproj-vec-geom)), and computation of new pixel values through resampling (Section \@ref(resampling)).
Thus in most cases when both raster and vector data are used, it is better to avoid reprojecting rasters and to reproject vectors instead.
```{block2 06-reproj-35a, type='rmdnote'}
Reprojection of regular rasters is also known as warping.
Additionally, there is a second similar operation called "transformation".
Instead of resampling all of the values, it leaves all values intact but recomputes new coordinates for every raster cell, changing the grid geometry.
For example, it could convert the input raster (a regular grid) into a curvilinear grid.
\index{stars (package)}
The transformation operation can be performed in R using [the **stars** package](https://r-spatial.github.io/stars/articles/stars5.html).
```
```{r, include=FALSE}
#test the above idea
library(terra)
library(sf)
con_raster = rast(system.file("raster/srtm.tif", package = "spDataLarge"))
con_raster_ea = project(con_raster, "EPSG:32612", method = "bilinear")
con_poly = st_as_sf(as.polygons(con_raster>0))
con_poly_ea = st_transform(con_poly, "EPSG:32612")
plot(con_raster)
plot(con_poly, col = NA, add = TRUE, lwd = 4)
plot(con_raster_ea)
plot(con_poly_ea, col = NA, add = TRUE, lwd = 4)
```
The raster reprojection process is done with `project()` from the **terra** package.
Like the `st_transform()` function demonstrated in the previous section, `project()` takes a spatial object (a raster dataset in this case) and some CRS representation as the second argument.
On a side note, the second argument can also be an existing raster object with a different CRS.
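Both forms of the second argument are sketched below with a small raster created in memory.
The extent, the cell values and the 1 km target resolution are made up for illustration, and no resampling method is specified, so **terra**'s default is used.

```{r, eval=FALSE}
library(terra)

# Small illustrative raster in WGS84
r = rast(nrows = 20, ncols = 20, xmin = -113.2, xmax = -112.9,
         ymin = 37.1, ymax = 37.5, crs = "EPSG:4326")
values(r) = seq_len(ncell(r))

# Option 1: pass a CRS identifier and let project() choose the output grid
r_utm = project(r, "EPSG:32612")

# Option 2: pass a template raster that fixes the CRS, extent, and resolution
template = rast(r_utm) # same geometry as r_utm, but without cell values
res(template) = 1000   # force 1 km cells
r_template = project(r, template)

res(r_utm)
res(r_template)
```

A template is the safer choice when the output must align cell-for-cell with another raster already in the target CRS.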
Let's take a look at two examples of raster transformation: using categorical and continuous data.
Land cover data are usually represented by categorical maps.
The `nlcd.tif` file provides information for a small area in Utah, USA obtained from [National Land Cover Database 2011](https://www.mrlc.gov/data/nlcd-2011-land-cover-conus) in the NAD83 / UTM zone 12N CRS, as shown in the output of the code chunk below (only the first line of output is shown).
```{r 06-reproj-29, results='hide'}
cat_raster = rast(system.file("raster/nlcd.tif", package = "spDataLarge"))
crs(cat_raster)
#> PROJCRS["NAD83 / UTM zone 12N",
#> ...
```
In this region, eight land cover classes were distinguished (a full list of NLCD2011 land cover classes can be found at [mrlc.gov](https://www.mrlc.gov/data/legends/national-land-cover-database-2011-nlcd2011-legend)):
```{r 06-reproj-30}
unique(cat_raster)
```
When reprojecting categorical rasters, the estimated values must be the same as those of the original.
This could be done using the nearest neighbor method (`near`), which sets each new cell value to the value of the nearest cell (center) of the input raster.
An example is reprojecting `cat_raster` to WGS84, a geographic CRS well suited for web mapping.
Because `project()` accepts CRS identifiers such as `"EPSG:4326"` directly, this takes a single call, with `method = "near"` requesting nearest neighbor resampling.
```{r 06-reproj-31}
cat_raster_wgs84 = project(cat_raster, "EPSG:4326", method = "near")
```
Many properties of the new object differ from the previous one, including the number of columns and rows (and therefore number of cells), resolution (transformed from meters into degrees), and extent, as illustrated in Table \@ref(tab:catraster) (note that the number of categories increases from 8 to 9 because of the addition of `NA` values, not because a new category has been created --- the land cover classes are preserved).
```{r catraster, echo=FALSE}
tibble(
CRS = c("NAD83", "WGS84"),
nrow = c(nrow(cat_raster), nrow(cat_raster_wgs84)),
ncol = c(ncol(cat_raster), ncol(cat_raster_wgs84)),
ncell = c(ncell(cat_raster), ncell(cat_raster_wgs84)),
resolution = c(mean(res(cat_raster)), mean(res(cat_raster_wgs84),
na.rm = TRUE)),
unique_categories = c(length(unique(values(cat_raster))),
length(unique(values(cat_raster_wgs84))))) |>
knitr::kable(caption = paste("Key attributes in the original (cat\\_raster)",
"and projected (cat\\_raster\\_wgs84)",
"categorical raster datasets."),
caption.short = paste("Key attributes in the original and",
"projected raster datasets"),
digits = 4, booktabs = TRUE)
```
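To see why the resampling method matters for categorical data, the following sketch reprojects a small synthetic raster twice.
The class codes, extent and random values are made up; only the contrast between the `near` and `bilinear` methods is the point.

```{r, eval=FALSE}
library(terra)

# Illustrative categorical raster in UTM zone 12N with made-up class codes
set.seed(1)
classes = c(11, 21, 42, 52)
r_cat = rast(nrows = 30, ncols = 30, xmin = 300000, xmax = 303000,
             ymin = 4110000, ymax = 4113000, crs = "EPSG:32612")
values(r_cat) = sample(classes, ncell(r_cat), replace = TRUE)

r_near = project(r_cat, "EPSG:4326", method = "near")
r_bilinear = project(r_cat, "EPSG:4326", method = "bilinear")

# Nearest neighbor only copies existing class codes (plus NA at the edges)
v_near = na.omit(unique(values(r_near, mat = FALSE)))
all(v_near %in% classes)

# Bilinear interpolation averages neighbors, which can produce values
# that match none of the original classes
v_bilinear = na.omit(unique(values(r_bilinear, mat = FALSE)))
all(v_bilinear %in% classes)
```

Any value outside `classes` in the second result has no meaning as a land cover code, which is why interpolating methods are unsuitable here.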
Reprojecting numeric rasters (with `numeric` or in this case `integer` values) follows an almost identical procedure.
This is demonstrated below with `srtm.tif` in **spDataLarge** from [the Shuttle Radar Topography Mission (SRTM)](https://www2.jpl.nasa.gov/srtm/), which represents height in meters above sea level (elevation) with the WGS84 CRS:
```{r 06-reproj-32, out.lines=6}
con_raster = rast(system.file("raster/srtm.tif", package = "spDataLarge"))
cat(crs(con_raster))
```
We will reproject this dataset into a projected CRS, but *not* with the nearest neighbor method, which is appropriate for categorical data.
Instead, we will use the bilinear method which computes the output cell value based on the four nearest cells in the original raster.^[
Other methods mentioned in Section \@ref(resampling) can also be used here.
]
The values in the projected dataset are the distance-weighted average of the values from these four cells:
the closer the input cell is to the center of the output cell, the greater its weight.
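The weighting can be illustrated with a few lines of base R.
This is a simplified sketch of bilinear interpolation within a single square of four cell centers, not the implementation used by **terra**; the function name, argument names and elevation values are made up.

```{r, eval=FALSE}
# `v` holds the values at the four surrounding cell centers, in the order
# bottom-left, bottom-right, top-left, top-right.
# `tx` and `ty` give the target's relative position (0 to 1) along x and y.
bilinear = function(v, tx, ty) {
  bottom = v[1] * (1 - tx) + v[2] * tx # interpolate along x on the lower edge
  top = v[3] * (1 - tx) + v[4] * tx    # interpolate along x on the upper edge
  bottom * (1 - ty) + top * ty         # then interpolate between the edges along y
}

elev = c(1000, 1100, 1200, 1300) # made-up elevations in meters

bilinear(elev, tx = 0.5, ty = 0.5) # equidistant from all four cells: the plain mean
#> [1] 1150
bilinear(elev, tx = 0.1, ty = 0.1) # close to the bottom-left cell
#> [1] 1030
```

The second result stays close to 1000 because the bottom-left cell, being nearest, receives most of the weight.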
The following commands reproject the raster into WGS 84 / UTM zone 12N, identified by the code `EPSG:32612`, using the `bilinear` method (output not shown).
```{r 06-reproj-34}
#| eval: false
con_raster_ea = project(con_raster, "EPSG:32612", method = "bilinear")
cat(crs(con_raster_ea))
```
Raster reprojection on numeric variables also leads to changes to values and spatial properties, such as the number of cells, resolution, and extent.
These changes are demonstrated in Table \@ref(tab:rastercrs).^[
Another minor change, which is not represented in Table \@ref(tab:rastercrs), is that the class of the values in the new projected raster dataset is `numeric`.
This is because the `bilinear` method works with continuous data and the results are rarely coerced into whole integer values.
This can have implications for file sizes when raster datasets are saved.
]
```{r rastercrs, echo=FALSE}
tibble(
CRS = c("WGS84", "UTM zone 12N"),
nrow = c(nrow(con_raster), nrow(con_raster_ea)),
ncol = c(ncol(con_raster), ncol(con_raster_ea)),
ncell = c(ncell(con_raster), ncell(con_raster_ea)),
resolution = c(mean(res(con_raster)), mean(res(con_raster_ea),
na.rm = TRUE)),
mean = c(mean(values(con_raster)), mean(values(con_raster_ea),
na.rm = TRUE))) |>
knitr::kable(caption = paste("Key attributes in the original (con\\_raster)",
"and projected (con\\_raster\\_ea) continuous raster",
"datasets."),
caption.short = paste("Key attributes in the original and",
"projected raster datasets"),
digits = 4, booktabs = TRUE)
```
```{block2 06-reproj-35, type='rmdnote'}
Of course, the limitations of 2D Earth projections apply as much to vector as to raster data.
At best we can comply with two out of three spatial properties (distance, area, direction).
Therefore, the task at hand determines which projection to choose.
For instance, if we are interested in a density (points per grid cell or inhabitants per grid cell), we should use an equal-area projection (see also Chapter \@ref(location)).
```
## Custom map projections {#mapproj}
\index{CRS!custom}
Established CRSs captured by `AUTHORITY:CODE` identifiers such as `EPSG:4326` are well suited for many applications.
However, it is desirable to use alternative projections or to create custom CRSs in some cases.
Section \@ref(which-crs) mentioned reasons for using custom CRSs and provided several possible approaches.
Here, we show how to apply these ideas in R.
One approach is to take an existing WKT definition of a CRS, modify some of its elements, and then use the new definition for reprojecting.
This can be done for spatial vectors with `st_crs()` and `st_transform()`, and for spatial rasters with `crs()` and `project()`, as demonstrated in the following example which transforms the `zion` object to a custom azimuthal equidistant (AEQD) CRS.
```{r}
zion = read_sf(system.file("vector/zion.gpkg", package = "spDataLarge"))
```
Using a custom AEQD CRS requires knowing the coordinates of the center point of a dataset in degrees (geographic CRS).
In our case, this information can be extracted by calculating a centroid of the `zion` area and transforming it into WGS84.
```{r, warning=FALSE}
zion_centr = st_centroid(zion)
zion_centr_wgs84 = st_transform(zion_centr, "EPSG:4326")
st_as_text(st_geometry(zion_centr_wgs84))
```
Next, we can use the newly obtained values to update the WKT definition of the AEQD CRS seen below.
Notice that we modified just two values below -- `"Central_Meridian"` to the longitude and `"Latitude_Of_Origin"` to the latitude of our centroid.
```{r}
my_wkt = 'PROJCS["Custom_AEQD",
GEOGCS["GCS_WGS_1984",
DATUM["WGS_1984",
SPHEROID["WGS_1984",6378137.0,298.257223563]],
PRIMEM["Greenwich",0.0],
UNIT["Degree",0.0174532925199433]],
PROJECTION["Azimuthal_Equidistant"],
PARAMETER["Central_Meridian",-113.0263],
PARAMETER["Latitude_Of_Origin",37.29818],
UNIT["Meter",1.0]]'
```
The last step of this approach is to transform our original object (`zion`) to the new custom CRS, creating `zion_aeqd`.
```{r}
zion_aeqd = st_transform(zion, my_wkt)
```
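Rather than editing the WKT string by hand, the two parameters can be filled in programmatically, and the defining property of the AEQD projection can be checked at the same time.
In the sketch below, `aeqd_wkt()` is a hypothetical helper built around the same template, and the second point is made up.
The two distances are expected to agree closely but not exactly, because **sf**'s default spherical geometry engine treats the Earth as a sphere.

```{r, eval=FALSE}
library(sf)

# Hypothetical helper: fill the center coordinates into the AEQD template
aeqd_wkt = function(lon, lat) {
  sprintf('PROJCS["Custom_AEQD",
    GEOGCS["GCS_WGS_1984",
      DATUM["WGS_1984",
        SPHEROID["WGS_1984",6378137.0,298.257223563]],
      PRIMEM["Greenwich",0.0],
      UNIT["Degree",0.0174532925199433]],
    PROJECTION["Azimuthal_Equidistant"],
    PARAMETER["Central_Meridian",%f],
    PARAMETER["Latitude_Of_Origin",%f],
    UNIT["Meter",1.0]]', lon, lat)
}

center = st_sfc(st_point(c(-113.0263, 37.29818)), crs = "EPSG:4326")
other = st_sfc(st_point(c(-112.5, 37.6)), crs = "EPSG:4326") # made-up point
my_crs = st_crs(aeqd_wkt(-113.0263, 37.29818))

# Planar distance from the origin in the custom CRS ...
xy = st_coordinates(st_transform(other, my_crs))
d_planar = sqrt(sum(xy^2))

# ... compared with the distance computed from the geographic coordinates
d_geo = as.numeric(st_distance(center, other))
c(planar = d_planar, geographic = d_geo)
```

Distances between two points that are both far from the center are not preserved in this way, which is why the center should be chosen with the analysis in mind.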
Custom projections can also be made interactively, for example, using the [Projection Wizard](https://projectionwizard.org/#) web application [@savric_projection_2016].
This website allows you to select a spatial extent of your data and a distortion property, and returns a list of possible projections.
The list also contains WKT definitions of the projections that you can copy and use for reprojections.
Also, see @opengeospatialconsortium_wellknown_2019 for details on creating custom CRS definitions with WKT strings.
\index{CRS!proj-string}
PROJ strings can also be used to create custom projections, accepting the limitations inherent to projections, especially of geometries covering large geographic areas, mentioned in Section \@ref(crs-in-r).
Many projections have been developed and can be set with the `+proj=` element of PROJ strings, with dozens of projections described in detail on the [PROJ website](https://proj.org/operations/projections/index.html) alone.
When mapping the world while preserving area relationships, the Mollweide projection, illustrated in Figure \@ref(fig:mollproj), is a popular and often sensible choice [@jenny_guide_2017].
To use this projection, we need to specify it using the proj-string element, `"+proj=moll"`, in the `st_transform` function:
```{r 06-reproj-22}
world_mollweide = st_transform(world, crs = "+proj=moll")
```
```{r mollproj, fig.cap="Mollweide projection of the world.", warning=FALSE, message=FALSE, echo=FALSE}
library(tmap)
world_mollweide_gr = st_graticule(lat = c(-89.9, seq(-80, 80, 20), 89.9)) |>
st_transform(crs = "+proj=moll")
tm_shape(world_mollweide_gr) +
tm_lines(col = "gray") +
tm_shape(world_mollweide) +
tm_borders(col = "black")
```
It is often desirable to minimize distortion for all spatial properties (area, direction, distance) when mapping the world.
One of the most popular projections to achieve this is [Winkel tripel](https://www.winkel.org/other/Winkel%20Tripel%20Projections.htm), illustrated in Figure \@ref(fig:wintriproj).^[
This projection is used, among others, by the National Geographic Society.
]
The result was created with the following command:
```{r 06-reproj-23}
world_wintri = st_transform(world, crs = "+proj=wintri")
```
```{r 06-reproj-23-tests, eval=FALSE, echo=FALSE}
world_wintri = lwgeom::st_transform_proj(world, crs = "+proj=wintri")
world_wintri2 = st_transform(world, crs = "+proj=wintri")
world_tissot = st_transform(world, crs = "+proj=tissot +lat_1=60 +lat_2=65")
waldo::compare(world_wintri$geom[1], world_wintri2$geom[1])
world_tpers = st_transform(world, crs = "+proj=tpers +h=5500000 +lat_0=40")
plot(st_cast(world_tpers, "MULTILINESTRING")) # fails
plot(st_coordinates(world_tpers)) # fails
world_tpers_complete = world_tpers[st_is_valid(world_tpers), ]
world_tpers_complete = world_tpers_complete[!st_is_empty(world_tpers_complete), ]
plot(world_tpers_complete["pop"])
```
```{r wintriproj, fig.cap="Winkel tripel projection of the world.", echo=FALSE}
world_wintri_gr = st_graticule(lat = c(-89.9, seq(-80, 80, 20), 89.9)) |>
st_transform(crs = "+proj=wintri")
library(tmap)
tm_shape(world_wintri_gr) + tm_lines(col = "gray") +
tm_shape(world_wintri) + tm_borders(col = "black")
```
```{block2 06-reproj-24, type='rmdnote', echo=FALSE}
The two main functions for transformation of simple features coordinates are `sf::st_transform()` and `sf::sf_project()`.
`st_transform()` uses the GDAL interface to PROJ, while `sf_project()` (which works with two-column numeric matrices, representing points) uses PROJ directly.
`st_transform()` is appropriate for most situations, and provides a set of the most often used parameters and well-defined transformations.
`sf_project()` may be suited for point transformations when speed is important.
```
```{r 06-reproj-25, eval=FALSE, echo=FALSE}
# demo of sf_project
mat_lonlat = as.matrix(data.frame(x = 0:20, y = 50:70))
plot(mat_lonlat)
mat_projected = sf_project(from = st_crs(4326)$proj4string, to = st_crs(27700)$proj4string, pts = mat_lonlat)
plot(mat_projected)
```
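When point coordinates are held in a plain two-column matrix rather than an `sf` object, `sf_project()` from **sf** can be used instead of `st_transform()`.
The following is a minimal sketch with made-up longitude/latitude pairs in Great Britain; it passes `crs` objects as the `from` and `to` arguments.

```{r, eval=FALSE}
library(sf)

# Longitude/latitude pairs as a two-column matrix
mat_lonlat = cbind(x = c(-0.1, -3.2, -1.5), y = c(51.5, 55.9, 53.8))

# Project from WGS84 to the British National Grid
mat_bng = sf_project(from = st_crs("EPSG:4326"), to = st_crs("EPSG:27700"),
                     pts = mat_lonlat)
mat_bng
```

The result is again a numeric matrix, with no CRS attached, so keeping track of the coordinate system becomes the user's responsibility.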
Moreover, proj-string parameters can be modified in most CRS definitions; for example, the center of the projection can be adjusted using the `+lon_0` and `+lat_0` parameters.
The code below transforms the coordinates to the Lambert azimuthal equal-area projection centered on the longitude and latitude of New York City (Figure \@ref(fig:laeaproj2)).
```{r 06-reproj-27}
world_laea2 = st_transform(world,
crs = "+proj=laea +x_0=0 +y_0=0 +lon_0=-74 +lat_0=40")
```
```{r laeaproj2, fig.cap="Lambert azimuthal equal-area projection of the world centered on New York City.", fig.scap="Lambert azimuthal equal-area projection centered on New York City.", warning=FALSE, echo=FALSE}
world_laea2_g = st_graticule(ndiscr = 10000) |>
st_transform("+proj=laea +x_0=0 +y_0=0 +lon_0=-74 +lat_0=40.1 +ellps=WGS84 +no_defs") |>
st_geometry()
tm_shape(world_laea2_g) + tm_lines(col = "gray") +
tm_shape(world_laea2) + tm_borders(col = "black")
```
More information on CRS modifications can be found in the [Using PROJ](https://proj.org/usage/index.html) documentation.
## Exercises
```{r, echo=FALSE, results='asis'}
res = knitr::knit_child('_07-ex.Rmd', quiet = TRUE, options = list(include = FALSE, eval = FALSE))
cat(res, sep = '\n')
```