-
Notifications
You must be signed in to change notification settings - Fork 3
/
P1_rasterPackageIntroduction_I-v1.html
379 lines (320 loc) · 19.6 KB
/
P1_rasterPackageIntroduction_I-v1.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8" />
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="pandoc" />
<meta name="author" content="Joao Goncalves" />
<meta name="date" content="2017-11-17" />
<title>P1_RasterIntroduction</title>
<script src="P1_rasterPackageIntroduction_I-v1_files/jquery-1.11.3/jquery.min.js"></script>
<meta name="viewport" content="width=device-width, initial-scale=1" />
<link href="P1_rasterPackageIntroduction_I-v1_files/bootstrap-3.3.5/css/bootstrap.min.css" rel="stylesheet" />
<script src="P1_rasterPackageIntroduction_I-v1_files/bootstrap-3.3.5/js/bootstrap.min.js"></script>
<script src="P1_rasterPackageIntroduction_I-v1_files/bootstrap-3.3.5/shim/html5shiv.min.js"></script>
<script src="P1_rasterPackageIntroduction_I-v1_files/bootstrap-3.3.5/shim/respond.min.js"></script>
<script src="P1_rasterPackageIntroduction_I-v1_files/navigation-1.1/tabsets.js"></script>
<link href="P1_rasterPackageIntroduction_I-v1_files/highlightjs-1.1/default.css" rel="stylesheet" />
<script src="P1_rasterPackageIntroduction_I-v1_files/highlightjs-1.1/highlight.js"></script>
<style type="text/css">code{white-space: pre;}</style>
<style type="text/css">
pre:not([class]) {
background-color: white;
}
</style>
<script type="text/javascript">
if (window.hljs && document.readyState && document.readyState === "complete") {
window.setTimeout(function() {
hljs.initHighlighting();
}, 0);
}
</script>
<style type="text/css">
h1 {
font-size: 34px;
}
h1.title {
font-size: 38px;
}
h2 {
font-size: 30px;
}
h3 {
font-size: 24px;
}
h4 {
font-size: 18px;
}
h5 {
font-size: 16px;
}
h6 {
font-size: 12px;
}
.table th:not([align]) {
text-align: left;
}
</style>
</head>
<body>
<style type = "text/css">
.main-container {
max-width: 940px;
margin-left: auto;
margin-right: auto;
}
code {
color: inherit;
background-color: rgba(0, 0, 0, 0.04);
}
img {
max-width:100%;
height: auto;
}
.tabbed-pane {
padding-top: 12px;
}
button.code-folding-btn:focus {
outline: none;
}
</style>
<div class="container-fluid main-container">
<!-- tabsets -->
<script>
$(document).ready(function () {
window.buildTabsets("TOC");
});
</script>
<!-- code folding -->
<div class="fluid-row" id="header">
<h1 class="title toc-ignore">P1_RasterIntroduction</h1>
<h4 class="author"><em>Joao Goncalves</em></h4>
<h4 class="date"><em>17 November 2017</em></h4>
</div>
<div id="background" class="section level2">
<h2>Background</h2>
<hr />
<p>Geospatial data is becoming increasingly used to solve numerous ‘real-life’ problems (check out some examples <a href="http://gisgeography.com/gis-applications-uses/">here</a>). In turn, R is becoming more equipped than ever to handle this type of data thus providing an exceptional open-source solution to solve many problems in the Geographic Information Sciences and Remote Sensing domains.</p>
<p>In general two types of geospatial data models are used to represent, visualize and model spatial phenomena, these are:</p>
<ul>
<li><p><strong>Vector data</strong>: which represents the world in three simple geometries: <strong>points</strong>, <strong>lines</strong> and <strong>polygons</strong>. As such, it allows to represent spatial phenomena or variables that are typically discrete and with well-defined boundaries (e.g., touristic points-of-interest, gas stations, rivers, roads, drainage basins, country GDP).</p></li>
<li><p><strong>Raster data</strong>: provides support for representing spatial phenomena by diving the surface into a grid (or matrix) composed by cells of regular size. Each raster dataset has a certain number of columns and rows and each cell contains a value with information for the variable of interest. Stored data can be either: (i) thematic - representing a <em>discrete</em> variable (e.g., land cover classification map) or <em>continuous</em> (e.g., elevation).</p></li>
</ul>
<p>Choosing the appropriate data model to use depends on the domain of application and the specific problem at hand. Typically, people from the social sciences tend to use more the vector data model. R packages such as <strong>sp</strong> or <strong>sf</strong> (a relatively new package, starting in 2016), provide support for this type of data. In contrast, in the environmental sciences the raster data model is more often used because of satellite data or the need to represent spatially continuous phenomena such as pollution levels, temperature or precipitation values, the abundance or habitat suitability for a species among many other. The <strong>raster</strong> package, introduced in March 2010 by Robert Hijmans & Jacob van Etten, currently provides many useful functions for using this type of data. Despite these differences, GIS specialists and researchers often use both data models to tackle their problems.</p>
<p>Throughout these posts we will cover the basics, intermediate and some advanced stuff in <strong>raster data</strong> handling, manipulation and modelling in R. Examples will be given along with the tutorials. Some exercises, with different difficult levels, will be provided so you can practice.</p>
<p>The <code>raster</code> package currently provides an extensive set of functions to create, read, export, manipulate and process raster datasets.It also provides low-level functionalities for creating more advanced processing chains as well as the ability to manage large datasets. For more information see: <code>vignette("functions", package = "raster")</code>.</p>
<p>This first post on raster data is divided in two sub-sections:<br />
(i) accessing raster attributes, and, (<a href="#rstAttrib">go-to</a>)<br />
(ii) viewing raster values and calculating simple statistics (<a href="#rstValues">go-to</a>).</p>
<p>First we need to install the <code>raster</code> package (as well as <code>sp</code> and <code>rgdal</code>):</p>
<pre class="r"><code>if(!("rgdal" %in% installed.packages()[,1]))
install.packages(c("rgdal"), dependencies = TRUE)
if(!("sp" %in% installed.packages()[,1]))
install.packages(c("sp"), dependencies = TRUE)
if(!("raster" %in% installed.packages()[,1]))
install.packages(c("raster"), dependencies = TRUE)
library(rgdal)
library(sp)
library(raster)</code></pre>
<p>Next, download and unzip the sample data. We will use <a href="http://www.cgiar-csi.org/data/srtm-90m-digital-elevation-database-v4-1">SRTM - version 4.1</a> elevation data (in meters a.s.l.) for the Peneda-Geres National Park - Portugal (<a href="https://en.wikipedia.org/wiki/Peneda-Ger%C3%AAs_National_Park">+info</a>) in the examples.</p>
<pre class="r"><code># If you run it download problems try changing: method = "wget"
download.file("https://raw.githubusercontent.com/joaofgoncalves/R_exercises_raster_tutorial/master/data/srtm_pnpg.zip", "./data-raw/srtm_pnpg.zip", method = "auto")
unzip("./data-raw/srtm_pnpg.zip", exdir = "./data-raw")</code></pre>
</div>
<div id="raster-attributes" class="section level2">
<h2>Raster attributes <a name="rstAttrib"></a></h2>
<hr />
<p>In the first part (of two) of this tutorial we will focus on reading raster data and accessing its core attributes.</p>
<p>After finishing the download, load the data into R using the <code>raster</code> function (see <code>?raster</code> for more details). Then use <code>print</code> to inspect the “essential” attributes of the dataset.</p>
<pre class="r"><code># In this example the function uses a string with the data location
rst <- raster("./data-raw/srtm_pnpg.tif")
# Print raster attributes
print(rst)</code></pre>
<pre><code>## class : RasterLayer
## dimensions : 579, 555, 321345 (nrow, ncol, ncell)
## resolution : 80, 80 (x, y)
## extent : 549619.7, 594019.7, 4613377, 4659697 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=utm +zone=29 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
## data source : D:\MyDocs\R-dev\R_exercises_raster_tutorial\data-raw\srtm_pnpg.tif
## names : srtm_pnpg
## values : 9, 1520 (min, max)</code></pre>
<p>From the above we can see some important information about our raster dataset. Given that we used <code>raster</code> function for data loading, we have now created a <code>RasterLayer</code>, i.e., a raster object with a single layer. We can also see its dimension: <strong>579</strong> rows, <strong>555</strong> columns and the pixel size in x and y dimensions, a.k.a. the <strong>spatial resolution</strong>, equal to 80m (we are using a projected coordinate system with units in meters; more on this below).</p>
<p>We can use the function <code>inMemory</code> to check if the raster dataset is currently stored on RAM:</p>
<pre class="r"><code>inMemory(rst)</code></pre>
<pre><code>## [1] FALSE</code></pre>
<p>As we can see, the raster data is currently stored in disk so, at this point, our <code>RasterLayer</code> object is actually “made of” metadata and a link to the actual raster data on disk. This allow to preserve RAM space.</p>
<p>The package also provides several functions to access each raster attribute individually.</p>
<pre class="r"><code>## Raster layer name(s) / more useful for multi-layer rasters
## By default coincides with the file name without extension
names(rst)</code></pre>
<pre><code>## [1] "srtm_pnpg"</code></pre>
<pre class="r"><code>## Number of rows, columns and layers
dim(rst)</code></pre>
<pre><code>## [1] 579 555 1</code></pre>
<pre class="r"><code>## Nr of rows
nrow(rst)</code></pre>
<pre><code>## [1] 579</code></pre>
<pre class="r"><code># Nr of columns
ncol(rst)</code></pre>
<pre><code>## [1] 555</code></pre>
<pre class="r"><code>## Total number of grid cells
ncell(rst)</code></pre>
<pre><code>## [1] 321345</code></pre>
<pre class="r"><code>## Spatial resolution in x and y dimensions
res(rst)</code></pre>
<pre><code>## [1] 80 80</code></pre>
<pre class="r"><code>## Data type - see ?dataType for more details
dataType(rst)</code></pre>
<pre><code>## [1] "INT2S"</code></pre>
<pre class="r"><code>## Extent (returns a S4 object of class "Extent")
extent(rst)</code></pre>
<pre><code>## class : Extent
## xmin : 549619.7
## xmax : 594019.7
## ymin : 4613377
## ymax : 4659697</code></pre>
<p>Info on extent coordinates can be retrieved individually:</p>
<pre class="r"><code>xmin(rst)</code></pre>
<pre><code>## [1] 549619.7</code></pre>
<pre class="r"><code>xmax(rst)</code></pre>
<pre><code>## [1] 594019.7</code></pre>
<pre class="r"><code>ymin(rst)</code></pre>
<pre><code>## [1] 4613377</code></pre>
<pre class="r"><code>ymax(rst)</code></pre>
<pre><code>## [1] 4659697</code></pre>
<p>Finally, we can also see info about the Coordinate Reference System (CRS) used to represent the data. Many different CRS are used to describe geographic data depending on the location, extent, time, domain (among other features) of the collected data.</p>
<pre class="r"><code>crs(rst)</code></pre>
<pre><code>## CRS arguments:
## +proj=utm +zone=29 +datum=WGS84 +units=m +no_defs +ellps=WGS84
## +towgs84=0,0,0</code></pre>
<p>For the raster package, a <strong>proj4string</strong> is used to set and define the CRS of the data. This string contains some important details of the CRS, such as the <em>Projection</em>, the <em>Datum</em>, the <em>Ellipsoid</em> and the <em>units</em> (e.g., meters, degree). You can see more info on <em>proj4</em> parameters <a href="http://proj4.org/parameters.html">here</a>. Use the site <a href="http://spatialreference.org/">spatialreference.org</a> to find the appropriate <em>proj4string</em> (or other information) for the CRS of your choice.</p>
</div>
<div id="raster-values" class="section level2">
<h2>Raster values <a name="rstValues"></a></h2>
<hr />
<p><strong>.: Summary statistics</strong></p>
<p>For the second and, last part, of this tutorial we are going to explore raster functions for visualizing, summarizing and accessing/querying values at specific locations.</p>
<p>To visualize the data we can simply use the function <code>plot</code>.</p>
<pre class="r"><code>plot(rst, main="Elevation (meters) for Peneda-Geres\n National Park, Portugal",
xlab="X-coordinates", ylab="Y-coordinates")</code></pre>
<div class="figure">
<img src="https://raw.githubusercontent.com/joaofgoncalves/R_exercises_raster_tutorial/master/img/P1_raster_plot-1.png" />
</div>
<p>We can also use a histogram to visualize the distribution of elevation values in the sample data.</p>
<pre class="r"><code># Generate histogram from a sample of pixels (by default 100K are randomly used)
hist(rst, col="light grey", main = "Histogram of elevation", prob = TRUE,
xlab = "Elevation (meters a.s.l.)")
# Generate the density plot object and then overlap it
ds <- density(rst, plot = FALSE)
lines(ds, col = "red", lwd = 2)</code></pre>
<div class="figure">
<img src="https://raw.githubusercontent.com/joaofgoncalves/R_exercises_raster_tutorial/master/img/P1_raster_hist-1.png" />
</div>
<p>Calculating summary statistics is fairly easy using the <code>raster</code> package. The generic method <code>summary</code> is available for this type of object (note: this function will use a sample of pixels to calculate statistics).</p>
<pre class="r"><code>summary(rst)</code></pre>
<pre><code>## Warning in .local(object, ...): summary is an estimate based on a sample of 1e+05 cells (31.12% of all cells)</code></pre>
<pre><code>## srtm_pnpg
## Min. 11
## 1st Qu. 529
## Median 776
## 3rd Qu. 984
## Max. 1520
## NA's 0</code></pre>
<p>Minimum and maximum values can be calculated with the following functions (no sample employed):</p>
<pre class="r"><code>## Min
minValue(rst)</code></pre>
<pre><code>## [1] 9</code></pre>
<pre class="r"><code>## Max
maxValue(rst)</code></pre>
<pre><code>## [1] 1520</code></pre>
<p>The package also provides a more general interface to calculate cell statistics using the <code>cellStats</code> function (no sample employed).</p>
<pre class="r"><code>## Mean
cellStats(rst, mean)</code></pre>
<pre><code>## [1] 747.2759</code></pre>
<pre class="r"><code>## Standard-deviation
cellStats(rst, sd)</code></pre>
<pre><code>## [1] 311.8615</code></pre>
<pre class="r"><code>## Median
cellStats(rst, median)</code></pre>
<pre><code>## [1] 774</code></pre>
<pre class="r"><code>## Median-absolute deviation (MAD)
cellStats(rst, mad)</code></pre>
<pre><code>## [1] 336.5502</code></pre>
<pre class="r"><code>## Quantiles
## 5%, 25%, 50%, 75% and 95%
cellStats(rst, function(x,...) quantile(x, probs=c(0.05, 0.25, 0.5, 0.75, 0.95),...))</code></pre>
<pre><code>## 5% 25% 50% 75% 95%
## 186 527 774 983 1224</code></pre>
<p><code>cellStats</code> does not use a random sample of the pixels to calculate hence it will fail (gracefully) for very large <code>Raster*</code> objects except for certain predefined functions: <code>sum</code>, <code>mean</code>, <code>min</code>, <code>max</code>, <code>sd</code>, <code>'skew'</code>, and <code>'rms'</code>.</p>
<p><strong>.: Extracting values</strong></p>
<p>The <code>raster</code> package allows several possibilities to extract data at specific points, lines or polygons. The <code>extract</code> function used for this purpose, allows a two-column <code>matrix</code> or <code>data.frame</code> (with x, y coordinates) or spatial objects from the <code>sp</code> package such as: <code>SpatialPoints*</code>, <code>SpatialPolygons*</code>, <code>SpatialLines</code> or <code>Extent</code> as input.</p>
<p>For the first example we will start by extracting raster values using points as input:</p>
<pre class="r"><code>## One specific point location (with coordinates in the same CRS as the input raster)
xy <- data.frame(x = 570738, y = 4627306)
xy <- SpatialPoints(xy, proj4string = crs(rst))
extract(rst, xy)</code></pre>
<pre><code>##
## 611</code></pre>
<pre class="r"><code>## Extract raster values for 20 randomly located points
xy <- data.frame(x = runif(20, xmin(rst), xmax(rst)), y = runif(20, ymin(rst), ymax(rst)))
xy <- SpatialPoints(xy, proj4string = crs(rst))
extract(rst, xy)</code></pre>
<pre><code>## [1] 137 761 1034 828 834 837 691 342 597 272 1263 935 270 1240
## [15] 339 1136 1245 991 1073 1153</code></pre>
<p>Typically we are also interested in extracting raster values for specific regions-of-interest (ROI). In this example we will use a polygon (a broad-leaf forest area) to assess the distribution of elevation values within it.</p>
<pre class="r"><code>## Download the vector data with the woodland patch ROI
## If you run it download problems try changing: method = "wget"
download.file("https://raw.githubusercontent.com/joaofgoncalves/R_exercises_raster_tutorial/master/data/WOODS_PNPG.zip", "./data-raw/WOODS_PNPG.zip", method = "auto")
## Uncompress the data
unzip("./data-raw/WOODS_PNPG.zip", exdir = "./data-raw")
## Convert the data into SpatialPolygons (discards the attached attribute but keeps geometry)
woods <- as(readOGR(dsn = "./data-raw", layer = "woods_pnpg"), "SpatialPolygons")</code></pre>
<pre><code>## OGR data source with driver: ESRI Shapefile
## Source: "./data-raw", layer: "woods_pnpg"
## with 1 features
## It has 6 fields</code></pre>
<p>Let’s check out the polygon data with a simple plot:</p>
<pre class="r"><code>## Plot elevation raster
plot(rst, main="Elevation (meters) for Peneda-Geres\n National Park, Portugal",
xlab="X-coordinates", ylab="Y-coordinates")
## Add the ROI
plot(woods, add = TRUE)</code></pre>
<div class="figure">
<img src="https://raw.githubusercontent.com/joaofgoncalves/R_exercises_raster_tutorial/master/img/P1_plot_ROI-1.png" />
</div>
<p>Now, let’s extract the raster values from the polygon ROI and calculate some statistics:</p>
<pre class="r"><code>elev <- extract(rst, woods)[[1]] ## Subset the first (and only) geometry element
# Tukey's five number summary: minimum, lower-hinge, median, upper-hinge, and, maximum
fivenum(elev)</code></pre>
<pre><code>## [1] 556 677 745 822 1086</code></pre>
<p>When using <code>extract</code> with a <code>SpatialPolygons*</code> object, by default, we get a <code>list</code> containing a set of raster values for each individual polygon.<br />
Now, using the extracted values we can investigate the distribution of elevation values for the target patch.</p>
<pre class="r"><code>hist(elev, main = "Histogram of ROI elevation", xlab = "Elevation (meters a.s.l.)")
abline(v = mean(elev), lwd = 2) ## Mean line</code></pre>
<div class="figure">
<img src="https://raw.githubusercontent.com/joaofgoncalves/R_exercises_raster_tutorial/master/img/P1_extract_pol_histogram-1.png" />
</div>
<p>This concludes our first exploration of the <code>raster</code> package - an awesome resource for handling geospatial data in R! Hope you find this post useful.</p>
<p>Cheers!</p>
</div>
</div>
<script>
// add bootstrap table styles to pandoc tables
function bootstrapStylePandocTables() {
$('tr.header').parent('thead').parent('table').addClass('table table-condensed');
}
$(document).ready(function () {
bootstrapStylePandocTables();
});
</script>
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
var script = document.createElement("script");
script.type = "text/javascript";
script.src = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
document.getElementsByTagName("head")[0].appendChild(script);
})();
</script>
</body>
</html>