<!doctype html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no">
<title>Geospatial Data Management and Metadata</title>
<link rel="stylesheet" href="css/reset.css">
<link rel="stylesheet" href="css/reveal.css">
<link rel="stylesheet" href="css/ubc.css">
<!-- Theme used for syntax highlighting of code -->
<link rel="stylesheet" href="lib/css/zenburn.css">
<!-- Printing and PDF exports -->
<script>
var link = document.createElement( 'link' );
link.rel = 'stylesheet';
link.type = 'text/css';
link.href = window.location.search.match( /print-pdf/gi ) ? 'css/print/pdf.css' : 'css/print/paper.css';
document.getElementsByTagName( 'head' )[0].appendChild( link );
</script>
<style>
.lightblue {
color: #5E869F;
}
.title-bottom {
border-bottom: 2px #002145 solid;
}
</style>
</head>
<body>
<div class="reveal">
<div class="slides">
<section>
<h3>Geospatial Data Management &amp; Metadata</h3>
<aside class="notes">
</aside>
</section>
<section>
<h2>This Presentation:</h2>
<ol>
<li>Data Management and Best Practices</li>
<li>Overview of Geospatial Metadata</li>
</ol>
<aside class="notes">
</aside>
</section>
<section>
<h4 class="title-bottom">What do you mean, "Data Management"?</h4>
<p>you plan, find, create/edit, analyze, describe, and share/preserve data</p>
<p>throughout these processes you make decisions about how to manage your data</p>
<aside class="notes">
Data has a lifecycle of use. As it passes through that lifecycle, you make decisions about things like where to store it, who to share it with, and how to describe it.
</aside>
</section>
<section>
<h3>Have you ever thought:</h3>
<ul>
<li>How should I name my files?</li>
<li>Where can I store my data?</li>
<li>What should I keep track of when I make changes?</li>
<li>How will I explain my data to others?</li>
</ul>
<aside class="notes">
These are all basic questions you'd ask about how you should manage your data.
</aside>
</section>
<section>
<h3 class="title-bottom">Data Workflows</h3>
<p>GIS Analyst ⮕ IT Specialist ⮕ Community Member</p>
<p>Grad RA ⮕ Research Cluster ⮕ Funder</p>
<aside class="notes">
Not only does your data transition through its own lifecycle, it may also be passed along through workflows. An example of a data workflow: A GIS analyst creates a wildfire urban interface dataset. It's passed along to an IT specialist to be deposited into a shared repository. Then a community volunteer uses the data to influence decisions about a new development.
</aside>
</section>
<section>
<h3>Workflows can break down</h3>
<p>inadequately described data versions</p>
<p>meaningless filenames</p>
<p>changes in storage locations</p>
<p>...what else?</p>
<aside class="notes">
These workflows have a tendency to break down if essential data management principles are not followed. I'm sure you can think of a time when you acquired data that was unusable, obsolete, or missing...
</aside>
</section>
<section>
<h3>Data management plans</h3>
<p>detailed data management plans are often necessary for funding proposals and satisfying grant requirements</p>
<aside class="notes">
</aside>
</section>
<section>
<p>data management planning tools:</p>
<p><a href="https://assistant.portagenetwork.ca/" target="_blank">DMP Assistant</a> for Canadian funding agencies like CIHR, NSERC and SSHRC </p>
<p><a href="https://dmptool.org/" target="_blank">DMPtool.org</a> for U.S. funding agencies like NIH and NSF</p>
<aside class="notes">
Here are a couple of tools that can get you started assembling a quality data management plan for your future grants, projects, or personal data management skillz.
</aside>
</section>
<section>
<h3>A few basic geodata management best practices</h3>
<aside class="notes">
Even if you're not planning on submitting a grant proposal, you may still want to follow some essential data management best practices
</aside>
</section>
<section>
<h3 class="title-bottom">File Naming Guidelines</h3>
<p>best practices include being <b>consistent</b>, and keeping file names <b>short</b> and <b>descriptive</b>.</p>
<aside class="notes">
Best practices in data management include paying attention to file names. Naming files may not seem important, but it's one of the better investments you can make in good data management habits.
</aside>
</section>
<section>
<h3 class="title-bottom">File Naming - Dates</h3>
<p><b>Use YYYYMMDD format</b></p>
<p>Do: homework_20200319.txt</p>
<p>Don't: homework_19032020.txt</p>
<aside class="notes">
Dates formatted this way sort chronologically and are easy to search (within "homework" or any other section of your files). Don't waste your time or others' time visually parsing dates when your computer can do it.
</aside>
</section>
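<section data-markdown>
<textarea data-template>
A minimal sketch of why YYYYMMDD is worth the habit, assuming Python and three made-up dates. The helper `dated_name` is hypothetical, not part of any library.

```python
from datetime import date


def dated_name(stem: str, day: date, ext: str) -> str:
    """Build a name like homework_20200319.txt (stem, YYYYMMDD date, extension)."""
    return f"{stem}_{day.strftime('%Y%m%d')}.{ext}"


# Made-up dates, deliberately out of order.
days = [date(2020, 3, 19), date(2019, 12, 25), date(2021, 1, 5)]

# YYYYMMDD: a plain alphabetical sort is also a chronological sort.
iso_names = sorted(dated_name("homework", d, "txt") for d in days)

# DDMMYYYY: the same sort orders files by day of month instead.
dmy_names = sorted(f"homework_{d.strftime('%d%m%Y')}.txt" for d in days)

print(iso_names)
print(dmy_names)
```

The first list comes out oldest to newest. The second starts with the 2021 file, because only the day digits are compared first.
</textarea>
</section>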
<section>
<h3 class="title-bottom">File Naming - Identifiers</h3>
<p><b>Use unique abbreviations for project names or grants</b></p>
<p>Do: fhabc_notes.txt</p>
<p>Don't: forest_history_association_of_BC_notes.txt</p>
<aside class="notes">
Choose an abbreviation or other unique label to organize your files. This will also help keep similar files together and make them easy to "browse".
</aside>
</section>
<section>
<h3 class="title-bottom">File Naming - Descriptors</h3>
<p><b>Descriptor should be minimal but unique</b></p>
<p>Do: fhabc_grantProposal.pdf</p>
<p>Don't: fhabc.pdf</p>
<aside class="notes">
Descriptors should be just enough to describe the content.
</aside>
</section>
<section>
<h3 class="title-bottom">File Naming - Delimiters</h3>
<p><b>Use _ or - to divide your filename elements</b></p>
<p>Do: fhabc_grantProposal_v01.pdf</p>
<p>Don't: fhabc, grant proposal -->[v01].pdf</p>
<br />
<aside class="notes">
Separating filename elements with delimiters allows more flexibility to describe and organize your files. However, many special characters can confuse software or operating systems.
</aside>
</section>
<section>
<h3 class="title-bottom">File Naming - Versions</h3>
<p><b>Note versions sequentially or with unique date and time</b></p>
<p>Do: NRC_userGuidelines_v04.doc</p>
<p>Do: MSL-fraserRiverSamples-20200319-0900.csv</p>
<p>Don't: userGuidelines_final_edits_2_forreal.doc</p>
<aside class="notes">
If you are updating the same file with versions, then make sure you're consistent with how versions are differentiated.
</aside>
</section>
<section>
<h3 class="title-bottom">File Naming - Other things</h3>
<ul>
<li>don't start filenames with a number or underscore</li>
<li>be aware of character limits</li>
<li>never ever ever use spaces as delimiters</li>
</ul>
<aside class="notes">
</aside>
</section>
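<section data-markdown>
<textarea data-template>
The naming rules above could be checked automatically. This is a rough sketch in Python, not a standard tool: the pattern, the 64-character cap, and the helper names are all assumptions to adapt to your own project.

```python
import re

# Start with a letter, continue with letters, digits, "_" or "-",
# and end with a single extension. Spaces and other symbols fail.
SAFE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_-]*\.[A-Za-z0-9]+")

# Assumed project rule, not an operating system limit.
MAX_LENGTH = 64


def is_safe_filename(name: str) -> bool:
    """Return True when a name follows the rules sketched above."""
    return len(name) <= MAX_LENGTH and SAFE_NAME.fullmatch(name) is not None


def versioned_name(project: str, descriptor: str, version: int, ext: str) -> str:
    """Join identifier, descriptor and a zero-padded version with underscores."""
    return f"{project}_{descriptor}_v{version:02d}.{ext}"


for candidate in [
    versioned_name("fhabc", "grantProposal", 1, "pdf"),
    "MSL-fraserRiverSamples-20200319-0900.csv",
    "fhabc, grant proposal -->[v01].pdf",
    "_notes.txt",
]:
    print(candidate, is_safe_filename(candidate))
```

The first two names pass. The last two fail because of spaces, special characters, or a leading underscore.
</textarea>
</section>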
<section>
<p>More info can be found using UBC Library's <a href="https://researchdata.library.ubc.ca/plan/organize-your-data/" target="_blank">research data planning guidelines</a>.</p>
<aside class="notes">
UBC Library has a very good guide on file naming best practices. The Library is also a place to start with questions regarding data management best practices. See the link for contact info.
</aside>
</section>
<section>
<h3 class="title-bottom">Attribute Naming Guidelines</h3>
<p>best practices again include being <b>consistent</b>, and keeping field names <b>short</b> and <b>descriptive</b>.</p>
<p>it's difficult to briefly describe the output of one or many calculations!</p>
<p>start a codebook if you need to abbreviate</p>
<aside class="notes">
When working with vector data you may encounter additional difficulties with naming attribute field names, especially if they're the product of a calculation using other field names. So it's good to abide by some important rules. If you need to start a codebook, use any software or tool you're comfortable using to create it!
</aside>
</section>
<section>
<h4 class="title-bottom">Attribute Naming - Character Length</h4>
<p><b>be aware of limits – Shapefile limit is 10</b></p>
<p>Do: POPDEN_20</p>
<p>Don't: population_density_2020</p>
<aside class="notes">
Other file types have different specs, but a Shapefile's 10 character limit is notoriously small.
</aside>
</section>
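<section data-markdown>
<textarea data-template>
A small sketch of how you might check field names before exporting to a Shapefile. It assumes long names are simply cut off at 10 characters; real GIS software may rename them differently, so treat this as a warning tool only. The function names are made up for illustration.

```python
from collections import Counter

SHAPEFILE_FIELD_LIMIT = 10  # the limit stated on the slide above


def too_long(field_names, limit=SHAPEFILE_FIELD_LIMIT):
    """Return the field names that exceed the character limit."""
    return [name for name in field_names if len(name) > limit]


def truncation_clashes(field_names, limit=SHAPEFILE_FIELD_LIMIT):
    """Return shortened names that two or more fields would share
    if every name were cut at the limit (an assumed behaviour)."""
    counts = Counter(name[:limit] for name in field_names)
    return sorted(short for short, n in counts.items() if n > 1)


fields = ["POPDEN_20", "population_density_2020", "population_density_2010"]
print(too_long(fields))
print(truncation_clashes(fields))
```

Both long names would collapse to `population`, which is exactly the kind of silent collision a short, planned name like `POPDEN_20` avoids.
</textarea>
</section>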
<section>
<h4 class="title-bottom">Attribute Naming - Delimiters</h4>
<p><b>use camelCase when necessary to divide field elements</b></p>
<p>Do: fieldName</p>
<p>Don't: thisismyfieldname</p>
<aside class="notes">
Don't waste characters on underscores when you can use camelCase to separate terms.
</aside>
</section>
<section>
<h4 class="title-bottom">Attribute Naming - Codebooks</h4>
<p>list your field names and labels</p>
<p>provide description and info about each one</p>
<p>describe how values are coded or recorded</p>
<p>keep it up-to-date</p>
<aside class="notes">
Codebooks include descriptions about your data and attributes. For an example see the "Object Description" here: https://catalogue.data.gov.bc.ca/dataset/bc-wildfire-psta-head-fire-intensity
</aside>
</section>
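<section data-markdown>
<textarea data-template>
A codebook can be as simple as a CSV file. The sketch below is one possible layout in Python; the column names and both example entries are invented for illustration, so swap in your own fields and codes.

```python
import csv
import io

# Hypothetical entries: one row per attribute field.
CODEBOOK = [
    {"field": "POPDEN_20", "label": "Population density, 2020",
     "type": "float", "units": "people per square km",
     "coding": "-9999 = no data"},
    {"field": "fireCause", "label": "Cause of fire",
     "type": "text", "units": "",
     "coding": "P = person, L = lightning, U = unknown"},
]


def codebook_csv(entries) -> str:
    """Render the codebook as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(entries[0]),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def undocumented(field_names, entries):
    """Return the fields in a dataset that the codebook does not describe."""
    documented = {entry["field"] for entry in entries}
    return [name for name in field_names if name not in documented]


print(codebook_csv(CODEBOOK))
print(undocumented(["POPDEN_20", "areaHa"], CODEBOOK))
```

Running the `undocumented` check whenever you add a field is one way to keep the codebook up to date.
</textarea>
</section>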
<section>
<h3 class="title-bottom">Structuring Directories</h3>
<p>folders organize data for you AND for others</p>
<p>✔️logical</p>
<p>✔️predictable</p>
<aside class="notes">
You should consider early on how you want to organize the folders that contain your data and documentation. This could be a date-based system, a filetype-based system, a project-based system, or any other predictable system that makes logical sense in the context of your data.
</aside>
</section>
<section>
<h3 class="title-bottom">README files</h3>
<p>text files explaining a project or parts of a project so others know what it is</p>
<p>found in top-level directories of projects</p>
<p>can link to other docs or relevant information</p>
<aside class="notes">
Create a readme file to introduce your data to others, and to keep track of things yourself.
</aside>
</section>
<section>
<p><a href="https://www.makeareadme.com/" target="_blank">makeareadme.com</a></p>
<aside class="notes">
</aside>
</section>
<section>
<h4 class="title-bottom">Version Control</h4>
<p>version control software keeps track of file changes</p>
<p>essentially a database of changes</p>
<aside class="notes">
Version control systems are not required for your data management workflows, but they sure can help to keep track of changes.
</aside>
</section>
<section>
<h4 class="title-bottom">Version Control</h4>
<p>different types of systems for different industries</p>
<p><a href="https://git-scm.com/" target="_blank">Git</a> is very common and widely integrated</p>
<p><a href="https://kartproject.org/" target="_blank">Kart</a> is emerging but specific to geodata</p>
<aside class="notes">
There are a LOT of version control systems. If you decide you want to use one, consider the trends in your profession/industry.
</aside>
</section>
<section>
<h4 class="title-bottom">Data Preservation</h4>
<p>data preservation ensures long-term access to and use of data – beyond limits of media</p>
<p>includes procedures regarding <b>file formats</b>, <b>copyright and permissions</b>, <b>persistent storage and geographic location</b>, and <b>metadata</b>.</p>
<aside class="notes">
Data preservation refers to things beyond backing things up on a hard drive. This is a HUGE topic in libraries as we continually transition to digital collections that require long-term storage and use.
</aside>
</section>
<section>
<h4 class="title-bottom">Data Preservation - File formats</h4>
<p>decide which file formats are the most reliable and persistent for your data</p>
<p>prioritize platform-independent, character-based formats</p>
<p>prioritize UTF-8 character encoding</p>
<aside class="notes">
Have you ever had to purchase software or a license just to open a file? Have you tried to open a file in an obsolete format? Platform-independent, character-based (textual) formats tend to be MUCH more flexible in the long run than binary, software-specific file formats.
</aside>
</section>
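<section data-markdown>
<textarea data-template>
If you inherit a text file in an older encoding, re-saving it as UTF-8 is usually a short job. A minimal Python sketch follows, assuming you already know the source encoding (here `latin-1`, chosen only for the example). The file names and the `convert_to_utf8` helper are hypothetical.

```python
import tempfile
from pathlib import Path


def convert_to_utf8(source: Path, target: Path, source_encoding: str) -> None:
    """Read text in a known legacy encoding and write it back as UTF-8."""
    text = source.read_text(encoding=source_encoding)
    target.write_text(text, encoding="utf-8")


# Demonstration with a throwaway folder and a tiny made-up CSV.
with tempfile.TemporaryDirectory() as folder:
    legacy = Path(folder) / "places_latin1.csv"
    legacy.write_bytes("name\nMontréal\n".encode("latin-1"))

    converted = Path(folder) / "places_utf8.csv"
    convert_to_utf8(legacy, converted, "latin-1")

    round_trip = converted.read_text(encoding="utf-8")
    print(round_trip)
```

Guessing the source encoding wrong produces garbled characters, so record the encoding in your technical metadata whenever you know it.
</textarea>
</section>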
<section>
<h4 class="title-bottom">Now let's talk about metadata</h4>
<p>metadata describes your data so it can be used, shared, and understood widely</p>
<aside class="notes">
Metadata is information that describes information (or anything really – think about product labels). It's super important to understand the value of metadata and how it can continually benefit you, your data, and users of your data.
</aside>
</section>
<section>
<h4 class="title-bottom">Metadata in Plain Language</h4>
<p>Questions you need to be prepared to answer about your data:</p>
<p><a href="https://geology.usgs.gov/tools/metadata/tools/doc/ctc/" target="_blank">USGS Metadata in Plain Language</a></p>
<aside class="notes">
Take a look at these questions. These are things about your data that you should have documented
</aside>
</section>
<section>
<h4 class="title-bottom">Examples</h4>
<p><a href="https://catalogue.data.gov.bc.ca/dataset/fire-incident-locations-historical" target="_blank">metadata formatted for web discovery</a></p>
<p><a href="https://raw.githubusercontent.com/OpenGeoMetadata/edu.stanford.purl/master/sv/613/mt/4586/iso19139.xml" target="_blank">xml-encoded metadata</a></p>
<aside class="notes">
Here are two examples of metadata records for geodata. The first link goes to a web page with metadata elements formatted for discovery. Metadata elements are there (title, contact info, description, purpose, etc.), but they've been styled for human readability. The second link is plain XML – meant for programmatic discovery and transformation.
</aside>
</section>
<section>
<h4 class="title-bottom">Difficulties</h4>
<p>frankly, metadata is pretty boring</p>
<p>it takes a lot of time</p>
<p>lots of standards, no clear best choice</p>
<aside class="notes">
It's true. Metadata is not a sexy topic, and there are lots of more exciting things you'd rather do with your data than document and describe it.
</aside>
</section>
<section>
<h4 class="title-bottom">bad metadata negatively affects:</h4>
<ul>
<li>integrity</li>
<li>discoverability</li>
<li>preservability</li>
<li>usability</li>
</ul>
<aside class="notes">
If you don't create good metadata, it will have significant negative effects on your data and your credibility.
</aside>
</section>
<section>
<h4 class="title-bottom">4 main metadata types</h4>
<ol>
<li>descriptive</li>
<li>technical</li>
<li>discovery</li>
<li>administrative</li>
</ol>
<aside class="notes">
You can think about metadata in these 4 categories or types.
</aside>
</section>
<section>
<h4 class="title-bottom">Descriptive Metadata</h4>
<p>includes things like:</p>
<ul>
<li>abstract/methodology</li>
<li>attribute descriptions</li>
<li>purpose</li>
<li>uncertainty errors</li>
<li>access</li>
</ul>
<aside class="notes">
Descriptive metadata is pretty self explanatory. This gives you and others all the necessary information describing your data or study, including how it was derived and uncertainty errors.
</aside>
</section>
<section>
<h4 class="title-bottom">Technical Metadata</h4>
<p>includes things like:</p>
<ul>
<li>CRS / projection / datum</li>
<li>attribute data types</li>
<li>software used</li>
<li>character encoding</li>
</ul>
<aside class="notes">
Technical metadata is highly subject-specific. Different industries may have different technical metadata requirements.
</aside>
</section>
<section>
<h4 class="title-bottom">Discovery Metadata</h4>
<p>includes things like:</p>
<ul>
<li>title</li>
<li>date</li>
<li>keywords</li>
<li>geographic extent</li>
</ul>
<aside class="notes">
Discovery metadata is used to index and link your data with others. When you search and filter data by subject, you're using discovery metadata.
</aside>
</section>
<section>
<h4 class="title-bottom">Administrative Metadata</h4>
<p>includes things like:</p>
<ul>
<li>copyright</li>
<li>contact info</li>
<li>status</li>
</ul>
<aside class="notes">
</aside>
</section>
<section>
<h3 class="title-bottom">Metadata Standards</h3>
<p>why have metadata standards?</p>
<ul>
<li>ease transformation/conversion</li>
<li>ensure proper interpretation</li>
</ul>
<aside class="notes">
You all probably understand the general need for standards, and those same things can apply to metadata. Yes, there are a lot of metadata standards, and that's ok. They can also be flexible.
</aside>
</section>
<section>
<h3 class="title-bottom">Metadata Standards</h3>
<p>2 main geospatial metadata standards</p>
<ul>
<li><a href="https://www.fgdc.gov/standards" target="_blank">FGDC</a></li>
<li><a href="https://www.iso.org/standard/53798.html" target="_blank">ISO 19115</a>/<a href="https://www.iso.org/standard/67253.html" target="_blank">19139</a></li>
</ul>
<aside class="notes">
In GIS there are two widely used standards that have rules for what should be included in metadata records, how it is described, and how it's structured/encoded.
</aside>
</section>
<section>
<h3 class="title-bottom">Metadata Standard - ISO</h3>
<p>flexible and internationally recognized</p>
<p>generally recommended</p>
<p>complex</p>
<p>documentation costs money</p>
<aside class="notes">
ISO has recently emerged as the recommended choice between the two due to its flexibility, comprehensiveness, and wide recognition. Complete documentation for ISO standards does cost money, but there are tools you can use to automatically encode your metadata to ISO standards.
</aside>
</section>
<section>
<p>don't worry – there are several tools to help you create and edit metadata!</p>
<aside class="notes">
</aside>
</section>
<section>
<h3 class="title-bottom">Metadata tools and editors</h3>
<p><a href="https://pro.arcgis.com/en/pro-app/help/metadata/best-practices-for-editing-metadata.htm" target="_blank">ArcGIS Pro</a></p>
<p><a href="http://catmdedit.sourceforge.net/" target="_blank">catMDEdit</a></p>
<p><a href="https://www.mdeditor.org/" target="_blank">mdEditor.org</a></p>
<p><a href="https://geonetwork-opensource.org/" target="_blank">GeoNetwork</a></p>
<p><a href="https://wiki.osgeo.org/wiki/Metadata_software" target="_blank">and more!!</a></p>
<aside class="notes">
Probably the most common way to create and edit geographic metadata is with desktop GIS software, but there are other tools you can use if you are not using a GIS.
</aside>
</section>
<section>
<p>creating metadata can be tedious. But remember: metadata will make your data more reproducible, sharable, and impactful.</p>
<aside class="notes">
</aside>
</section>
<section>
<p>motivational quote:</p>
<q>metadata is a love note to the future</q>
<aside class="notes">
No doubt you will thank yourself later.
</aside>
</section>
<section>
<h3><span class="lightblue">Thanks!</span></h3>
<p>Evan Thornberry</p>
<p>evan.thornberry@ubc.ca</p>
</section>
</div>
</div>
<script src="js/reveal.js"></script>
<script>
// More info about config & dependencies:
// - https://github.com/hakimel/reveal.js#configuration
// - https://github.com/hakimel/reveal.js#dependencies
Reveal.initialize({
dependencies: [
{ src: 'plugin/markdown/marked.js' },
{ src: 'plugin/markdown/markdown.js' },
{ src: 'plugin/notes/notes.js', async: true },
{ src: 'plugin/highlight/highlight.js', async: true }
]
});
</script>
</body>
</html>