UC Davis DataLab
Spring 2024
Instructor: Michele Tobias <mmtobias@ucdavis.edu>
Maintainer: Michele Tobias <mmtobias@ucdavis.edu>
Authors: Michele Tobias <mmtobias@ucdavis.edu> & Naomi Kalman <nbkalman@ucdavis.edu>
This workshop introduces participants to working with spatial data using SQL. We will work with a graphical user interface (GUI), explore examples of common analysis processes, and present resources for continued learning, giving participants a solid foundation on which to build.
By the end of this workshop, participants will be able to:
- Import data into a SpatiaLite database
- Write queries to answer questions about spatial data
- Understand the difference between attribute queries and geometry queries
- View spatial tables and views in QGIS
- Use terminology related to spatial databases to facilitate future learning
An introductory understanding of SQL is recommended, but not mandatory. For example, knowing how to compose a SELECT statement using SQL and the general concept of joining tables will serve learners well. For learners without a foundation in SQL, we recommend attending or reviewing the materials for DataLab's Introduction to Databases and Data Storage Technologies, which introduces the concept of databases, and Intro to SQL for Querying Databases, which teaches the basics of querying data using SQL.
The course reader is a live webpage, hosted through GitHub, where you can enter curriculum content and post it to a public-facing site for learners.
To make alterations to the reader:
1. Check in with the reader's current maintainer and notify them about your intended changes. Maintainers might ask you to open an issue, use pull requests, tag your commits with versions, etc.

2. Run `git pull`, or if it's your first time contributing, see Setup.

3. Edit an existing chapter file or create a new one. Chapter files are R Markdown files (`.Rmd`) at the top level of the repo. Enter your text, code, and other information directly into the file. Make sure your file:

   - Follows the naming scheme `##_topic-of-chapter.Rmd` (the only exception is `index.Rmd`, which contains the reader's front page).
   - Begins with a first-level header (like `# This`). This will be the title of your chapter. Subsequent section headers should be second-level headers (like `## This`) or below.
   - Uses caching for resource-intensive code (see Caching).

   Put any supporting resources in `data/` or `img/`. For large files, see Large Files. You do not need to add resources generated by your R code (such as plots). The knit step saves these in `docs/` automatically.

4. Run `knit.R` to regenerate the HTML files in `docs/`. You can do this in the shell with `./knit.R` or in R with `source("knit.R")`. If you would like to include a PDF copy of the reader, add the flag `-p` (or `--pdf`). This will generate a `docs/_main.pdf` file, available for download via the menu bar in the reader.

5. Run `renv::snapshot()` in an R session at the top level of the repo to automatically add any packages your code uses to the project package library.

6. When you're finished, `git add`:

   - Any files you added or edited directly, including in `data/` and `img/`
   - `docs/` (all of it)
   - `_bookdown_files/` (contains the knitr cache)
   - `renv.lock` (contains the renv package list)

   Then `git commit` and `git push`. The live web page will update automatically after 1-10 minutes.
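
The knit, snapshot, and commit steps above can be sketched as a small shell script. This is a minimal sketch, not part of the repo: the function names `is_valid_chapter_name`, `run`, and `publish_reader` and the `DRY_RUN` variable are hypothetical. The name check assumes that `##` in the naming scheme stands for two digits, and calling `renv::snapshot()` through `Rscript -e` is an assumption (the steps above run it in an R session).

```shell
# Hypothetical helper: succeed if a file name follows ##_topic-of-chapter.Rmd
# (assuming ## means two digits) or is the front page, index.Rmd.
is_valid_chapter_name() {
  case "$1" in
    index.Rmd) return 0 ;;
    [0-9][0-9]_?*.Rmd) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Usage: publish_reader "commit message" FILE...
# FILE... are the files you added or edited directly.
# Run from the top level of the repo, after `git pull` and your edits.
publish_reader() {
  commit_message="$1"
  shift
  # Check top-level .Rmd files against the naming scheme before knitting.
  for path in "$@"; do
    case "$path" in
      */*) ;;  # files in data/ or img/ are not chapter files
      *.Rmd)
        if ! is_valid_chapter_name "$path"; then
          echo "Chapter file name does not follow the scheme: $path" >&2
          return 1
        fi
        ;;
    esac
  done
  run ./knit.R &&
    run Rscript -e 'renv::snapshot()' &&
    run git add -- "$@" docs/ _bookdown_files/ renv.lock &&
    run git commit -m "$commit_message" &&
    run git push
}
```

To preview the commands without running them, set `DRY_RUN=1` and then call, for example, `publish_reader "Add intro chapter" 01_intro.Rmd`. Note that `git add` stops with an error if one of the listed paths does not exist, for example when no chunk is cached and `_bookdown_files/` is absent.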
If one of your code chunks takes a lot of time or memory to run, consider
caching the result, so the chunk won't run every time someone knits the
reader. To cache a code chunk, add `cache=TRUE` in the chunk header. It's
best practice to label cached chunks, like so:
```{r YOUR_CHUNK_NAME, cache=TRUE}
# Your code...
```
Cached files are stored in the `_bookdown_files/` directory. If you ever want
to clear the cache, you can delete this directory (or its subdirectories).
The cache will be rebuilt the next time you knit the reader.
Beware that caching doesn't work with some packages, especially packages that use external libraries. Because of this, it's best to leave caching off for code chunks that are not resource-intensive.
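
Clearing the cache from the shell might look like the following. This is a minimal sketch: the function name `clear_knit_cache` and its optional repo-path argument are hypothetical, and it removes the whole `_bookdown_files/` directory rather than individual subdirectories.

```shell
# Hypothetical helper: delete the knitr cache so the next knit rebuilds
# every cached chunk. The argument is the repo's top-level directory
# and defaults to the current directory.
clear_knit_cache() {
  repo_root="${1:-.}"
  cache_dir="$repo_root/_bookdown_files"
  if [ -d "$cache_dir" ]; then
    rm -rf -- "$cache_dir"
    echo "Removed $cache_dir"
  else
    echo "No cache found at $cache_dir"
  fi
}
```

Calling `clear_knit_cache` at the top level of the repo and then knitting again rebuilds the cache from scratch.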
GitHub Actions can be set up to render your reader automatically when you push new content to the repo. To use this feature, download the materials in datalab-dev/utilities/render_bookdown_site and follow the instructions there.
This repo uses renv for package management. Install renv according to the installation instructions on its website.
Then open an R session at the top level of the repo and run `renv::restore()`.
This will download and install the correct versions of all the required packages to renv's package library. This is separate from your global R package library and will not interfere with other versions of packages you have installed.
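
If you prefer to stay in the shell, the restore step could be wrapped as below. This is a sketch under assumptions: the function name `restore_packages` is hypothetical, and it assumes `Rscript` is on your `PATH` and renv is already installed. It checks for `renv.lock` first so that it fails early when run outside the repo.

```shell
# Hypothetical helper: restore the project package library from the shell.
# The argument is the repo's top-level directory (default: current directory).
restore_packages() {
  repo_root="${1:-.}"
  if [ ! -f "$repo_root/renv.lock" ]; then
    echo "renv.lock not found in $repo_root" >&2
    return 1
  fi
  # Run in a subshell so the caller's working directory is unchanged.
  (cd "$repo_root" && Rscript -e 'renv::restore()')
}
```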