Skip to content

Latest commit

 

History

History
123 lines (86 loc) · 5.78 KB

README.md

File metadata and controls

123 lines (86 loc) · 5.78 KB

Workshop: Introduction to SQL for Querying Databases

UC Davis DataLab
Spring 2022
Instructor: Michele Tobias
Authors: Michele Tobias, Nicholas Alonzo & Nick Ulle
Editors: Nick Ulle, Pamela Reynolds, and Elise Hellwig

This workshop teaches the basics of SQL using SQLite and SQLiteStudio. This workshop provides an overview of the utility and base SQL commands for working with data in a relational database. We’ll focus on querying data to get to know a database and answer questions, and joining data from separate tables.

Goals

After this workshop learners should be able to:

  • Describe the advantages and disadvantages of using SQL with your own data.
  • Use SQL queries to view, filter, aggregate, and combine data.
  • Combine SQL keywords to develop sophisticated queries.
  • Use SQL queries to solve problems with and answer questions about data.
  • Identify additional resources for learning more about SQL, such as how to use SQL with the R programming language.

Prerequisites

No prior programming experience is necessary. We recommend learners either attend or review the written materials for DataLab'sOverview of Databases & Data Storage Technologies workshop.

Before the workshop, learners should:

Please see these recommendations for making SQLiteStudio easier to read, particularly for those with low vision and those who use a screen reader.

Data Disclosure and Appropriate Use

The database in this lesson is based on a subset of data provisioned courtesy of the UC Davis Shields Library in 2024. All unique patron identifiers have been removed. Checkouts have been assigned a deidentified patron ID value. Use of these data is restricted to educational and operational purposes and is not intended for research. Patron groupings with fewer than 5 unique patron IDs have been lumped into broader categories to maintain privacy. If you have questions regarding the dataset and use, please contact us at datalab-training@ucdavis.edu or the Library's Scholarly Communications Officer at mladisch@ucdavis.edu.

Contributing

The course reader is a live webpage, hosted through GitHub, where you can enter curriculum content and post it to a public-facing site for learners.

To make alterations to the reader:

  1. Run git pull, or if it's your first time contributing, see Setup.

  2. Edit an existing chapter file or create a new one. Chapter files are R Markdown files (.Rmd) at the top level of the repo. Enter your text, code, and other information directly into the file. Make sure your file:

    • Follows the naming scheme ##_topic-of-chapter.Rmd (the only exception is index.Rmd, which contains the reader's front page).
    • Begins with a first-level header (like # This). This will be the title of your chapter. Subsequent section headers should be second-level headers (like ## This) or below.
    • Uses caching for resource-intensive code (see Caching).

    Put any supporting resources in data/ or img/. For large files, see Large Files. You do not need to add resources generated by your R code (such as plots). The knit step saves these in docs/ automatically.

  3. Run knit.R to regenerate the HTML files in the docs/. You can do this in the shell with ./knit.R or in R with source("knit.R").

  4. Run renv::snapshot() in an R session at the top level of the repo to automatically add any packages your code uses to the project package library.

  5. When you're finished, git add:

    • Any files you added or edited directly, including in data/ and img/
    • docs/ (all of it)
    • _bookdown_files/ (contains the knitr cache)
    • renv.lock (contains the renv package list)
Then `git commit` and `git push`. The live web page will update
automatically after 1-10 minutes.

Caching

If one of your code chunks takes a lot of time or memory to run, consider caching the result, so the chunk won't run every time someone knits the reader. To cache a code chunk, add cache=TRUE in the chunk header. It's best practice to label cached chunks, like so:

```{r YOUR_CHUNK_NAME, cache=TRUE}
# Your code...
```

Cached files are stored in the _bookdown_files/ directory. If you ever want to clear the cache, you can delete this directory (or its subdirectories). The cache will be rebuilt the next time you knit the reader.

Beware that caching doesn't work with some packages, especially packages that use external libraries. Because of this, it's best to leave caching off for code chunks that are not resource-intensive.

Back to Top