---
title: "R workshop - get started"
date-modified: 'today'
date-format: long
license: CC BY-NC
bibliography: references.bib
---
**Let's learn R**
::: column-margin
[![](images/DALL-E_2023-07-13_learn_R.png){fig-alt="Learn R"}](images/DALL-E_2023-07-13_learn_R.png)
:::
- Basic
    - RStudio **projects** and **reproducibility** with Quarto
    - Import, wrangle, and summarize data
    - Visualize --- plus simple *interactivity*
    - Tidy data --- plus *pivoting* and *joining* data frames
    - Regression --- just a taste
    - An example of joining data and Exploratory Data Analysis
- Advanced
    - Interactive visualization with ObservableJS
    - Custom functions
    - Iteration with {purrr}
    - Tidymodels
::: callout-tip
## Quarto and reproducibility
This is a [Quarto website](https://quarto.org/docs/websites "quarto website documentation") authored with [R](https://www.r-project.org/ "download R") and the [RStudio IDE](https://posit.co/products/open-source/rstudio/ "Download RStudio"). Quarto is part of your reproducible workflow and helps you render your analysis into various formats (PDF, Word, website, interactive documents, etc.).
:::
## First steps
1. Tour of your local file system
    - R / RStudio
    - Clone code from GitHub
2. Start a new project by importing the GitHub repository for this workshop
    - <https://github.com/data-and-visualization/Intro2R.git>
3. Open the `import.qmd` notebook file from the *Files* tab in the RStudio IDE
## Download Project (data and code)
All data and code are readily available for this workshop.
### Data
Whenever possible, this workshop uses on-board datasets, so you may not have to import any data at all. For most of the data you only need to [load a library](packages.html#load-packages). For example, the `starwars` dataset is part of the {dplyr} package: run `library(dplyr)` to access the `starwars` data frame. Alternatively, you can access the data with the fully qualified *package::object* syntax, as in `dplyr::starwars`.
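
As a minimal sketch of the two access styles described above, assuming the {dplyr} package is already installed (the `same_data` name is only for illustration):

```r
# Attach dplyr so its bundled datasets can be referenced by name
library(dplyr)

# Style 1: after library(dplyr), the data frame is available directly
head(starwars, 3)

# Style 2: the package::object form, which also works without
# calling library(dplyr) first
head(dplyr::starwars, 3)

# Both styles refer to the same object
same_data <- identical(starwars, dplyr::starwars)
same_data
```

The `package::object` form is handy when you need a single object from a package and would rather not attach the whole package.
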
Some datasets are imported with the `read_csv` function. To get them, download the data and code from GitHub (see the GitHub icon on the right-hand side of this webpage header) and expand the zipped file. You'll find a data folder with the CSV files used in this workshop. Refer to the [import](import.html) page for tips on importing data.
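
A hedged sketch of a CSV import, assuming the {readr} package is installed. The file written here is a made-up stand-in so the example runs anywhere; it is not one of the workshop files, and the names `csv_path` and `my_data` are hypothetical.

```r
library(readr)

# Hypothetical stand-in for a workshop CSV: write a tiny file to a
# temporary location so this sketch is self-contained
csv_path <- tempfile(fileext = ".csv")
write_csv(
  data.frame(name = c("Luke", "Leia"), height = c(172, 150)),
  csv_path
)

# Import the CSV into a data frame (a tibble);
# show_col_types = FALSE silences the column-type message
my_data <- read_csv(csv_path, show_col_types = FALSE)
my_data
```

In the workshop project you would pass the path of a file inside the data folder to `read_csv` instead of `csv_path`.
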
### Code
Download the data and code from GitHub (see the GitHub icon on the right-hand side of this webpage header) and expand the zipped file. Double-click the `Intro2R.Rproj` file to open all of the code in an RStudio project.
::: column-margin
![Download zipped code from GitHub](images/github_download.jpg){fig-alt="download code"}
:::
Or use this link to [the GitHub repo](https://github.com/data-and-visualization/Intro2R.git "github code repo"), then:
1. Click the green code button
2. Download ZIP
3. Unzip and double-click the `Intro2R.Rproj` file
## Pipes and Assignment
Some coding syntax is particular to R and the Tidyverse. You will see these symbols frequently.
::: column-margin
```{=html}
<iframe width="300" height="200" src="https://www.youtube.com/embed/FK5UKBT-8iw" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
```
:::
::: callout-note
## `<-` Assignment - "*gets value from*"
By convention we use two assignment symbols when creating object names
> `<-` typically used at the beginning of a pipeline or function
>
> `=` typically used within the `mutate` function
**Keystroke**:\
Alt-dash (Option-dash on macOS)
**Example**:\
`my_vector <- c(2, 4, 6:9)`
:::
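
The following sketch illustrates both assignment conventions from the note above, assuming {dplyr} is installed. It uses the built-in `cars` data frame; `fast_cars` and `speed_doubled` are made-up names for illustration.

```r
library(dplyr)

# `<-` creates an object: my_vector "gets value from" c(...)
my_vector <- c(2, 4, 6:9)
my_vector

# `<-` at the beginning of a pipeline, `=` inside mutate()
fast_cars <- cars |>
  mutate(speed_doubled = speed * 2)

head(fast_cars, 3)
```
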
::: callout-note
## `%>%` Pipes and pipelines - "*and then*"
We can create *data sentences*, or **pipelines**, by chaining many functions together from left to right.
There are three main pipe symbols:
> `|>` or `%>%`
>
> `+` only used in ggplot2 pipelines
**Keystroke**:\
Ctrl-Shift-M (Windows/Linux) or Cmd-Shift-M (macOS)
**Example:**\
`cars |> select(speed)`
:::
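
To make the "and then" reading concrete, here is a small sketch comparing nested function calls with the equivalent pipelines, assuming {dplyr} is installed. The object names `nested`, `piped_native`, and `piped_magrittr` are hypothetical.

```r
library(dplyr)

# Nested calls read inside-out: select() runs first, then head()
nested <- head(select(cars, speed), 3)

# The same steps as a pipeline read left to right:
# take cars, and then select speed, and then keep the first 3 rows
piped_native <- cars |>
  select(speed) |>
  head(3)

# The %>% pipe (made available by dplyr) expresses the same pipeline
piped_magrittr <- cars %>%
  select(speed) %>%
  head(3)

# Check that all three approaches give the same result
identical(nested, piped_native)
identical(nested, piped_magrittr)
```
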
## More workshops
[This workshop and many others](https://library.duke.edu/data/workshops) covering topics in data science, visualization, mapping and GIS, and data management are hosted each semester by the **Center for Data and Visualization Sciences** (CDVS). Workshops are recorded, and the code, data, slides, and recordings are shared. You can find the [R-specific resources at our Rfun](https://rfun.library.duke.edu) site. Meanwhile, *all CDVS workshop* materials, including duplicates of the R materials found at Rfun, are available at our [online learning page](https://library.duke.edu/data/workshops).