---
title: "Lesson"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# Benefits of coding

- Reproducible
- Accurate

# RStudio

RStudio is an integrated development environment (IDE) for R.

Panes:

- Console (like R x64 4.0.4)
- Source (R files)
- Environment, History, Connections, Tutorials
- Files, Plots, Packages, Help, Viewer

# R Markdown

- Script: a text file for storing and running code (comments start with a hash mark, `#`)
- R Markdown: an interactive file for writing anything from notes to entire manuscripts
    - contains code "chunks" for storing and running code
    - can be easily converted to (knitted into) an HTML file, a Word document, or a PDF

# Variables, Vectors, and Functions

Definitions:

- Variable: an object that stores information, e.g. a number, text, or a vector
- Vector: a data structure that contains multiple elements of the same type (class)
- Function: an object that takes arguments (e.g. numbers or vectors) as inputs, performs a task on those arguments, and sometimes has an output. The output of a function can be stored as a variable.

# Code from script:
```{r}
3+2 #add 3+2
mean(c(1,2,3)) #take the mean of 1, 2, and 3
y <- 3+2 #store the sum of 3 and 2 as a variable called y
y #print y
remove(y) #remove y from the environment
y <- c(1,2,3) #store the vector of 1, 2, and 3 as a variable called y
mean(y) #Use the function, mean(), to calculate the mean of vector y (to calculate the mean of 1, 2, and 3) and print the answer (aka output of the function)
z <- mean(y) #Store the output from mean(y) in the variable, z
```
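
The definition of a vector above says that all elements share one type (class). As a minimal sketch (the variable names `nums` and `mixed` are made up for illustration), `class()` reports that type, and mixing types makes R convert every element to a single common type:

```{r}
nums <- c(1, 2, 3)        # a numeric vector
class(nums)               # "numeric"

mixed <- c(1, "a", TRUE)  # mixed types: every element is converted to text
class(mixed)              # "character"
mixed                     # "1" "a" "TRUE"
```

This is why a column of numbers that contains a single piece of text is treated as text.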
Character strings
Examples:
```{r}
```
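
A few illustrative examples of character strings follow. The names and values are made up and are not taken from the lesson data. Text goes inside quotes, and functions such as `nchar()` and `paste()` work on it:

```{r}
species <- "little brown bat"             # hypothetical example value
class(species)                            # "character"
nchar(species)                            # number of characters, spaces included: 16

label <- paste("Species:", species)       # join pieces of text with a space between them
label                                     # "Species: little brown bat"

sites <- c("north", "south")              # a vector of character strings
toupper(sites)                            # "NORTH" "SOUTH"

# "5" + 2 would stop with an error: text is not a number, even if it looks like one
five_plus_two <- as.numeric("5") + 2      # convert the text to a number first
five_plus_two                             # 7
```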

**Exercises:**

1. Make a new variable containing the number 13
2. Make a new variable equal to 1345 times the variable above (do the calculation in code)
3. Make new variables containing your name and your friend's name

```{r}
```

# Directories and R projects

- R works with your computer's file system.
- Where does R think we are?

```{r}
getwd()
```

- R Project: a file that points R at whatever folder the file is saved in

The final pane:

- Files
    - Mini file explorer for R Projects
- Now, after creating a project file, where does R think we are?

```{r}
getwd()
```
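
To make the link between the working directory and file paths concrete, here is a small sketch. It assumes a folder called `Data` containing `Bats_data.csv` inside the project folder, which is the layout implied by the file path used when reading the data in this lesson. `relative_path` and `full_path` are illustrative names only.

```{r}
# A path written relative to the working directory
relative_path <- file.path("Data", "Bats_data.csv")
relative_path                      # "Data/Bats_data.csv"

# Where R will actually look: the working directory plus the relative path
full_path <- file.path(getwd(), relative_path)
full_path

# TRUE only if the file really is in that place on your computer
file.exists(relative_path)
```

If `file.exists()` returns `FALSE`, the usual cause is that the working directory is not the project folder.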

# Packages and the Tidyverse

Package: a collection of functions. Note that some packages bundle multiple packages (multiple collections of functions), e.g. tidyverse.

- Install a package by using `install.packages("package-name")`
- Update all installed packages by using `update.packages()`, or a single package by using `update.packages(oldPkgs = "package-name")`
- Load a package by using `library("package-name")`
- tidyverse is a useful collection of packages
- It is possible to install packages from places other than CRAN (e.g. GitHub, R-Forge, your local computer)
- Ways to install

Good instructions [here if needed](https://datacarpentry.org/R-ecology-lesson/#setup_instructions)

Install tidyverse

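```{r, eval=FALSE}
# Install once per computer; eval=FALSE stops this from re-running every time the file is knitted
install.packages("tidyverse")
```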
Load tidyverse into R: Even though it is installed, R cannot use any of the functions from the tidyverse package until it is loaded into the session.
```{r}
library(tidyverse)
```

You must reload packages every session, so it is easiest to write the `library()` calls for the packages you need at the top of your R Markdown file or script and run them at the beginning of each session.
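
The difference between installed and loaded can be checked from code. This is a sketch using `tools`, a package that ships with R, so nothing needs to be downloaded. The same calls should work with `"tidyverse"` in place of `"tools"` once it is installed. The variable names `installed` and `loaded` are illustrative.

```{r}
# Can the package be found on this computer?
installed <- requireNamespace("tools", quietly = TRUE)
installed

# Load it for this session
library(tools)

# search() lists what has been loaded with library() in this session
loaded <- "package:tools" %in% search()
loaded
```

After restarting R, the first check still returns `TRUE`, but the package has to be loaded with `library()` again.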

The final pane:

- Packages
    - See all installed packages
    - Loaded packages will have a tick mark next to them

# CSV files and Data Frames

```{r}
read_csv("Data/Bats_data.csv")
```

**Exercises:** Breakout Session

1. Store Bats_data.csv in a variable, e.g. `bats`
    Hint: You'll need the code above and the assignment arrow (`<-`).
2. Use the `View()` function to look at the data, e.g. `View(bats)`
    Note: `View()` is a function that takes one argument. You can use your variable as the argument, just as we did with `mean(y)` when `y` was the variable name.

```{r}
```

# Manipulating Data

Each column in a data frame is a vector.

- `$`: extracts a vector (a column) from the data frame

Examples:

```{r}
```
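
A small sketch of `$` on a made-up data frame. The `toy` data frame and its columns `site` and `count` are hypothetical and are not the bats data.

```{r}
toy <- data.frame(
  site  = c("A", "B", "C"),
  count = c(4, 10, 7)
)

toy$count          # the count column as a vector: 4 10 7
class(toy$count)   # "numeric"
mean(toy$count)    # the extracted vector works in functions like any other vector: 7
```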
## Logicals in R

- `==` Equal to
- `<` Less than
- `>` Greater than
- `<=` Less than or equal to
- `>=` Greater than or equal to
- `|` or
- `&` and
- `!` not

Examples:
```{r}
```
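
As a sketch of the operators listed above, applied to a made-up vector `x` (the numbers are for illustration only). Each comparison is done element by element and returns a vector of `TRUE` and `FALSE` values:

```{r}
x <- c(2, 5, 8)

x == 5            # FALSE  TRUE FALSE
x > 4             # FALSE  TRUE  TRUE
x <= 5            #  TRUE  TRUE FALSE
x < 3 | x > 6     #  TRUE FALSE  TRUE   (at least one condition is true)
x > 3 & x < 6     # FALSE  TRUE FALSE   (both conditions are true)
!(x == 5)         #  TRUE FALSE  TRUE   (swaps TRUE and FALSE)
```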

## filter(.data, ..., .preserve = FALSE)

- `.data`: your data. This needs to be a data frame.
- `...`: one or more expressions that return a logical value and are defined in terms of the variables (column names) in `.data`.

Examples:

```{r}
```
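
A minimal sketch of `filter()` on a made-up data frame. The names `toy`, `site`, `count`, and `high_a` are hypothetical and are not part of the bats data. It assumes the dplyr package, which is loaded as part of the tidyverse.

```{r}
library(dplyr)

toy <- data.frame(
  site  = c("A", "A", "B", "B"),
  count = c(4, 12, 7, 15)
)

# Keep the rows where count is greater than 10
filter(toy, count > 10)

# Expressions separated by commas must all be TRUE for a row to be kept (like &)
high_a <- filter(toy, site == "A", count > 10)
high_a

# | keeps rows where at least one expression is TRUE
filter(toy, site == "B" | count < 5)
```

The column names are written without quotes and without `toy$`, because `filter()` looks them up inside the data frame given as `.data`.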

**Exercises:** Breakout Session

Find the maximum Activity observed on 7/01/2013

1. Filter by Date for 7/01/2013
2. Save the filtered data as a new variable, e.g. `filtered_bats`
3. Use the `max()` function on the column Activity from your new data frame. Hint: use `$` to access the vector

```{r}
```

# Visualizing data: Teaser for future workshops

```{r}
# The examples below assume the data has been stored in a variable called bats (see the exercises above)

## Boxplot of Treatment_thinned on the x-axis and Activity on the y-axis
bats %>%
  ggplot() + # If there is no mapping in ggplot(), each geom_ function needs its own mapping statement, see below
  geom_boxplot(mapping = aes(x = Treatment_thinned, y = Activity)) # boxplot

# Boxplots separated out by date
bats %>%
  ggplot(mapping = aes(x = Treatment_thinned, y = Activity)) +
  geom_boxplot() + # boxplot
  facet_wrap(facets = vars(Date)) # creates a new boxplot for each unique date

# or....
plot <- bats %>%
  ggplot(mapping = aes(x = Treatment_thinned, y = Activity)) +
  geom_boxplot() # boxplot
plot +
  facet_wrap(facets = vars(Date)) # Tip: facets = ~Date is shorthand for facets = vars(Date).

#_______________________________________________________________________________
## Scatter plot of log(Foraging) on the x-axis and Activity on the y-axis
plot2 <- bats %>%
  ggplot(mapping = aes(x = log(Foraging), y = Activity)) + # if the mapping is in ggplot(), x and y are handed to any geom_ functions afterwards
  geom_point() + # scatter plot
  geom_smooth(method = "lm", se = FALSE) # add the straight line fitted by a linear model (linear regression), without its confidence band
plot
plot2
```
# Resources

Personal Recommendations:

- ["RStudio Primers" Step by step interactive tutorials](https://rstudio.cloud/learn/primers)
- ["R Resources for Beginners"](https://unsw-coders.netlify.app/resources/2021-03-22-beginner-resources/)
- ["Functions"](https://www.stat.berkeley.edu/~statcur/Workshop2/Presentations/functions.pdf)

Creating and manipulating variables

- ["Making new variables" on environmentalcomputing](http://environmentalcomputing.net/making-new-variables/)
- ["Creating R objects" on Data Carpentry](https://datacarpentry.org/R-ecology-lesson/01-intro-to-r.html)
- [Software Carpentry's intro to RStudio](http://swcarpentry.github.io/r-novice-gapminder/01-rstudio-intro/index.html)

A few resources for loading other file types

- [readxl](https://readxl.tidyverse.org/reference/read_excel.html): for reading data directly from Excel
- [datapasta](https://cran.r-project.org/web/packages/datapasta/): for copying and pasting data from the web
- [googlesheets4](https://googlesheets4.tidyverse.org/): for reading data from Google Sheets

Online courses:

- [Environmental Computing](http://environmentalcomputing.net/)
- [Software Carpentry's R novice lesson](http://swcarpentry.github.io/r-novice-gapminder/)
- [Data Carpentry's lesson on R for ecology](https://datacarpentry.org/R-ecology-lesson/)
- [A paper looking at how much R is used in Ecology](https://esajournals.onlinelibrary.wiley.com/doi/full/10.1002/ecs2.2567)
- [The tidyverse web page](https://www.tidyverse.org/)
- [Hadley Wickham's book R for Data Science](https://r4ds.had.co.nz/)

Learning to code and need inspiration?

- [Anyone can code](https://www.youtube.com/watch?v=qYZF6oIZtfc&list=PLzdnOPI1iJNe1WmdkMG-Ca8cLQpdEAL7Q)
- [Coding is the new literacy](https://www.youtube.com/watch?v=MwLXrN0Yguk&list=PLzdnOPI1iJNe1WmdkMG-Ca8cLQpdEAL7Q)
- [What most schools don't teach](https://www.youtube.com/watch?v=nKIu9yen5nc&feature=c4-overview-vl&list=PLzdnOPI1iJNe1WmdkMG-Ca8cLQpdEAL7Q)
- When coding saves you time, [from XKCD](http://xkcd.com/1205/)

Why good code matters

- [Why I want to write nice R code](http://nicercode.github.io/blog/2013-04-05-why-nice-code/)
- [Science has a credibility problem](http://www.economist.com/news/leaders/21588069-scientific-research-has-changed-world-now-it-needs-change-itself-how-science-goes-wrong)