-
Notifications
You must be signed in to change notification settings - Fork 1
/
package_building_presentation.Rmd
364 lines (240 loc) · 9.44 KB
/
package_building_presentation.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
---
title: "Package Building"
subtitle: "with RStudio tools"
author: "Chris Mainey"
date: "2020/08/30 (updated: `r Sys.Date()`)"
output:
xaringan::moon_reader:
lib_dir: libs
css: xaringan-themer.css
nature:
ratio: '16:9'
highlightStyle: github
highlightLines: true
countIncrementalSlides: false
seal: false
---
class: title-slide right
```{r setup, include=FALSE}
options(htmltools.dir.version = FALSE)
knitr::opts_chunk$set(fig.width=10, fig.height=6, fig.align = "center", dpi=300,
dev.args = list(png = list(type = "cairo")), fig.retina=3)
library(Cairo)
```
<br><br><br>
# Why and how to build an R package
#### Chris Mainey
#### NHS North Central London ICB
<br><br>
[c.mainey1@nhs.net](mailto:c.mainey1@nhs.net) `r icons::fontawesome("envelope")`
[mainard.co.uk](https://www.mainard.co.uk) `r icons::fontawesome("globe")`
[github.com/chrismainey](https://github.com/chrismainey/Training_Package_Building) `r icons::fontawesome("github")`
[twitter.com/chrismainey](https://twitter.com/chrismainey) `r icons::fontawesome("twitter")`
.footnote[
Background image by Leone Venter: https://unsplash.com/@fempreneurstyledstock
]
---
# Overview
+ What is an R package and why might you want to build one
+ Functions
+ Package basic structures and concepts
+ `Roxygen2` documentation and vignettes
+ Dependencies
+ Build and check
--
+ Extras:
+ Source control (Git and GitHub)
+ Unit tests and code coverage
+ Releasing to CRAN
__We'll look at this practically by building an R package__
__References:__
+ [Hadley Wickham, Jenny Bryan's: Building R packages](https://r-pkgs.org/)
+ [`R` official Documentation](https://cran.r-project.org/doc/manuals/R-exts.html#Creating-R-packages)
---
# R packages
+ Use packages all the time (base, stats, utils). You probably know `tidyverse`, `dplyr`, `ggplot2`.
+ `R` package repository CRAN has 16181 packages at time of writing.
+ Other places like Bioconductor, source packages on GitHub etc.
--
<br><br>
+ __What are they?__
+ Collections of functions related to certain workflows and tasks
+ Up to the author what goes in
+ Can be depend on other packages or be written from scratch
--
+ __Why use them?__
+ Don't reinvent the wheel!
+ Do you know how to do _[Insert thing]_ better than a topic expert?
--
+ __Why build a package?__
+ You might be the topic expert!
+ Share your code with colleagues, or others, and build collaboratively.
---
# But I can't build one...
--
.pull-left[
__...yes you can!__ You don't need perfect code to share it! _As you'll see from mine..._
<br><br>
+ You can get much of the background today and by looking online.
+ You will get it wrong a lot. That's part of the process.
<br>
Our first step is to convert our code to a format that can be run by others
+ We need to convert our code to a `function`.
]
.pull-right[
<img src="https://raw.githubusercontent.com/allisonhorst/stats-illustrations/master/rstats-artwork/code_hero.jpg" width="90%">
.footnote[
Artwork by @allison_horst: https://github.com/allisonhorst/stats-illustrations
]
]
---
# Functions
Functions have a structure that considers generic inputs and outputs, and performs something on them in between.
_(bad explanation, but stay with me...)_
```{r colsetup}
mydataframe <- data.frame(id=seq(3), old_col = c(97.5, 100, 147.5))
```
--
.pull-left[
#### You have code like this:
```{r orig2}
mydataframe$new_col <-
(mydataframe$old_col + 2.5) / 50
mydataframe
```
]
--
.pull-right[
#### Turn it into a function:
```{r asfunc2}
my_function <- function(col){
rtn <- (col + 2.5) / 50
return(rtn)
}
mydataframe$new_col2 <- my_function(mydataframe$old_col)
mydataframe
```
]
---
# Functions
.pull-left[
+ __Function has:__
+ A name (`my_function`)
+ An input (`cols`)
+ A return value (by default, returns the last line, but can be explicit with `return()`)
]
.pull-right[
```{r asfunc3, eval=FALSE}
my_function <- function(col){
new_col <- col + 2.5 / 50
return(new_col)
}
```
]
---
# Functions
.pull-left[
+ __Function has:__
+ A name (`my_function`)
+ An input (`cols`)
+ A return value (by default, returns the last line, but can be explicit with `return()`)
]
.pull-right[
```{r asfunc4, eval=FALSE}
my_function <- function(col){
new_col <- col + 2.5 / 50
return(new_col)
}
```
]
<br><br>
It's these functions that we use, and often chain together to build packages.
We need to:
+ Document them, including help files for users.
+ Tell `R` whether they depend on any other files/packages.
+ Better packages will consider error handling and how best to return values.
---
## Structure of a package
Can contain one or many functions. The parts it requires are:
+ __DESCRIPTION__ - A file with meta data about package, authors etc. Includes a list dependencies
+ __NAMESPACE__ - A short-hand file so `R` can understand what functions and dependencies to import
+ __Function(s)__ - Files with `R` functions, saved in a directory called `R`
+ __Help files__ - Text describing functions. These are written in syntax similar to latex, but it's common to generate them using `roxygen2` - a package documentation tool.
Optional:
+ __Vignette__ - Worked examples of using your functions. I think all good packages should have at least one. Again, written in same format as help files, or Rmarkdown.
+ __Unit tests__ - Automated tests help to detect errors in your code when working on a package.
---
## Dependencies
If you package requires other packages, you need to convey that so they are installed.
+ Description file can specify:
+ `Imports` - list packages that are ___required___ to use your code (added to NAMESPACE).
+ `Suggests` - A courtesy to your users, packages that are not required but help (e.g. used in vignette). Don't need to use this with a local file
--
You no longer need `Depends` since NAMESPACE files are used, and `LinkingTo` can be used to link to C++ or other files. `Enhanced` can be used to indicate your functions enhance another package.
--
Refer to Jim Hester's post: https://www.tidyverse.org/blog/2019/05/itdepends/
__Beware of tidyverse dependencies!__ Hadley's Wickham's advice:
>Because the tidyverse is a set of packages designed for interactive data analysis, this is, in short, a bad idea. The tidyverse package includes a substantial number of direct and indirect dependencies (79 packages, as of this writing), many of which are likely unnecessary for the purposes of your package. Furthermore, the CRAN maintainers frown upon depending on it, which can cause hassle for you down the line.
---
# Extras (1)
### Source control:
+ Source control is important in software development, allowing you to track changes and 'rollback'.
+ Git is common, but not only option. With Git you have a local repository, and can sync to a 'remote' like GitHub.
+ You or other users can make 'branches' to build functions before merging them together.
+ Happy Git with `R` https://happygitwithr.com/
+ GitHub automatically renders README.md when you land on it.
--
### Unit tests:
+ Use `testthat` package, use a control script called 'testthat', in a folder called 'tests'.
+ Sub-folder called 'test_that' containing test scripts.
+ See `testthat` vignettes for more details.
+ Can be combined with 'Code coverage' e.g. `covr` package.
---
# Extras (2)
### Continuous integration (CI)
+ When building larger packages collaboratively, CI builds packages after each push.
+ Recommend GitHub actions, but Travis-CI is also common.
### `pkgdown`
+ A extension builds a package website from your helpfiles, README, help files, and metadata.
+ Can then be hosted anywhere, but GitHub pages is common.
+ Can be built from CI systems
--
## Trust the `usethis` package!
---
# Lets build a package!
+ Use Rstudio to set up a new project with an 'R package template'
+ Write our functions.
+ Insert a 'Roxygen Skeleton' ('Code' menu), and fill in
+ Build our package, use CRAN checks
+ Write a brief unit test
+ Commit this all to GitHub.
+ We will use Rstudio, `devtools`, `usethis` and `roxygen2` in many places, as they set up the elements for you.
+ You need [Rtools](https://cran.r-project.org/bin/windows/Rtools/) on Windows!
___Here's one I prepared earlier:___
https://github.com/chrismainey/brilliant2
---
# Releasing to CRAN
+ CRAN is the most popular repository for packages. It has clear rules, and moderated by wonderful volunteers.
--
+ Rules around release, and package must pass CRAN checks and build on Windows, macOS and various Linux distros.
--
+ CI can help with builds, but also use Win-builder and R-Hub, through `devtools`.
--
+ Should include a 'NEWS' file and 'cran_comments' describing package, changes, and any errors in builds.
--
+ Follow all the steps in `devtools::release()`
---
class: middle
# Look at an existing package:
https://github.com/nhs-r-community/FunnelPlotR
---
# Summary
+ Packages are ways to make functions portable, helping you share and collaborate
+ Rstudio gives you package templates, and integrated with build, check and other tools
+ Metadata is saved in DESCRIPTION file
+ NAMESPACE contains import/dependency details, generated in build in `RStudio`/`devtools`/`roxygen2`
+ `R` functions saved in R folder, document then using `roxygen2`
+ Unit tests check code is working properly (`testthat` package)
+ Build (using RTools in Windows). If releasing to CRAN, use checks with `--as-cran` option.
+ Can build to binary and install locally from a ZIP file
+ GitHub, CI, codecov and `pkgdown` can all help with process