The targets package is a Make-like pipeline tool for statistics and data
science in R. With targets, you can maintain a reproducible workflow
without repeating yourself. targets skips costly runtime for tasks that
are already up to date, orchestrates the necessary computation with
implicit parallel computing, and abstracts files as R objects. An
up-to-date targets pipeline is tangible evidence that the output aligns
with the code and data, which substantiates trust in the results.
A pipeline is a computational workflow that does statistics, analytics,
or data science. Examples include forecasting customer behavior,
simulating a clinical trial, and detecting differential expression from
genomics data. A pipeline contains tasks to prepare datasets, run
models, and summarize results for a business deliverable or research
paper. The methods behind these tasks are user-defined R functions that
live in R scripts, ideally in a folder called "R/" in the project. The
tasks themselves are called “targets”, and they run the functions and
return R objects. The targets package orchestrates the targets and
stores the output objects to make your pipeline efficient, painless, and
reproducible.
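As a rough sketch of the function-oriented style described above, the functions below could live in a script such as R/functions.R. The file name, the function names, and the use of the built-in airquality dataset are assumptions for illustration only and are not part of the package.

```r
# Hypothetical contents of R/functions.R. All names are illustrative.

# Prepare a dataset: keep only the rows without missing values.
prepare_data <- function(raw) {
  raw[stats::complete.cases(raw), ]
}

# Run a model: regress ozone on temperature.
fit_model <- function(data) {
  stats::lm(Ozone ~ Temp, data = data)
}

# Summarize results: return the coefficients as a named numeric vector.
summarize_fit <- function(model) {
  stats::coef(model)
}

# Run by hand, each step feeds the next. In a pipeline, each of these
# calls would be the command of one target.
clean <- prepare_data(datasets::airquality)
model <- fit_model(clean)
fit_summary <- summarize_fit(model)
```

Each function takes ordinary R objects and returns an ordinary R object, which is what lets a pipeline tool store the results and decide which steps need to rerun.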
- Familiarity with the R programming language, covered in R for Data Science.
- Data science workflow management techniques.
- How to write functions to prepare data, analyze data, and summarize results in a data analysis project.
Type | Source | Command |
---|---|---|
Release | CRAN | install.packages("targets") |
Development | GitHub | remotes::install_github("ropensci/targets") |
Development | rOpenSci | install.packages("targets", repos = "https://dev.ropensci.org") |
The 4-minute video at https://vimeo.com/700982360 demonstrates the example pipeline used in the walkthrough and functions chapters of the user manual. Visit https://github.com/wlandau/targets-four-minutes for the code and https://rstudio.cloud/project/3946303 to try out the code in a browser (no download or installation required).
To create a pipeline of your own:
- Write R functions for a pipeline and save them to R scripts (ideally in the "R/" folder of your project).
- Call use_targets() to write key files, including the vital _targets.R file which configures and defines the pipeline.
- Follow the comments in _targets.R to fill in the details of your specific pipeline.
- Check the pipeline with tar_visnetwork(), run it with tar_make(), and read output with tar_read(). More functions are available.
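The steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the template that use_targets() generates. It writes _targets.R with tar_script() instead, works in a temporary directory, and defines two hypothetical functions (simulate_data() and fit_model()) inline rather than in "R/". The tar_visnetwork() step is left out because it needs the optional visNetwork package.

```r
library(targets)

# Work in a throwaway directory so the sketch does not touch a real project.
project <- tempfile("targets_demo_")
dir.create(project)
old_wd <- setwd(project)

# tar_script() writes the code in braces to _targets.R in the working directory.
tar_script({
  library(targets)
  # Hypothetical functions, defined inline to keep the sketch self-contained.
  # In a real project they would live in scripts under R/.
  simulate_data <- function(n) {
    x <- seq_len(n)
    data.frame(x = x, y = 2 * x + 1)
  }
  fit_model <- function(dataset) {
    coef(lm(y ~ x, data = dataset))
  }
  # The target script ends with a list of target objects.
  list(
    tar_target(dataset, simulate_data(10)),
    tar_target(fit, fit_model(dataset))
  )
}, ask = FALSE)

tar_make()                  # Runs dataset first, then fit.
fit_value <- tar_read(fit)  # Reads the stored return value of the fit target.
outdated <- tar_outdated()  # Lists targets that would rerun on the next tar_make().

setwd(old_wd)
```

Because nothing changed after tar_make(), tar_outdated() should report no targets, and a second tar_make() would skip both. Editing fit_model() in _targets.R would be expected to invalidate only the fit target.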
- User manual: in-depth discussion about how to use targets.
- Reference website: formal documentation of all user-side functions, the statement of need, and multiple design documents of the internal architecture.
- Developer documentation: software design documents for developers contributing to the deep internal architecture of targets.
- Get started with targets in 4 minutes (4:08)
- R/Medicine 2021 (15:33)
- R/Pharma 2020 (9:24)
- LA R Users Meetup, October 2020 (1:14:40)
- New York Open Statistical Programming Meetup, December 2020 (1:54:28)
- ds-incubator series, 2021
- Lille R User Group, June 2021 (45:54)
- Four-minute example
- Minimal example
- Machine learning with Keras
- Validate a minimal Stan model
- Using Target Markdown and stantargets to validate a Bayesian longitudinal model for clinical trial data analysis
- Shiny app that runs a pipeline
- Deploy a pipeline to RStudio Connect
- tar_watch(): a built-in Shiny app to visualize progress while a pipeline is running. Available as a Shiny module via tar_watch_ui() and tar_watch_server().
- targetsketch: a Shiny app to help sketch pipelines (app, source).
- https://solutions.rstudio.com/r/workflows/ explains how to deploy a pipeline to RStudio Connect (example code).
- tar_github_actions() sets up a pipeline to run on GitHub Actions. The minimal example demonstrates this approach.
- R Targetopia: a collection of R packages that extend targets. These packages simplify pipeline construction for specific fields of statistics and data science.
- Target factories: a programming technique to write specialized interfaces for custom pipelines. Posts here and here describe how.
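To make the target factory idea concrete, here is a minimal sketch, assuming a factory is simply a function that returns a list of target objects built with tar_target_raw(). The factory name simulate_and_summarize() and its arguments are hypothetical and are not part of targets or any Targetopia package.

```r
library(targets)

# Hypothetical factory: given a name prefix and a sample size, return one
# target that simulates data and one target that summarizes it.
simulate_and_summarize <- function(prefix, n) {
  data_name <- paste0(prefix, "_data")
  summary_name <- paste0(prefix, "_summary")
  list(
    # tar_target_raw() takes the name as a string and the command as an
    # unevaluated expression, which suits programmatic construction.
    tar_target_raw(data_name, substitute(rnorm(n), list(n = n))),
    tar_target_raw(
      summary_name,
      substitute(mean(x), list(x = as.symbol(data_name)))
    )
  )
}

# Produces targets named small_data and small_summary.
small <- simulate_and_summarize("small", 10)
```

In a _targets.R file, the final list could then contain calls such as simulate_and_summarize("small", 10) and simulate_and_summarize("large", 1000), so that users of the factory do not write the individual targets by hand.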
- Post to the GitHub discussion forum to ask questions. To get the best help about a specific issue, create a reproducible example with targets::tar_reprex() or reprex::reprex().
- The RStudio Community forum is full of friendly enthusiasts of R and the tidyverse. Use the targets tag.
- Stack Overflow broadcasts to the entire open source community. Use the targets-r-package tag.
Please note that this package is released with a Contributor Code of Conduct.
citation("targets")
To cite targets in publications use:
Landau, W. M., (2021). The targets R package: a dynamic Make-like
function-oriented pipeline toolkit for reproducibility and
high-performance computing. Journal of Open Source Software, 6(57),
2959, https://doi.org/10.21105/joss.02959
A BibTeX entry for LaTeX users is
@Article{,
title = {The targets R package: a dynamic Make-like function-oriented pipeline toolkit for reproducibility and high-performance computing},
author = {William Michael Landau},
journal = {Journal of Open Source Software},
year = {2021},
volume = {6},
number = {57},
pages = {2959},
url = {https://doi.org/10.21105/joss.02959},
}