This repository contains supplemental resources and materials that accompany the "Making Data Pipelines in R" talk at RStudio::conf(2022).
"Making Data Pipelines in R" is a story that presents high-level concepts useful for creating a data pipeline in R from scratch, told from the perspective of a user who is self-taught in the R programming language. This context is relevant because any programmer (especially a self-taught one) can have knowledge gaps that make creating automated data pipelines daunting, if not impossible.
This repository can serve as a general learning tool, resource, and source of inspiration for those who want to begin creating data pipelines from scratch in R. Data pipelines are an expansive topic that is fluid and varies by industry, organizational setting, and professional use case. Your mileage may vary. Feel free to use anything in this repository that may help you in your own pipeline adventures!
The slides for "Making Data Pipelines in R" can be found on the repository here.
This talk is being presented on July 27th, 2022, at 1:30 PM EST at the Gaylord National Convention Center in National Harbor (Maryland/D.C.), United States. A recording of this talk will be provided here when it is available.
Example (non-technical) documents, such as metadata tables and data workflow diagrams, can be used to disseminate general pipeline information. These documents are available for modification and download here.
Example R Projects and scripts can be found on the repository here.
For more information about how to fork, clone, or pull down repositories on GitHub for your own practice or use, please refer to this Git Docs Article.
For those looking for more complicated scripts that exercise intermediate knowledge of script modularization, custom functions, and script chaining, you may be interested in the example R Project "simple_pipeline" located here.
For those who want a lighter introduction to chaining scripts together without worrying about intermediate knowledge of working directories, you may be interested in the example R project "even_simpler_pipeline" located here.
A breakdown of R documentation, packages, and other references that can be useful for making data pipelines in R can be found here.