This repository contains files for a two-day course on Visualization for Data Science in R, offered during Data Matters 2019. The course description and activities are listed below.
This course is designed for two audiences: experienced visualization designers looking to apply open data science techniques to their work, and data science professionals who have limited experience with visualization. Participants will develop skills in visualization design using R, a tool commonly used for data science. Basic familiarity with R is required.
Data science skills are increasingly important for research and industry projects. With complex data science projects, however, come complex needs for understanding and communicating analysis processes and results. Ultimately, an analyst's data science toolbox is incomplete without visualization skills. Incorporating effective visualizations directly into the analysis tool you are using can facilitate quick data exploration, streamline your research process, and improve the reproducibility of your research.
The course will take a project-based approach to learning best practices for visualization for data science. Participants will be guided through several sample analysis and visualization projects that will highlight different types of visualization, different features of R and its visualization libraries, and different challenges that arise when trying to apply an open data science philosophy to visualization. In short, the course covers the following topics:
- introduction to visualization in R
- basic syntax for ggplot2
- applying common graphic design principles to ggplot2 visualizations
- using Shiny to create interactive websites that include R data and visualizations
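As a rough illustration of the basic ggplot2 syntax named in the list above, here is a minimal sketch. It is not taken from the course materials: the choice of the `mpg` dataset (which ships with ggplot2), the variables plotted, and the label text are all assumptions made for the example.

```r
library(ggplot2)

# Assumption: mpg is used only because it is bundled with ggplot2;
# the course exercises may use different data.
p <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point() +                      # one point per car model
  labs(
    title = "Engine size and highway fuel economy",
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon",
    color = "Vehicle class"
  ) +
  theme_minimal()                     # a plain theme as a design starting point

# Draw the plot on the active graphics device
print(p)
```

The pattern is a data frame plus an aesthetic mapping in `ggplot()`, one or more `geom_*` layers, and then labels and a theme added with `+`.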
This course assumes basic familiarity with R (e.g., R syntax, data structures, development environments). Visualizations will primarily be created with ggplot2 and other tidyverse libraries, but prior experience with those libraries is not required. To participate fully in class exercises, participants should install current versions of R and RStudio on their laptops, along with these packages: tidyverse, plotly, flexdashboard, shiny, and knitr (optional).
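One possible way to check for the packages listed above is sketched below. The helper name `missing_packages` is hypothetical and introduced only for this example. The install call is left commented out because it needs network access and a configured CRAN mirror.

```r
# Packages named in the prerequisites; knitr is listed as optional.
course_packages <- c("tidyverse", "plotly", "flexdashboard", "shiny", "knitr")

# Hypothetical helper: return the entries of `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

to_install <- missing_packages(course_packages)
print(to_install)

# Uncomment to install whatever is missing:
# if (length(to_install) > 0) install.packages(to_install)
```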
- ggplot2 Resources
- Shiny Resources
- Example Shiny Apps