To the extent possible under law, Titus Brown and Shannon Joslin have waived all copyright and related or neighboring rights to GGG 298, Winter 2020 at UC Davis. This work is published from: United States.
Instructors: C. Titus Brown (instructor of record; ctbrown@ucdavis.edu) and Shannon Joslin (sejoslin@ucdavis.edu).
This course provides a practical introduction to common tools used in data-intensive research, including the UNIX shell, version control with git, RMarkdown, JupyterLab, and workflows with Snakemake. The associated discussion section connects the lab practicals to foundational concepts in data science, including repeatability/reproducibility, statistics, and publication ethics.
This course is open to all graduate students. No prior computational experience is required or assumed. There will be minimal overlap with GGG 201(b) topics. All materials will be open to the community and freely available online.
Week 1: Introduction to the course, and a basic RNAseq pipeline -- outline, lab notes, reading, discussion notes
Week 2: UNIX shell for file manipulation -- outline, lab notes, reading, discussion notes
Week 3: Conda for software installation -- outline, lab notes, reading, discussion notes
Week 4: Snakemake for workflows -- outline, lab notes, reading, discussion notes
Week 5: Project organization and more UNIX shell -- outline, lab notes, reading, discussion notes
Week 6: Git and GitHub for file tracking and sharing -- outline, lab notes, discussion notes
Week 7: Slurm and the Farm cluster for doing analysis -- outline, lab notes, reading
Week 8: R/RMarkdown for reports, documentation, and beyond -- outline, lab notes, reading, discussion notes
Week 9: Integrating it all: a sourmash project! -- outline, lab notes, reading, discussion notes
Week 10: Advanced intro UNIX and integration -- outline, lab notes, reading