Dry bench skills for Researchers

Topics of Interest

Command Line Skills Reading and Plotting Data in R Why Git and GitHub Forking in GitHub
Git Add Git Commit Nextflow Nextflow patterns Finding Data
Conda for managing dependencies Docker Build Test Share Reuse Getting End-to-End Example Data GitHub Hello World Fun
SRA and RNAseq Variant Calling Sarek End-to-End Example Dry Bench Skills Recap
Using Zenodo Long Read Proteogenomics The Impact of Sex on Alternative Splicing (rMATS, Papermill, JupyterLab notebooks)

ABOUT

The practice of biological inquiry has evolved so that a biological researcher's training must include both wet-lab experimental and dry-lab computational approaches. This course seeks to lay a broad foundation for the dry bench researcher. It is clear that "data literacy is a key skill in the modern world." The pilot course, held in December 2020, introduced trainees to the computational techniques that enable data analysis locally on their laptops, and showed how to run the same analyses in the cloud.

The training continues through a series of mini-courses, held virtually, that give learners critical skills in effective collaboration. These skills are imperative in a world where the data available for analysis often extends beyond a single lab's walls.

The December pilot course consisted of 10 hours of instruction spread over two weeks. It introduced the learner to RNAseq analysis and sought to demystify the steps of that analysis and make them accessible. By the end of the course, the learner was able to use the tools involved in these analyses both collaboratively and independently. The course aimed to go broad and to point learners toward resources for taking their knowledge deeper.

Through their participation in the class, learners acquired experience in a full suite of tools essential to any bioinformatics analysis. They learned FAIR (findability, accessibility, interoperability, and reusability) best practices. Learners obtained GitHub, Docker, and ORCID IDs if they did not already have them. They found Nextflow workflows and ran them on Lifebit's CloudOS, generating results that were further explored through interactive JupyterLab notebooks. Learners used SRAExplorer and other tools to access publicly available datasets using cloud resources, learning where to search for the ever-growing set of cloud-accessible datasets that may aid their research. They gained insight into best practices for using flexible, powerful, community-based workflow languages such as Nextflow, and into how these workflows use containerization to modularize and simplify bioinformatic analysis. These workflows were run on cloud-accessible data, both in the cloud and on the learners' laptops.

Material from the original course, and from a growing number of other courses, is now available in this repository.

TOOLS

Pre-course materials as well as in-class exercises will cover the following topic areas:

  • Bash
  • R
  • Git/GitHub
  • JupyterLab Notebooks
  • Conda
  • Docker
  • Nextflow

PLATFORM

Learners will use Lifebit's CloudOS platform for this course. The only requirement is a laptop with a web browser (Chrome preferred), which learners will use to log in and access the course materials.

INSTRUCTORS

  • Anne Deslattes Mays, Ph.D., Science and Technology Consulting LLC
  • Christina Chatzipantsiou, Lifebit