
Theiagen Genomics repository for workflow management training

theiagen/wm_training_202205


Workflow Management Solutions for Public Health Bioinformatics

Theiagen Genomics repository for workshop resources, code templates, and exercise materials

Course Overview

This intermediate-level bioinformatics training workshop will provide conceptual and applied training for understanding and utilizing workflow management solutions (e.g. WDL and Nextflow) for interoperable, reproducible, and accessible genomic analysis.

Length of Program

This workshop is designed as a virtual, 4-week series of live lectures and hands-on exercises. For registered trainees, instructors will be available for office hours and continued support throughout the course. All materials (slides, exercises, and recorded lectures) will be made publicly accessible.

Objectives

At the conclusion of this course, participants will be able to:

  • Understand fundamental concepts, advantages, and disadvantages behind containerization and workflow management systems
  • Analyze and assess WDL and Nextflow code bases
  • Utilize WDL and Nextflow workflow management systems to integrate multiple analytical modules into a single bioinformatics pipeline
  • Publish custom workflows to the Dockstore pipeline repository for integration on the Terra.Bio web application
  • Launch a Nextflow pipeline on the Nextflow Tower platform

Target Audience

This course is meant for public health bioinformatics scientists who have experience working with open-source bioinformatics software through a command-line interface (CLI), experience with version control systems such as Git, and familiarity with the concepts of containerized software systems such as Docker or Singularity. A registered GitHub account is encouraged for the completion of all planned exercises.

Below is a list of helpful resources that we recommend all trainees review, at least in part, prior to the start of this training workshop (listed in order of highest priority):

Course Content

Slides & Exercises

Week 1: Introduction to Workflow Management Using WDL

Week 2: Closer Look at WDL Tasks and Workflows

Week 3: Connecting WDL Workflows with Terra.Bio

Week 4: Getting Started with Conda and Nextflow

Exercise Resource Requirements

  • Google Cloud Platform virtual machines (GCP VMs) with all prerequisite software installed will be provisioned for all registered trainees. For those interested in recreating this training in their own compute environment, here is a list of resources required for the completion of each exercise:

Note: All exercises were developed to run on e2-standard-4 GCP VMs (4 CPUs; 16 GB RAM) running Ubuntu 20.04.4 LTS (Focal Fossa).
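
For trainees recreating the exercises on their own machine, the following is a minimal sketch of how one might check a Linux host against the specification in the note above (4 CPUs; 16 GB RAM). It is not part of the course materials. The function names and the 10% memory tolerance are assumptions made for illustration: an operating system typically reports slightly less memory than the nominal size of the machine.

```python
import os

# Values taken from the note above: 4 CPUs and 16 GB RAM.
REQUIRED_CPUS = 4
REQUIRED_MEM_GB = 16


def meets_spec(cpus, mem_bytes, min_cpus=REQUIRED_CPUS,
               min_mem_gb=REQUIRED_MEM_GB, tolerance=0.10):
    """Return True if the machine has at least the required CPUs and memory.

    `tolerance` is an assumption, not a course requirement: it allows the
    reported memory to fall up to 10% short of the nominal size.
    """
    min_bytes = min_mem_gb * (1024 ** 3) * (1 - tolerance)
    return cpus >= min_cpus and mem_bytes >= min_bytes


def local_resources():
    """Read the CPU count and physical memory in bytes (Linux/Unix only)."""
    cpus = os.cpu_count() or 0
    mem_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return cpus, mem_bytes


if __name__ == "__main__":
    cpus, mem_bytes = local_resources()
    print(f"CPUs: {cpus}, RAM: {mem_bytes / 1024 ** 3:.1f} GiB")
    print("Meets exercise spec:", meets_spec(cpus, mem_bytes))
```

Running the script prints the detected CPU count and memory, followed by whether the host meets the stated specification. A smaller machine may still run some exercises; the check only compares against the configuration the exercises were developed on.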
