tags |
---|
ggg, ggg2024, ggg201b |
Log into farm with ssh, and use srun to get a compute node:
srun -p high2 --time=3:00:00 --nodes=1 --cpus-per-task=4 --mem=5GB --pty /bin/bash
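To confirm that the shell really is inside the allocation, a small check along these lines may help. It is only a sketch: `show_allocation` is a hypothetical helper name, and it relies on the `SLURM_JOB_ID` and `SLURM_CPUS_PER_TASK` environment variables that Slurm sets inside a job.

```shell
# Hypothetical helper: report what Slurm allocated to this shell.
show_allocation() {
  # SLURM_JOB_ID is only set inside a Slurm job (e.g. after srun).
  echo "job id: ${SLURM_JOB_ID:-none (not inside a Slurm job)}"
  # SLURM_CPUS_PER_TASK reflects the --cpus-per-task request.
  echo "cpus per task: ${SLURM_CPUS_PER_TASK:-unknown}"
  # Name of the machine this shell is running on.
  echo "host: $(uname -n)"
}

show_allocation
```

If the job id shows as "none", the shell is most likely still on the login node.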
Now, update the repo:
cd ~/2024-ggg-201b-rnaseq/
git pull
::::spoiler Do this if you didn't clone the github directory last time:
Run:
cd ~/
git clone https://github.com/ngs-docs/2024-ggg-201b-rnaseq
cd ~/2024-ggg-201b-rnaseq
::::
Load mamba:
module load mamba
mamba activate quant
and run the quantification pipeline:
cd ~/2024-ggg-201b-rnaseq
snakemake -j 4 -p
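Before a long run it can be useful to preview what snakemake intends to do. The sketch below assumes the `quant` environment is active and that you are in the repository directory. The variable name `njobs` is just illustrative, and the fallback of 4 mirrors the `--cpus-per-task=4` request above.

```shell
# Use the CPUs Slurm allocated, or fall back to 4 (the value used in these notes).
njobs="${SLURM_CPUS_PER_TASK:-4}"

if command -v snakemake >/dev/null 2>&1; then
  # -n is a dry run: list the jobs that would run without executing them.
  # -p prints the shell command for each job.
  snakemake -n -p -j "$njobs"
else
  echo "snakemake not found; did you run 'mamba activate quant'?"
fi
```

Dropping `-n` turns the preview into the real run.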
Next, create a conda environment containing RNAseq software for R:
mamba env create -n rnaseq -f environment.yml
Q: what does this install? Take a look at environment.yml.
(We'll be using DESeq2 for analysis of bulk RNAseq data.)
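One way to answer that question from the command line is to print only the package entries in the file. This is a rough sketch: `list_deps` is a hypothetical helper, and it assumes the usual conda layout with a top-level `dependencies:` key followed by `- package` lines.

```shell
# Hypothetical helper: print the entries listed under 'dependencies:' in a conda environment file.
list_deps() {
  awk '
    /^dependencies:/ { in_deps = 1; next }   # start of the dependencies section
    /^[^ -]/         { in_deps = 0 }         # any other top-level key ends the section
    in_deps && /^ *- / { sub(/^ *- */, ""); print }
  ' "$1"
}

# Only run if the file is present in the current directory.
if [ -f environment.yml ]; then
  list_deps environment.yml
fi
```

Once the environment has been created, `mamba list -n rnaseq` shows what was actually installed, including packages pulled in as dependencies.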
Now run:
module load rstudio-server
rstudio-launch
and set up your connection as normal.
::::spoiler If you are running in ondemand:
You'll need to start a new RStudio session using the rnaseq
conda environment.
::::
Using the file browser, go to the 2024-ggg-201b-rnaseq
directory in your home directory.
Click on the rnaseq-workflow.Rmd file.
Now click on "Knit".
Wait a minute or two, and check out the popup window.
Questions to address!
- What is RMarkdown?
- What does this notebook do?
- What is the output of the quantification pipeline, and how does it get loaded into this notebook?
- What are the outputs of this notebook?
Statistics questions:
- What is the difference between pvalue and padj, and which should you use?
- What do we do with the outlier on the MDS plot?
- Why does the MA plot look the way it does? What are the striations, and why is there a funnel?
Relevant: green jelly beans
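
The jelly bean reference is about multiple testing, which is the reason an adjusted p-value exists at all. As a rough illustration, assume 20 independent tests at a 0.05 threshold where no real effect exists. These numbers are illustrative and do not come from the notebook, and `p_any_false_positive` is a hypothetical helper name.

```shell
# Probability of at least one false positive among n independent tests at threshold alpha:
#   1 - (1 - alpha)^n
p_any_false_positive() {
  awk -v alpha="$1" -v n="$2" 'BEGIN { printf "%.3f\n", 1 - (1 - alpha)^n }'
}

p_any_false_positive 0.05 1    # a single test: prints 0.050
p_any_false_positive 0.05 20   # twenty tests:  prints 0.642
```

An RNAseq experiment tests thousands of genes rather than twenty, so some raw p-values below 0.05 are expected by chance alone.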