This repository contains the code for Jiazheng Miao's (MBI Class of 2025) Capstone Project, a mathematical model of the expression level of espA in Mycobacterium tuberculosis.
Execute the scripts in the following order:
sbatch_download.sh: Download the data listed in dna_accession.txt and rna_accession.txt
sbatch_dnaseq_pe.sh: Process paired-end DNA-seq data
sbatch_dnaseq_se.sh: Process single-end DNA-seq data
sbatch_rnaseq.sh: Process RNA-seq data
organize_data.py: Aggregate VCF files into a CSV file, convert RNA read counts to log FPKM, and screen for RD8/RD236a deletions
merge_replicates.R: Combine identified variants and average the expression level across technical replicates
pca.R: Decompose the variant matrix (espA regulatory region excluded) into principal components (PCs)
modeling.Rmd: Perform the modeling
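
For readers unfamiliar with the read-count conversion mentioned for organize_data.py, the following is a minimal sketch of a standard log FPKM calculation. It is not taken from organize_data.py: the function name log_fpkm, the log base (2), the pseudocount of 1, and the toy gene names and numbers are all assumptions made for illustration, and the actual script may differ.

```python
import math

def log_fpkm(counts, lengths_bp, pseudocount=1.0):
    """Convert raw read counts for one sample to log2(FPKM + pseudocount).

    counts:     dict mapping gene -> raw read (fragment) count
    lengths_bp: dict mapping gene -> gene length in base pairs
    The log base and pseudocount are assumptions, not the repository's settings.
    """
    total = sum(counts.values())  # total mapped fragments in the sample
    if total == 0:
        raise ValueError("sample has no mapped reads")
    result = {}
    for gene, count in counts.items():
        # FPKM: fragments per kilobase of gene per million mapped fragments
        fpkm = count * 1e9 / (lengths_bp[gene] * total)
        result[gene] = math.log2(fpkm + pseudocount)
    return result

# Hypothetical toy input; gene names and values are illustrative only.
example = log_fpkm(
    {"geneA": 500, "geneB": 1500},
    {"geneA": 1000, "geneB": 3000},
)
```

In this toy input geneB has three times the reads of geneA but is also three times as long, so both genes receive the same log FPKM value, which is the length normalization the conversion is meant to provide.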