Analysis workflow for bulk-tcr-beta sequencing data

Bulk TCR-beta chain sequencing workflow - with DNA as starting material

Notion page for detailed documentation

https://www.notion.so/cogen/Running-Bulk-TCR-Sequencing-pipeline-for-translational-lab-members-5bb8a7eb89b04ea68ab601a8e8c7bbab

The workflow consists of the following steps and scripts:

fastp - QC and pre-processing of fastq files; multiqc creates a multi-sample report: run_fastp_multiqc.sh
mixcr - alignment and assembly of clonotypes from fastq files: run_mixcr_v1.sh, fix_TRBfiles.R
vdjtools - post-processing; graphical and text-file results for interpretation: run_vdjtools_single_samples.sh, run_vdjtools_custom_samples.sh, vdjtools-patch.sh

fastp, multiqc, mixcr and vdjtools are currently installed on the Galaxy server. FastQC, however, has to be installed for your own user:

conda install -c bioconda fastqc
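
Before starting a run, it can help to confirm that the tools named above are actually on your PATH. The following is a minimal sketch, not part of this repository: `missing_tools` is a hypothetical helper, and the executable names (fastp, multiqc, mixcr, vdjtools, fastqc) are assumptions that may differ on the Galaxy server.

```shell
# Print the name of every tool that cannot be found on PATH.
# missing_tools is a hypothetical helper, not part of this repository.
missing_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Tool names are assumed; adjust them if the server uses wrappers or other names.
missing=$(missing_tools fastp multiqc mixcr vdjtools fastqc)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Not found on PATH:" $missing
else
    echo "All tools found."
fi
```

If fastqc is reported as missing, the conda command above installs it for your user.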

INFO for a test run:

Test data:
~schavan/projects/bulk_tcr_seq/data/EXP21001376_FFPE

Input files:
~schavan/projects/bulk_tcr_seq/inputs/samplesheet_EXP21001376.tsv
~schavan/projects/bulk_tcr_seq/inputs/metadataToConvert_EXP21001376_FFPE.txt

Output files for Mixcr:
~schavan/projects/bulk_tcr_seq/inputs/

Output files for VDJTools:
~schavan/projects/bulk_tcr_seq/scripts/batch2.2

Inputs

SAMPLESHEET1: Create one tab-delimited samplesheet per experiment as input to mixcr, e.g. samplesheet_EXP21001293.tsv, as shown below. No header row. Specify complete, absolute paths to the fastq files.

DSCO28-MTC-1	/data/DSCO28-MTC-1_S7_L001_R1_001.fastq.gz	/data/DSCO28-MTC-1_S7_L001_R2_001.fastq.gz
DSCO28-MTC-2	/data/DSCO28-MTC-2_S1_L001_R1_001.fastq.gz	/data/DSCO28-MTC-2_S1_L001_R2_001.fastq.gz
DSCO28-MTC-3	/data/DSCO28-MTC-3_S3_L001_R1_001.fastq.gz	/data/DSCO28-MTC-3_S3_L001_R2_001.fastq.gz
DSCO28-TRF-1	/data/DSCO28-TRF-1_S6_L001_R1_001.fastq.gz	/data/DSCO28-TRF-1_S6_L001_R2_001.fastq.gz
DSCO28-TRF-2	/data/DSCO28-TRF-2_S4_L001_R1_001.fastq.gz	/data/DSCO28-TRF-2_S4_L001_R2_001.fastq.gz
DSCO28-TRF-3	/data/DSCO28-TRF-3_S5_L001_R1_001.fastq.gz	/data/DSCO28-TRF-3_S5_L001_R2_001.fastq.gz
FFPE-9G7045	/data/FFPE-9G7045_S2_L001_R1_001.fastq.gz	/data/FFPE-9G7045_S2_L001_R2_001.fastq.gz
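
A samplesheet with spaces instead of tabs, a relative path, or a mistyped file name is easy to miss by eye. The sketch below checks the three rules stated above with awk. `check_samplesheet` is a hypothetical helper, not part of this repository, and the rule that each fastq file name starts with the sample name is an assumption drawn from the example rows, not a mixcr requirement.

```shell
# check_samplesheet is a hypothetical helper, not part of this repository.
# It prints one line per problem and returns non-zero if any were found.
check_samplesheet() {
    awk -F'\t' '
        NF == 0 { next }    # ignore empty lines
        NF != 3 {
            printf "line %d: expected 3 tab-separated columns, found %d\n", NR, NF
            bad = 1
            next
        }
        {
            for (i = 2; i <= 3; i++) {
                if (substr($i, 1, 1) != "/") {
                    printf "line %d: column %d is not an absolute path: %s\n", NR, i, $i
                    bad = 1
                }
                # Assumption: each fastq file name starts with the sample name.
                n = split($i, parts, "/")
                if (index(parts[n], $1) != 1) {
                    printf "line %d: %s does not start with sample name %s\n", NR, parts[n], $1
                    bad = 1
                }
            }
        }
        END { exit bad + 0 }
    ' "$1"
}

# Example with a throwaway samplesheet; the second row has a mistyped R1 name.
sheet=$(mktemp)
printf 'S1\t/data/S1_R1.fastq.gz\t/data/S1_R2.fastq.gz\n' > "$sheet"
printf 'S2\t/data/S_2_R1.fastq.gz\t/data/S2_R2.fastq.gz\n' >> "$sheet"
check_samplesheet "$sheet" || echo "samplesheet needs fixing"
rm -f "$sheet"
```

The check does not test whether the fastq files exist; a `[ -r "$path" ]` test per path could be added for that.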

SAMPLESHEET2: Create a second tab-delimited samplesheet that provides the mixcr outputs as input to VDJtools, e.g. metadataToConvert_EXP21001293.txt, as shown below. A header row is present. IMPORTANT: Do not specify complete paths. Instead, place this file in the inputs folder, because VDJtools expects the TRB files to be in that same folder (weird bug!). If needed, create symbolic links in the inputs folder that point to the mixcr output files. The link names must exactly match the "file_name" entries in the file below.

	DSCO28-MTC-1analysis.clonotypes.TRB.fixed.txt	DSCO28-MTC-1
	DSCO28-MTC-2analysis.clonotypes.TRB.fixed.txt	DSCO28-MTC-2
	DSCO28-MTC-3analysis.clonotypes.TRB.fixed.txt	DSCO28-MTC-3
	DSCO28-TRF-1analysis.clonotypes.TRB.fixed.txt	DSCO28-TRF-1
	DSCO28-TRF-2analysis.clonotypes.TRB.fixed.txt	DSCO28-TRF-2
	DSCO28-TRF-3analysis.clonotypes.TRB.fixed.txt	DSCO28-TRF-3
	FFPE-9G7045analysis.clonotypes.TRB.fixed.txt	FFPE-9G7045
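
The symbolic links described above can be created in one pass from the samplesheet itself. This is a minimal sketch under stated assumptions: `link_trb_files` is a hypothetical helper, not part of this repository, and it assumes that the first row of the samplesheet is the header and that the first column holds the file_name values.

```shell
# link_trb_files is a hypothetical helper, not part of this repository.
# Usage: link_trb_files <samplesheet2> <mixcr_output_dir> <inputs_dir>
# Pass <mixcr_output_dir> as an absolute path so the links resolve from anywhere.
link_trb_files() {
    local sheet="$1" outdir="$2" indir="$3"
    # Assumptions: row 1 is the header and column 1 holds the file_name values.
    awk -F'\t' 'NR > 1 && $1 != "" { print $1 }' "$sheet" | (
        status=0
        while IFS= read -r name; do
            # A real file is already in the inputs folder: leave it alone.
            if [ -e "$indir/$name" ] && [ ! -L "$indir/$name" ]; then
                continue
            fi
            if [ -e "$outdir/$name" ]; then
                # -f replaces a stale link of the same name.
                ln -sf "$outdir/$name" "$indir/$name"
            else
                echo "missing mixcr output: $outdir/$name" >&2
                status=1
            fi
        done
        exit "$status"
    )
}

# Example (placeholder paths, adjust to your experiment):
#   link_trb_files inputs/metadataToConvert_EXP21001293.txt /path/to/mixcr_outputs inputs
```

The function returns non-zero when a file named in the samplesheet has no matching mixcr output, which usually points to a name mismatch.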

Outputs

Make sure that VDJtools automatically creates a file called metadata.txt. It should look like this:

VDJtools.HNSCC-15396-1.txt	HNSCC-15396-1	conv:MiXcr
VDJtools.HNSCC-15396-2.txt	HNSCC-15396-2	conv:MiXcr
VDJtools.HNSCC-15396-3.txt	HNSCC-15396-3	conv:MiXcr
VDJtools.HNSCC-6827-1.txt	HNSCC-6827-1	conv:MiXcr
VDJtools.HNSCC-6827-2.txt	HNSCC-6827-2	conv:MiXcr
VDJtools.HNSCC-6827-3.txt	HNSCC-6827-3	conv:MiXcr

mixcr

*.TRB.txt
*.clna
*.vdjca

vdjtools (outputs depend on the type of plots; see the vdjtools documentation)

*.pdf
*.summary.txt
*.txt
metadata.txt

Setting up your own user conda environment

Log on to the Galaxy server and run the following commands:

conda create --name tcrbeta
conda activate tcrbeta

The above creates and activates a conda environment called "tcrbeta". You can then install R libraries such as ggplot2 inside "tcrbeta" with conda install commands, so that this setup stays specific to TCR-seq and does not conflict with anything else you run from your shell.

Install R (the version I have is R 4.0.5)

conda install -c conda-forge r-base

Install R libraries

conda install -c conda-forge r-ggplot2
conda install -c conda-forge r-gplots
conda install -c conda-forge r-rcolorbrewer
conda install -c conda-forge r-venndiagram
conda install -c conda-forge r-reshape2
conda install -c conda-forge r-ape
conda install -c conda-forge r-plotrix

Install any other missing R library the same way. If a library is not available from the conda-forge channel, try "-c bioconda" instead of "-c conda-forge".
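
The channel fallback described above can be wrapped in a small shell function. This is only a sketch: `install_r_pkg` is a hypothetical helper, not part of this repository, and the `-y` flag (skip the confirmation prompt) is an addition that the commands above do not use.

```shell
# install_r_pkg is a hypothetical helper, not part of this repository.
# It tries conda-forge first and falls back to bioconda if that install fails.
install_r_pkg() {
    conda install -y -c conda-forge "$1" || conda install -y -c bioconda "$1"
}

# Example (run inside the activated "tcrbeta" environment):
#   for pkg in r-ggplot2 r-gplots r-ape; do install_r_pkg "$pkg"; done
```

Note that the fallback also triggers when the conda-forge install fails for a reason other than a missing package, so check the conda output if both attempts fail.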

For more understanding, read:

