Computational Workflow and Analysis Code for Anchor Project

Code for the computational workflows and analyses in "Computational prediction of MHC anchor locations guides neoantigen identification and prioritization"

Computational Prediction of Anchor locations

Initial_peptide_database.ipynb
- Combining all input data and selecting HLA alleles and their corresponding strong binding peptides
Saturation_analysis.ipynb
- Saturation analysis using HLA-A*02:01
fasta_generator.py
- Generating FASTA files for input into pVACbind
pvacbind_run.sh
- Running pVACbind in parallel
Anchor Position Calculation.ipynb
- Collecting pVACbind results and calculating anchor probabilities
Anchor_cluster_analysis.ipynb
- Summarizing anchor trends using hierarchical clustering and heatmaps
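
To illustrate the FASTA generation step listed above, here is a minimal sketch of saturating a peptide (mutating every position to every other amino acid) and writing the variants as FASTA records. It is not the code in fasta_generator.py: the function names and the record ID scheme (`PEPTIDE.WT`, `PEPTIDE.<position><amino acid>`) are assumptions made for illustration, so check the header format pVACbind expects before using anything like it.

```python
from typing import Dict

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturate_peptide(peptide: str) -> Dict[str, str]:
    """Return the original peptide plus every single-residue variant.

    Keys are hypothetical record IDs, values are peptide sequences.
    """
    variants = {f"{peptide}.WT": peptide}
    for pos, original in enumerate(peptide, start=1):
        for aa in AMINO_ACIDS:
            if aa == original:
                continue  # skip the unchanged residue
            mutated = peptide[: pos - 1] + aa + peptide[pos:]
            variants[f"{peptide}.{pos}{aa}"] = mutated
    return variants


def to_fasta(records: Dict[str, str]) -> str:
    """Format ID -> sequence pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())


if __name__ == "__main__":
    fasta_text = to_fasta(saturate_peptide("SIINFEKL"))
    print(fasta_text[:60])
```

An 8-mer yields 1 + 8 × 19 = 153 records, so the number of predictions grows quickly with the number of seed peptides, which is why the runs are parallelized.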

Orthogonal validation with crystallography structures

Validation_pMHC_crystallography_analysis.ipynb
- Use of the MDTraj package to calculate distances and solvent-accessible surface area (SASA) for peptide-MHC PDB structures
- Comparison of structure-derived anchor estimates with our computational predictions
TCR validation data analysis.ipynb
- Repeat of the structure analysis using TCR-peptide-MHC PDB structures

Evaluating Anchor Impact

Impact Analysis TCGA samples.ipynb
- Selection of a balanced HLA population from remaining TCGA samples
- Generating FASTA files and running pVACbind
- Objective determination of anchor locations
- Analyzing the entire cohort using three different filters (no anchor, conventional anchor and allele-specific anchor)
Impact analysis using different binding cutoffs.ipynb
- Repeating analysis using different binding cutoffs and inclusion criteria
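
The three filters compared above can be sketched as a single decision function that differs only in which positions are treated as anchors. This is an assumption about the decision rule, not code taken from the notebooks: the function name, the 500 nM default cutoff, and the example anchor sets are all hypothetical.

```python
from typing import Set


def passes_filter(
    mt_ic50: float,
    wt_ic50: float,
    mutation_position: int,
    anchor_positions: Set[int],
    cutoff: float = 500.0,  # assumed binding cutoff in nM
) -> bool:
    """Decide whether a neoantigen candidate is kept.

    Assumed rule: the mutant peptide must bind (IC50 below cutoff). If the
    mutation sits at an anchor position, the candidate is kept only when the
    wildtype peptide binds poorly, since otherwise the wildtype peptide is
    presented in the same way as the mutant.
    """
    if mt_ic50 >= cutoff:
        return False
    if mutation_position not in anchor_positions:
        return True
    return wt_ic50 >= cutoff


# Hypothetical anchor sets for a 9-mer under each filter.
NO_ANCHOR: Set[int] = set()
CONVENTIONAL: Set[int] = {2, 9}   # position 2 and the C-terminal position
ALLELE_SPECIFIC: Set[int] = {9}   # placeholder; derive from anchor scores

if __name__ == "__main__":
    for label, anchors in [("no anchor", NO_ANCHOR),
                           ("conventional", CONVENTIONAL),
                           ("allele-specific", ALLELE_SPECIFIC)]:
        print(label, passes_filter(100.0, 80.0, 2, anchors))
```

Passing a different `cutoff` value is one way to mirror the repeated analysis with different binding cutoffs.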

Validation peptide selection and analysis

Generation of experimental validation candidates.ipynb
- Anchor calculation performed for all good binding candidates
- Selecting peptides for experimental validation
- Prioritization of mutations and positions for validation experiments
Validation Plots.ipynb
- Evaluation of in vitro and in vivo experimental results

Additional analyses

Comparison between seed dataset and other random peptide sets.ipynb
- Evaluating seed peptide source by generating random peptide sequences from 3 different sources and repeating the analysis
Reviewer response analysis (HLA distribution).ipynb
- Bias analysis for HLA allele-specific anchor patterns
Reviewer response - Scenario count.ipynb
- Determining how many SNVs fell into each scenario

Resources

For researchers wanting to incorporate our end results into their pipelines:
- Normalized anchor scores are available in the supplemental materials of the original paper and under Datasets in this GitHub repository.
- Our compiled seed dataset (containing peptide sequences, HLA alleles and the outputs of all 8 binding algorithms) is also available under Datasets.
For researchers looking to expand this database for particular HLA alleles, we recommend the following steps:
- Identify strong binding peptides for the HLA allele(s) and peptide length(s) of interest.
- Generate a dictionary of peptides where each position is mutated to all possible amino acids.
- Use that dictionary to generate a FASTA file in the format required by pVACbind (www.pvactools.org).
- Run pVACbind in parallel across the different HLA allele(s) and peptide length(s).
  - Note that you will likely have to run each combination in a separate command (we provide the scripts we used on our own cluster for your adaptation).
- Assemble prediction results and calculate the anchor scores for each position of each peptide (please refer to helper functions in Anchor Position Calculation.ipynb).
- This process can be done on an individual peptide-HLA combination basis, but you can also aggregate and average across multiple peptides (of the same length and for the same HLA allele) for an overall score.
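
The last two steps can be sketched as follows. This is one plausible scoring under stated assumptions, not the helper functions in Anchor Position Calculation.ipynb: it assumes you already have a predicted IC50 for the original peptide and for each single-residue variant, scores each position by the summed absolute change in IC50, and normalizes the scores to sum to 1. The function names and input layout are hypothetical.

```python
from typing import Dict, List, Tuple


def anchor_scores(
    wt_ic50: float,
    mutant_ic50: Dict[Tuple[int, str], float],
    length: int,
) -> List[float]:
    """Per-position anchor scores for one peptide-HLA combination.

    mutant_ic50 maps (1-based position, substituted amino acid) to the
    predicted IC50 of that variant. Positions where substitutions change
    binding the most receive the highest scores.
    """
    totals = [0.0] * length
    for (pos, _aa), ic50 in mutant_ic50.items():
        totals[pos - 1] += abs(ic50 - wt_ic50)
    grand_total = sum(totals)
    if grand_total == 0:
        return totals  # no substitution changed binding
    return [t / grand_total for t in totals]


def average_scores(per_peptide: List[List[float]]) -> List[float]:
    """Average score vectors across peptides of the same length and allele."""
    n = len(per_peptide)
    return [sum(column) / n for column in zip(*per_peptide)]


if __name__ == "__main__":
    # Toy 3-mer with made-up IC50 values, for illustration only.
    toy = {(1, "A"): 50.0, (2, "A"): 550.0, (2, "G"): 450.0, (3, "A"): 150.0}
    print(anchor_scores(50.0, toy, length=3))
```

Averaging only makes sense within one HLA allele and one peptide length, because position indices are not comparable across lengths.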

License

The project is licensed under the MIT license.

griffithlab/anchor_huiming_etal_2023