UnifyImmun: a unified cross-attention model for prediction of antigen binding specificity

UnifyImmun is an advanced computational model that predicts the binding specificity of antigens to both HLA and TCR molecules. By employing a unified cross-attention mechanism, UnifyImmun provides a comprehensive evaluation of antigen immunogenicity, which is crucial for the development of effective immunotherapies.

Web Server: http://hliulab.tech/unifylmmun/

Key features

Unified model: Simultaneously predicts peptide binding to both HLA and TCR molecules.
Cross-attention mechanism: Integrates the features of peptides and HLA/TCR molecules, which also aids model interpretability.
Progressive training strategy: Uses two-phase progressive training to improve feature extraction and model generalizability.
Virtual adversarial training: Enhances model robustness by training on perturbed data.
Superior performance: Outperforms existing methods on both pHLA and pTCR prediction tasks on multiple datasets.
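
To make the cross-attention feature above more concrete, here is a minimal pure-Python sketch of generic scaled dot-product cross-attention. It is not the UnifyImmun implementation (which is written in PyTorch); the function names, the toy 2-dimensional vectors, and the absence of learned projections are all assumptions made for illustration. The idea it shows is that each peptide residue (query) receives one weight per HLA or TCR residue (key), and that weight matrix is what can be inspected for interpretability.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (illustrative only).

    queries: one vector per peptide residue.
    keys, values: one vector per HLA or TCR residue.
    Returns (outputs, weights); weights[i][j] is how strongly
    query i attends to key j.
    """
    dim = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        w = softmax(scores)
        out = [
            sum(wj * v[i] for wj, v in zip(w, values))
            for i in range(len(values[0]))
        ]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Toy embeddings (hypothetical): 2 peptide residues, 3 HLA residues.
peptide = [[1.0, 0.0], [0.0, 1.0]]
hla = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = cross_attention(peptide, hla, hla)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row sums to 1 and can be read as a per-residue attention profile over the partner molecule.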

For inquiries or collaborations, please contact: hliu@njtech.edu.cn

System requirements

Linux kernel version: 4.18.0-193 (tested on CentOS)
GPU: NVIDIA GeForce RTX 4090 (or compatible GPUs)
CUDA Version: 12.4
Python: 3.10
PyTorch: 2.2.1 (model implementation)

Installation guide

Clone the UnifyImmun repository

git clone https://github.com/hliulab/unifyimmun.git

Enter UnifyImmun project folder

cd unifyimmun/

Set up the Python environment and install the required packages

pip install -r requirements.txt

Instructions for use

The training data for pHLA and pTCR binding is stored in the data folder. The source code of the UnifyImmun model, along with the training and testing scripts, is in the source folder. The trained models are stored in the trained_model folder.

Input data format

The input data should be a CSV file with three columns named tcr, peptide, and HLA, representing the TCR CDR3 sequence, peptide sequence, and HLA sequence, respectively.
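
Before running the scripts on your own data, it can help to check that a file follows this layout. The sketch below is a minimal example using only the Python standard library. The function name validate_rows, the requirement that every cell be non-empty, and the restriction to the 20 standard amino-acid letters are assumptions for illustration and are not part of the UnifyImmun code; the sequences in the demo are made up.

```python
import csv
import io

REQUIRED_COLUMNS = ("tcr", "peptide", "HLA")
# Assumption: sequences use only the 20 standard amino-acid letters.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def validate_rows(handle):
    """Read a CSV file-like object and return (rows, problems).

    problems is a list of (line_number, column, value) tuples for cells
    that are empty or contain non-amino-acid characters.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        raise ValueError("missing required column(s): " + ", ".join(missing))

    rows, problems = [], []
    # Line 1 is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        for col in REQUIRED_COLUMNS:
            seq = (row[col] or "").strip().upper()
            if not seq or not set(seq) <= AMINO_ACIDS:
                problems.append((line_no, col, seq))
            row[col] = seq
        rows.append(row)
    return rows, problems


# Illustrative, made-up sequences; the second data row has an invalid TCR.
demo = io.StringIO(
    "tcr,peptide,HLA\n"
    "CASSLGQAYEQYF,GILGFVFTL,YFAMYGEKVAHTHVDT\n"
    "CASS1,GILGFVFTL,YFAMYGEKVAHTHVDT\n"
)
rows, problems = validate_rows(demo)
print(len(rows), "rows read;", len(problems), "problem(s):", problems)
```

To check a file on disk, pass an open handle instead, for example open("my_data.csv", newline=""), where my_data.csv is a placeholder name.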

Model training

To run all training steps in sequence, use the provided Python script run_all_phases.py. After installing the required environment and dependencies, run the following commands:

cd source

python run_all_phases.py

Model testing

Given a fine-tuned model or our trained model (saved in the trained_model folder), you can evaluate it on our demo test sets using the following scripts.

Evaluate HLA binding specificity using the pHLA test set

cd source

python HLA_test.py

Evaluate TCR binding specificity using the pTCR test set

cd source

python TCR_test.py

Output predicted scores

Given a fine-tuned model or our trained model (saved in the trained_model folder), you can output predicted scores for the demo test sets using the following scripts.

Output predicted scores for HLA binding specificity using the pHLA test set

cd source

python HLA_output_score.py

Output predicted scores for TCR binding specificity using the pTCR test set

cd source

python TCR_output_score.py

In our experience, running the two demos above takes about 2 minutes with batch_size=8192.

Hyperparameter adjustment

If you fine-tune the model on a custom dataset, you may need to adjust the hyperparameters within the Python scripts. Hyperparameters include the learning rate, batch size, number of epochs, and other model-specific parameters.
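
If you adjust these values often, one option is to gather them in a single place rather than editing each script. The sketch below shows one possible pattern using a dataclass with command-line overrides. The class TrainConfig, the function parse_config, the flag names, and the default values are all hypothetical; they are not the names or settings used in the UnifyImmun scripts.

```python
import argparse
from dataclasses import asdict, dataclass


@dataclass
class TrainConfig:
    # Hypothetical defaults for illustration only; check the scripts in
    # the source folder for the values actually used.
    learning_rate: float = 1e-4
    batch_size: int = 1024
    epochs: int = 50


def parse_config(argv=None):
    """Build a TrainConfig from command-line style arguments."""
    defaults = TrainConfig()
    parser = argparse.ArgumentParser(
        description="Illustrative hyperparameter overrides"
    )
    parser.add_argument("--learning-rate", type=float,
                        default=defaults.learning_rate)
    parser.add_argument("--batch-size", type=int,
                        default=defaults.batch_size)
    parser.add_argument("--epochs", type=int, default=defaults.epochs)
    args = parser.parse_args(argv)
    return TrainConfig(
        learning_rate=args.learning_rate,
        batch_size=args.batch_size,
        epochs=args.epochs,
    )


# Override two values and keep the default learning rate.
cfg = parse_config(["--batch-size", "256", "--epochs", "10"])
print(asdict(cfg))
```

A training script could then read cfg.learning_rate, cfg.batch_size, and cfg.epochs instead of hard-coded constants.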

Note: Ensure that the file paths and script names provided in the commands match those in your project directory. The source/ directory and script names like HLA_test.py and TCR_test.py are placeholders and should be replaced with the actual paths and filenames used in your implementation.

Customizing output

To customize the output results, users can modify the parameters within each script. Detailed comments within the code provide descriptions and guidance for parameter adjustments.

Support

For further assistance, bug reports, or to request new features, please contact us at hliu@njtech.edu.cn or open an issue on the GitHub repository page.

