ActiveTCR is a unified framework designed to minimize annotation cost while maximizing the predictive performance of T-cell receptor (TCR)-epitope binding affinity prediction models. It incorporates active learning to iteratively search for the most informative unlabeled TCR-epitope pairs, reducing annotation cost and redundancy. Comparing four query strategies against a random sampling baseline, ActiveTCR achieves substantial cost reduction and improved performance in TCR-epitope binding affinity prediction. To our knowledge, it is the first systematic investigation of data optimization in the context of TCR-epitope binding affinity prediction.
An Active Learning Framework for Cost-Effective TCR-Epitope Binding Affinity Prediction
Pengfei Zhang1,2, Seojin Bang3, Heewook Lee1,2, *
1 School of Computing and Augmented Intelligence, Arizona State University, 2 Biodesign Institute, Arizona State University, 3 Google DeepMind
Accepted for publication: IEEE BIBM 2023
Paper | Code | Poster | Slides | Presentation (YouTube)
- Use case a: reducing annotation cost by more than 40% for unlabeled TCR-epitope pools.
- Use case b: reducing redundancy by more than 40% among already-annotated TCR-epitope pairs.
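The query strategies score each unlabeled pair by how informative annotating it would be. Below is a minimal sketch of one of them, entropy sampling (used in the examples further down), assuming a Keras-style binary classifier whose `predict` returns binding probabilities; the function name and signature are illustrative, not ActiveTCR's internal API.

```python
import numpy as np

def entropy_sampling(model, unlabeled_pairs, n_query):
    """Return indices of the n_query most uncertain TCR-epitope pairs."""
    # Assumed: model.predict returns P(binding) in [0, 1] for each pair.
    p = model.predict(unlabeled_pairs).reshape(-1)
    p = np.clip(p, 1e-12, 1 - 1e-12)                   # guard against log(0)
    entropy = -(p * np.log(p) + (1 - p) * np.log(1 - p))
    return np.argsort(entropy)[::-1][:n_query]         # most uncertain first
```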
- Linux
- Python 3.6.13
- Keras 2.6.0
- TensorFlow 2.6.0
```bash
git clone https://github.com/Lee-CBG/ActiveTCR
cd ActiveTCR/
conda create --name bap python=3.6.13
source activate bap
pip install -r requirements.txt
```
- Download training and testing data from the datasets folder.
- Obtain embeddings for TCRs and epitopes following the instructions of catELMo, or download the embeddings directly from Dropbox.
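The snippet below is a hypothetical illustration of turning downloaded embeddings into model inputs by concatenating each TCR embedding with its epitope embedding. The file path, dictionary keys, and storage format are assumptions, not the actual layout of the downloaded files.

```python
import pickle
import numpy as np

# Hypothetical path and keys -- adjust to however the downloaded files are organized.
with open("embeddings/catelmo_train.pkl", "rb") as f:
    data = pickle.load(f)

# Assumed: one catELMo embedding per TCR and per epitope, plus binary labels.
X = np.concatenate([np.asarray(data["tcr_embeds"]),
                    np.asarray(data["epi_embeds"])], axis=1)
y = np.asarray(data["labels"])
print(X.shape, y.shape)   # e.g. (n_pairs, 2048) and (n_pairs,)
```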
An example of use case a of ActiveTCR: reducing annotation cost for unlabeled TCR-epitope pools.
```bash
python -W ignore main.py \
    --split epi \
    --active_learning True \
    --query_strategy entropy_sampling \
    --train_strategy retrain \
    --query_balanced unbalanced \
    --gpu 0 \
    --run 0
```
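Conceptually, use case a runs a query-annotate-retrain loop. The sketch below is an illustrative outline only (reusing the `entropy_sampling` sketch above); `build_model` and `annotate` are hypothetical placeholders for the binding-affinity model constructor and the human annotation step, not functions from this repository.

```python
import numpy as np

def active_learning_loop(build_model, X_train, y_train, X_pool, annotate,
                         n_rounds=10, n_query=1000):
    """Query-annotate-retrain loop; annotate() stands in for human labeling."""
    for _ in range(n_rounds):
        model = build_model()                           # retrain from scratch each round
        model.fit(X_train, y_train, verbose=0)
        idx = entropy_sampling(model, X_pool, n_query)  # sketch shown earlier
        X_new, y_new = X_pool[idx], annotate(X_pool[idx])
        X_train = np.concatenate([X_train, X_new])
        y_train = np.concatenate([y_train, y_new])
        X_pool = np.delete(X_pool, idx, axis=0)         # remove queried pairs from the pool
    return model
```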
An example of use case b of ActiveTCR: minimizing redundancy among labeled TCR-epitope pairs.
```bash
python -W ignore main.py \
    --split epi \
    --query_strategy entropy_sampling \
    --train_strategy retrain \
    --query_balanced unbalanced \
    --gpu 1 \
    --run 0
```
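Use case b applies the same informativeness ranking to pairs that are already labeled, so redundant examples can be dropped before training. The sketch below only illustrates that idea (again reusing the `entropy_sampling` sketch above); it is not ActiveTCR's actual procedure, and the keep fraction is an arbitrary example.

```python
def prune_redundant(model, X_labeled, y_labeled, keep_fraction=0.6):
    """Keep only the most informative labeled pairs; drop the redundant rest."""
    n_keep = int(keep_fraction * len(X_labeled))
    idx = entropy_sampling(model, X_labeled, n_keep)   # sketch shown earlier
    return X_labeled[idx], y_labeled[idx]
```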
If you use this code or our catELMo embeddings in your research, please cite our paper:
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.