SelfPAD:

Author: Talip Ucar (ucabtuc@gmail.com)

The official implementation of Improving Antibody Humanness Prediction using Patent Data

Model

Pre-training	Fine-tuning

Environment

We used Python 3.7 for our experiments. The environment can be set up by following three steps:

pip install pipenv             # To install pipenv if you don't have it already
pipenv install --skip-lock     # To install required packages. 
pipenv shell                   # To activate virtual env

If the second step results in issues, you can install packages in Pipfile individually by using pip i.e. "pip install package_name".

Configuration

There are two types of configuration files:

1. pad.yaml         # Defines parameters and options for pre-training
2. humanness.yaml   # Defines parameters and options for fine-training

Training and Evaluation

You can train and evaluate the model by using:

python selfpad_pretrain.py        # For pre-training
python selfpad_finetune.py        # For fine-tuning it for humanness
python selfpad_eval.py -ev test    # To compute humanness score for custome dataset, in this case it is test.csv. CSV file should have "VH", "VL" and/or "Label" columns

Structure of the repo

- selfpad_pretrain.py
- selfpad_finetune.py
- selfpad_eval.py

- src
    |-selfpad.py
    |-selfpad_humanness.py

- config
    |-pad.yaml
    |-humanness.yaml
    
- utils_common
    |-arguments.py
    |-utils.py
    |-tokenizer.py
    ...
    
- utils_pretrain
    |-load_data.py
    |-model_utils.py
    |-loss_functions.py
    ...
    
- utils_finetune
    |-load_data.py
    |-model_utils.py
    |-loss_functions.py
    ...
    
- data
    |-test.csv
    ...
    
- results
    |-pretraining
    |-humanness
    ...

Results

Results at the end of training is saved under ./results directory. Results directory structure is as following:

- results
    |-task e.g. humanness, or pretraining
            |-evaluation
                |-clusters (for plotting t-SNE and PCA plots of embeddings)
            |-training
                |-model
                |-plots
                |-loss

You can save results of evaluations under "evaluation" folder.

Experiment tracking

You can turn on Weight and Biases (W&B) in the config file for logging

Citing the paper

@article{ucar2024SelfPAD,
  title={Improving Antibody Humanness Prediction using Patent Data},
  author={Ucar, Talip and 
          Ramon, Aubin and 
          Oglic, Dino and 
          Croasdale-Wood, Rebecca and 
          Diethe, Tom and 
          Sormanni, Pietro},
  journal={arXiv preprint arXiv:2110.04361},
  year={2024}
}

Citing this repo

If you use SelfPAD framework in your own studies, and work, please cite it by using the following:

@Misc{talip_ucar_2024_SelfPAD,
  author =   {Talip Ucar},
  title =    {{Improving Antibody Humanness Prediction using Patent Data}},
  howpublished = {\url{https://github.com/AstraZeneca/SelfPAD}},
  month        = January,
  year = {since 2024}
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SelfPAD:

Author: Talip Ucar (ucabtuc@gmail.com)

Table of Contents:

Model

Environment

Configuration

Training and Evaluation

Structure of the repo

Results

Experiment tracking

Citing the paper

Citing this repo

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
assets		assets
config		config
data		data
src		src
utils_common		utils_common
utils_finetune		utils_finetune
utils_pretrain		utils_pretrain
.gitignore		.gitignore
LICENSE		LICENSE
Pipfile		Pipfile
README.md		README.md
requirements.txt		requirements.txt
selfpad_eval.py		selfpad_eval.py
selfpad_finetune.py		selfpad_finetune.py
selfpad_pretrain.py		selfpad_pretrain.py

License

AstraZeneca/SelfPAD

Folders and files

Latest commit

History

Repository files navigation

SelfPAD:

Author: Talip Ucar (ucabtuc@gmail.com)

Table of Contents:

Model

Environment

Configuration

Training and Evaluation

Structure of the repo

Results

Experiment tracking

Citing the paper

Citing this repo

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages