This repository provides a graph neural network–based transfer learning pipeline to predict the activity of compounds on the Estrogen Receptor (ER) using the ToxCast dataset.
cd model
- Python 3.9 or higher
- Install dependencies:
pip install -r requirements.txt
- data/: Contains data loaders that convert SMILES strings into graph representations and split them into training and evaluation sets
- models/: Implements GIN, GCN, GAT, and their hybrid variants
- train/: Training routines such as pretrain.py, finetune.py, and target_only.py
- config/: The config.py file where data paths and hyperparameters are defined
- run_single_pipeline.sh: A shell script that runs the full pretraining and fine-tuning pipeline for a single source/target combination
- launcher.sh: A Slurm launcher script to execute multiple combinations in parallel on an HPC cluster
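As a rough illustration of the split step performed by the loaders in data/, the sketch below shuffles a list of SMILES strings with a fixed seed and divides it into training and evaluation subsets. The function name `split_smiles`, the 80/20 ratio, and the seed are assumptions made for this example. The repository's actual loaders may split differently (for example by scaffold), and they also convert each molecule into a graph, which is not shown here.

```python
import random


def split_smiles(smiles, train_fraction=0.8, seed=0):
    """Shuffle SMILES strings reproducibly and split them into two lists.

    Hypothetical helper for illustration; not taken from the repository.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(smiles)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    compounds = ["CCO", "c1ccccc1", "CC(=O)O", "CCN", "C1CCCCC1"]
    train, evaluation = split_smiles(compounds)
    print(len(train), "training /", len(evaluation), "evaluation")
```

Fixing the seed keeps the same compounds in the evaluation set across runs, which makes comparisons between model variants easier to interpret.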
- After installing dependencies, set the environment variables:
export SOURCE_NAME=TOX21_ERa_LUC_VM7_Agonist
export TARGET_NAME=ATG_ERE_CIS
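For readers wondering how these variables might be consumed, here is a minimal sketch of reading SOURCE_NAME and TARGET_NAME from the environment and failing early when either is unset. The helper name `read_assay_names` is hypothetical, and the actual config.py may handle these values differently.

```python
import os


def read_assay_names(env=os.environ):
    """Return (source, target) assay names taken from environment variables.

    Hypothetical helper; the real config.py may read these differently.
    """
    missing = [key for key in ("SOURCE_NAME", "TARGET_NAME") if not env.get(key)]
    if missing:
        raise RuntimeError("unset environment variable(s): " + ", ".join(missing))
    return env["SOURCE_NAME"], env["TARGET_NAME"]


if __name__ == "__main__":
    # Example values copied from the export commands above.
    example = {
        "SOURCE_NAME": "TOX21_ERa_LUC_VM7_Agonist",
        "TARGET_NAME": "ATG_ERE_CIS",
    }
    print(read_assay_names(example))
```

Checking for missing variables up front gives a clear error before any training starts, rather than a failure partway through a run.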
- Choose a model and run the pipeline. Supported models: GIN, GCN, GAT, GIN_GCN, GIN_GAT, GCN_GAT.
python main.py --model GIN
or, if you use Slurm:
bash launcher.sh
When finished, the trained model will be saved under model/model_save/.
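The --model flag used above accepts one of six names. A command-line parser restricted to those names could look roughly like the following sketch. The function `parse_args`, the `required=True` setting, and the description text are assumptions for illustration; main.py may define its options differently or accept additional flags.

```python
import argparse

# Model names as listed in this README.
SUPPORTED_MODELS = ["GIN", "GCN", "GAT", "GIN_GCN", "GIN_GAT", "GCN_GAT"]


def parse_args(argv=None):
    """Parse a --model flag restricted to the supported names.

    Illustrative only; main.py may define additional options.
    """
    parser = argparse.ArgumentParser(description="Select a GNN architecture")
    parser.add_argument("--model", choices=SUPPORTED_MODELS, required=True)
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args(["--model", "GIN_GAT"])
    print(args.model)
```

With `choices` set, argparse rejects an unsupported name with a usage message and a non-zero exit code, so a typo in the model name is caught before any data is loaded.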
This project is licensed under the terms of the MIT License.