The official code implementation for Condition-Specific Gene-Gene Attention (
Here, we provide code for pretraining our network on transcriptome data (including the LINCS L1000 dataset) and fine-tuning it on cell viability data (including the GDSC dataset).
The full model architecture is shown below.
Step 1. Pretraining of condition-specific response on LINCS L1000 dataset
Step 2. Fine-tuning of cell viability response on GDSC dataset
First, clone this repository and move to the directory.
git clone https://github.com/eugenebang/CSG2A.git
cd CSG2A/
To install the appropriate environment for
After installing conda and placing the conda executable in PATH, the following command creates a conda environment named csg2a. Setup takes up to about 10 minutes, though this varies with the Internet connection and package cache status.
conda env create -f environment.yaml && \
conda activate csg2a
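As a quick sanity check after activation, the sketch below reads the `CONDA_DEFAULT_ENV` variable that conda sets when an environment is activated and compares it with the name csg2a. The helper names `active_conda_env` and `is_csg2a_active` are hypothetical and are not part of this repository.

```python
import os


def active_conda_env(environ=os.environ):
    """Return the name of the active conda environment, or None if none is active."""
    # conda exports CONDA_DEFAULT_ENV on `conda activate <name>`.
    return environ.get("CONDA_DEFAULT_ENV")


def is_csg2a_active(environ=os.environ):
    """True only when the active conda environment is named 'csg2a'."""
    return active_conda_env(environ) == "csg2a"


if __name__ == "__main__":
    print("csg2a active:", is_csg2a_active())
```

If this prints `False` inside a notebook, the kernel was probably started from a different environment.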
Alternatively, if you already have a virtual environment with a PyTorch version suited to your hardware (including GPU and CUDA), you can install the necessary packages listed below with pip before running the model.
To check whether
Sample code to fine-tune the IC50 prediction model and evaluate its performance is provided in finetune_GDSC.ipynb.
- The file formats for each input file can be found here.
Code for pretraining is provided in pretrain_LINCS.ipynb.
We also provide the LINCS L1000-pretrained weights, which can be used for fine-tuning on cell viability drug response prediction tasks.
We provide two versions: one trained on the LINCS L1000 Landmark genes (978 in total) and one trained on the LINCS L1000 bing (inferred) genes (10,167 in total). All experimental results in the manuscript are reported using the Landmark genes.
- Landmark pretrained model (Approx. 170MB; link)
- Bing (inferred) pretrained model (Approx. 600MB; link)
The pretrained MAT weights needed for training from scratch can be obtained from the original author's repository.
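Because the two pretrained versions expect different numbers of genes, it can help to check the width of an expression matrix before loading a checkpoint. The following is a minimal sketch under stated assumptions: the keys `"landmark"` and `"bing"` and the function `check_expression_width` are hypothetical names, not part of this repository, and each row is assumed to hold one value per gene. Only the gene counts (978 and 10,167) come from the text above.

```python
# Gene counts stated above for the two pretrained versions.
GENE_SET_SIZES = {"landmark": 978, "bing": 10167}


def check_expression_width(rows, gene_set):
    """Return the expected gene count, raising ValueError on any mismatched row.

    rows: iterable of per-sample expression vectors (one value per gene).
    gene_set: "landmark" or "bing" (hypothetical labels for the two versions).
    """
    expected = GENE_SET_SIZES[gene_set]
    for i, row in enumerate(rows):
        if len(row) != expected:
            raise ValueError(
                f"row {i} has {len(row)} genes; "
                f"the {gene_set} model expects {expected}"
            )
    return expected
```

For example, `check_expression_width(profiles, "landmark")` raises a `ValueError` as soon as a profile does not have 978 values, which surfaces a gene-set mismatch before any weights are loaded.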
Operating system
Prerequisites
- python=3.10
- pytorch=2.0.1
- rdkit=2022.09.5
- numpy=1.24.1
- pandas=2.1.1
- scipy=1.11.3
- tqdm=4.66.1
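When installing with pip into an existing environment, the versions listed above can be compared with what is actually installed. The sketch below uses the standard-library `importlib.metadata`. The distribution names are assumptions (PyTorch is published on pip as `torch`), and `check_versions` is a hypothetical helper, not part of this repository. Versions are normalized before comparison so that a local build tag such as `+cu117` or a zero-padded component such as `09` does not produce a false mismatch.

```python
from importlib import metadata

# Versions from the prerequisites list; keys are assumed pip distribution names.
REQUIRED = {
    "torch": "2.0.1",
    "rdkit": "2022.09.5",
    "numpy": "1.24.1",
    "pandas": "2.1.1",
    "scipy": "1.11.3",
    "tqdm": "4.66.1",
}


def normalize(version):
    """Drop a local tag ('+cu117') and leading zeros in numeric components."""
    public = version.split("+")[0]
    parts = public.split(".")
    return ".".join(str(int(p)) if p.isdigit() else p for p in parts)


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_versions(required, lookup=installed_version):
    """Map each package to 'ok', 'missing', or 'mismatch (found X)'."""
    report = {}
    for name, wanted in required.items():
        found = lookup(name)
        if found is None:
            report[name] = "missing"
        elif normalize(found) == normalize(wanted):
            report[name] = "ok"
        else:
            report[name] = f"mismatch (found {found})"
    return report


if __name__ == "__main__":
    for name, status in check_versions(REQUIRED).items():
        print(f"{name:8s} {status}")
```

A mismatch does not necessarily mean the code will fail; it only flags a difference from the versions listed here.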
The source code of
However, any data or content produced from using
