This repository contains a trained Graph Neural Network (GNN)-based pipeline for detecting Nuclear Export Signals (NES) from protein 3D structures in .pdb format.
- A protein .pdb file containing the 3D structure, with the NES (chain B) and surrounding regions.
- Classification result: NES POSITIVE or NES NEGATIVE
- Confidence score (value between 0 and 1)
Ensure you have the following installed:
- Python ≥ 3.8
- torch, torch_geometric, biopython, sklearn, matplotlib, tqdm
To install dependencies:
pip install -r requirements.txt
- Place your .pdb file under any directory (e.g. examples/my_protein.pdb).
- Run the classifier:
python user_main.py examples/my_protein.pdb
- Example output:
Prediction: NES POSITIVE
Confidence score: 0.843
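When the classifier is run over many structures, it can help to turn the printed result into values a script can use. The sketch below parses the two-line format from the example output above. The function name `parse_prediction` and the regular expressions are illustrative assumptions, not part of this repository, and would need adjusting if the real output differs.

```python
import re
from typing import Tuple

# Patterns mirror the example output shown above (an assumption about the exact format).
_LABEL_RE = re.compile(r"Prediction:\s*(NES (?:POSITIVE|NEGATIVE))")
_SCORE_RE = re.compile(r"Confidence score:\s*([0-9]*\.?[0-9]+)")


def parse_prediction(output: str) -> Tuple[str, float]:
    """Extract (label, confidence) from the classifier's printed output."""
    label = _LABEL_RE.search(output)
    score = _SCORE_RE.search(output)
    if label is None or score is None:
        raise ValueError("output does not match the expected two-line format")
    return label.group(1), float(score.group(1))


example = "Prediction: NES POSITIVE\nConfidence score: 0.843\n"
print(parse_prediction(example))  # ('NES POSITIVE', 0.843)
```

The text to parse could be captured with the standard library, for example `subprocess.run([sys.executable, "user_main.py", path], capture_output=True, text=True).stdout`.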
This tool uses a Graph Neural Network (GNN) to classify proteins based on the spatial proximity and amino acid types of their residues.
- Graph Construction: Each residue becomes a node, and edges are built between nearby residues (within 8 Å). Only chain B (NES region) and its surrounding residues (within 15 Å) are included in the graph.
- Node Features: Each node has a one-hot vector representing the amino acid type and a binary flag indicating if it belongs to the NES chain.
- Model: The default model is EGNN (Equivariant GNN), trained on labeled positive/negative NES proteins. GCN is also supported.
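The graph construction and node features described above can be sketched in plain Python. This is a minimal illustration, not the repository's implementation. It assumes one representative coordinate per residue (for example the Cα atom), a one-hot encoding over the 20 standard amino acids, inclusive distance cutoffs, and edges stored in both directions. The names `build_graph`, `AMINO_ACIDS`, and `Residue` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# Assumed vocabulary: the 20 standard amino acids (three-letter codes).
AMINO_ACIDS = ["ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
               "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL"]

# (chain_id, residue_name, xyz); xyz is one representative atom per residue (assumption).
Residue = Tuple[str, str, Tuple[float, float, float]]


def build_graph(residues: Sequence[Residue], nes_chain: str = "B",
                edge_cutoff: float = 8.0, context_cutoff: float = 15.0):
    """Return (node_features, edges) for the NES chain and its surroundings."""
    nes_coords = [xyz for chain, _, xyz in residues if chain == nes_chain]

    # Keep NES residues plus any residue within context_cutoff of an NES residue.
    kept: List[Residue] = [
        r for r in residues
        if r[0] == nes_chain
        or any(math.dist(r[2], c) <= context_cutoff for c in nes_coords)
    ]

    # Node features: 20-dim one-hot amino acid type + 1 binary NES-chain flag.
    # Non-standard residue names get an all-zero one-hot part.
    features = []
    for chain, name, _ in kept:
        one_hot = [1.0 if name == aa else 0.0 for aa in AMINO_ACIDS]
        features.append(one_hot + [1.0 if chain == nes_chain else 0.0])

    # Edges between distinct kept residues within edge_cutoff, in both directions.
    edges = [
        (i, j)
        for i in range(len(kept))
        for j in range(len(kept))
        if i != j and math.dist(kept[i][2], kept[j][2]) <= edge_cutoff
    ]
    return features, edges


toy = [
    ("B", "LEU", (0.0, 0.0, 0.0)),
    ("B", "ILE", (3.8, 0.0, 0.0)),
    ("A", "GLY", (0.0, 6.0, 0.0)),   # close to the NES: kept and connected to it
    ("A", "ALA", (0.0, 12.0, 0.0)),  # within 15 Å of the NES, but only linked to GLY
    ("A", "VAL", (0.0, 40.0, 0.0)),  # farther than 15 Å from chain B: dropped
]
features, edges = build_graph(toy)
print(len(features), len(features[0]), len(edges))  # 4 21 8
```

In a real pipeline the residue tuples would come from a PDB parser such as Biopython, and the feature and edge lists would be converted to tensors before being passed to the model.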
To retrain or experiment with parameters, run:
python run.py
You can modify run.py to set batch_size, epochs, learning_rate, hidden_dim, dropout, etc.
Model checkpoints are saved under Hackaton/.
After training, the model generates:
- ROC Curve: Hackaton/roc_curve.png
- Boxplot: Hackaton/boxplot.png
These plots help visualize how well the model separates NES-positive from NES-negative samples.
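The separation an ROC curve shows can also be summarized as a single number, the area under the curve (AUC). The sketch below computes it directly as the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as half. The function name `roc_auc` and the labels and scores are made-up illustrations, not results from this model.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = NES positive) and confidence scores, for illustration only.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.6, 0.5, 0.1]
print(round(roc_auc(labels, scores), 3))  # 0.889
```

An AUC of 1.0 means every positive sample scores above every negative one, while 0.5 corresponds to no separation. The pairwise loop is quadratic in the number of samples, so for large evaluation sets a library routine such as scikit-learn's `roc_auc_score` is the more practical choice.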
This model was developed as part of a protein bioinformatics Hackathon for NES signal detection using 3D structural information and deep learning.