This pipeline is built for high-throughput generation of homology-models of a sets of enzymes with only the sequence information.
The modeling method is based on the Basic Modeling tutorial of Modeller, and the template selection rule is just to select the structure with the highest sequence identity to the target sequence.
The default e-value cutoff is 0.01.
If no template is found,the target id will write to non_template.txt
.In this case,you should use other methods(such as hhblits) to search for remote templates.
Biopython==1.78
Modeller==9.25
- All PDB sequences:
- pdball.pir.gz
https://salilab.org/modeller/supplemental.html
Your workspace should look like this:
The input.fasta file can contain multiple sequence.
.
├── input.fasta
├── align2d.py
├── build_profile.py
├── download_pdbchain.py
├── model_single.py
├── pdball.bin
├── rename.sh
└── start-modeling.sh
0 directories, 8 files
start automated modelling:
bash start-modeling.sh input.fasta