Commit 31d98a9

More files related to the energy calculations added to the repo.
1 parent 11773f9 commit 31d98a9

5 files changed: +102 -0 lines changed


README.txt

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+-------------------------------------- Outline of the program ---------------------------------------------
+Main script to run:
+    make.bash
+
+Supplementary scripts:
+    1) main.py
+    2) getSmilesFromFile.py
+    3) xyzFromSmiles.py
+    4) getScores.tcsh
+    5) call_combine.tcsh
+    6) combine.py
+    7) run.slurm inside a directory named energy_protein
+    8) getEnergy.py
+    9) run_template.slurm
+
+Files and directories needed from the user:
+    1) A file named "drugs.txt" that contains the initial data from the website given in the problem.
+    2) A file named "3sxr_dasatinib_removed.pdb" containing the protein without dasatinib. I could have
+       removed the need for this user input, but kept it this way.
+    3) A directory named "energy_protein" that contains a Slurm script, such as run.slurm, which submits
+       the job for protein.py.
+    4) A template file named "prm-template.prm" for the .prm input files used in the docking calculations.
+
+-------------------------------------- Summary of the program ---------------------------------------------
+All of the scripts contain comments to clarify their purpose in the program. A brief summary is provided below:
+1) The bash script main.bash calls main.py.
+    - main.py extracts the SMILES strings for the drugs from the file drugs.txt.
+    - A directory structure is created for the docking calculations, which looks like:
+      |---- rDock_inputs/
+      |    |----MoleculeName/
+    - xyz coordinates are generated from the SMILES strings using the script xyzFromSmiles.py.
+    - .prm input files are generated from prm-template.prm and stored as moleculeName_rdock.prm:
+      |---- rDock_inputs/
+      |    |----MoleculeName/
+      |    |    |----moleculeName_rdock.prm
+2) An rDock job with 10 runs per ligand is submitted for each drug molecule.
+3) getScores.tcsh is called to extract the scores from the docking output files.
+4) The scores and the molecule names are written to a file, and the molecules are sorted according to their docking scores.
+5) The shell script call_combine.tcsh is called, which submits Slurm jobs for the binding energy calculations.
+6) An energy calculation is submitted for the protein without the ligand.
+7) The Python script getEnergy.py is called to extract the energies and return binding energies in kcal/mol.
+8) The molecules are sorted according to their binding energies to obtain their ranking.
+__________________________________________________________________________________________________________
+
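
Note on step 1 of the README summary: main.py and prm-template.prm are not part of this commit, so the following is only a minimal sketch of how the per-molecule directory and .prm file could be produced. The function name `write_prm`, the placeholder token `MOLECULE_NAME`, and the one-line template are hypothetical and chosen for illustration; only the `rDock_inputs/<molecule>/<molecule>_rdock.prm` layout comes from the README.

```python
import tempfile
from pathlib import Path

# Hypothetical placeholder token; the real prm-template.prm is not shown in this commit.
PLACEHOLDER = "MOLECULE_NAME"


def write_prm(template_text: str, molecule: str, root: Path) -> Path:
    """Create rDock_inputs/<molecule>/ under root and write <molecule>_rdock.prm into it."""
    mol_dir = root / "rDock_inputs" / molecule
    mol_dir.mkdir(parents=True, exist_ok=True)
    prm_path = mol_dir / f"{molecule}_rdock.prm"
    # Substitute the molecule name into every occurrence of the placeholder.
    prm_path.write_text(template_text.replace(PLACEHOLDER, molecule))
    return prm_path


if __name__ == "__main__":
    # Illustrative one-line template, not the actual rDock parameter file.
    template = "TITLE MOLECULE_NAME\n"
    with tempfile.TemporaryDirectory() as tmp:
        path = write_prm(template, "dasatinib", Path(tmp))
        print(path.relative_to(tmp).as_posix())
        print(path.read_text(), end="")
```

Keeping the template as plain text with a single token makes it easy to inspect the generated file for each molecule before any docking job is submitted.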

energy_protein/protein.py

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
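+import numpy as np
+from rdkit import Chem
+from rdkit.Chem import AllChem
+
+# Protein structure with dasatinib removed (a user-supplied input; see README.txt).
+enzymeFile = '../3sxr_dasatinib_removed.pdb'
+m = Chem.rdmolfiles.MolFromPDBFile(enzymeFile)
+# Add explicit hydrogens and generate 3D coordinates for them.
+m = Chem.AddHs(m, addCoords=True)
+# maxIters=0 requests no minimization steps, so the call only evaluates the MMFF energy.
+res = AllChem.MMFFOptimizeMoleculeConfs(m, maxIters=0, numThreads=0)
+# Each entry of res is a (not_converged, energy) pair; save the energy of the first conformer.
+np.savetxt('energies.txt', [res[0][1]])
+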
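
Note on how the protein energy might be used: getEnergy.py and combine.py are not included in this commit, so the sketch below rests on an assumption, namely that the binding energy is the energy of the protein-ligand complex minus the sum of the separate protein and ligand energies. The function names and all numbers are hypothetical and purely illustrative; the ranking step mirrors step 8 of the README summary, under the usual convention that a more negative binding energy means stronger binding.

```python
def binding_energy(e_complex: float, e_protein: float, e_ligand: float) -> float:
    """Assumed definition: E_bind = E_complex - (E_protein + E_ligand), all in kcal/mol."""
    return e_complex - (e_protein + e_ligand)


def rank_by_binding_energy(energies: dict, e_protein: float) -> list:
    """Sort molecules from most to least negative binding energy.

    energies maps a molecule name to a (e_complex, e_ligand) pair.
    """
    scored = {
        name: binding_energy(e_complex, e_protein, e_ligand)
        for name, (e_complex, e_ligand) in energies.items()
    }
    return sorted(scored.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Made-up values for illustration only; they are not results from this repository.
    e_protein = -1000.0
    energies = {
        "ligand_b": (-1030.0, -25.0),
        "ligand_a": (-1050.0, -20.0),
    }
    for name, e_bind in rank_by_binding_energy(energies, e_protein):
        print(name, e_bind)
```

In the actual workflow, `e_protein` would presumably be the single value that protein.py writes to energies.txt.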

energy_protein/run.slurm

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+#!/bin/bash
+##### Constructed by HPC everywhere #####
+#SBATCH --mail-user=kumaranu@iu.edu
+#SBATCH --nodes=1
+#SBATCH --ntasks-per-node=1
+#SBATCH --cpus-per-task=48
+#SBATCH --time=0-3:59:00
+#SBATCH --mem=58gb
+#SBATCH --partition=general
+#SBATCH --mail-type=FAIL,BEGIN,END
+#SBATCH --job-name=my_job
+
+###### Module commands #####
+module unload python
+module load anaconda/python3.8/2020.07
+
+conda activate docking-rdock
+
+###### Job commands go below this line #####
+python protein.py
+

energy_protein/slurm-2397773.out

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+anaconda version 2020.07 loaded.

run_template.slurm

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+#!/bin/bash
+##### Constructed by HPC everywhere #####
+#SBATCH --mail-user=kumaranu@iu.edu
+#SBATCH --nodes=1
+#SBATCH --ntasks-per-node=1
+#SBATCH --cpus-per-task=48
+#SBATCH --time=0-3:59:00
+#SBATCH --mem=58gb
+#SBATCH --partition=general
+#SBATCH --mail-type=FAIL,BEGIN,END
+#SBATCH --job-name=XX
+
+###### Module commands #####
+module unload python
+module load anaconda/python3.8/2020.07
+
+conda activate docking-rdock
+
+###### Job commands go below this line #####
+python combine.py XX
+
+
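
Note on the `XX` placeholder: the template uses `XX` for both the job name and the argument to combine.py, which suggests that it is replaced by a molecule name before submission. call_combine.tcsh is not part of this commit, so the following is only a sketch of that substitution under this assumption; the function name `render_slurm` and the molecule name are hypothetical.

```python
def render_slurm(template_text: str, molecule: str) -> str:
    """Return the template with every occurrence of the XX placeholder replaced.

    Note that this replaces all occurrences of "XX", so it assumes the
    template contains that string only where the molecule name belongs.
    """
    return template_text.replace("XX", molecule)


if __name__ == "__main__":
    # Only the two template lines that contain the placeholder are reproduced here.
    template = (
        "#SBATCH --job-name=XX\n"
        "python combine.py XX\n"
    )
    print(render_slurm(template, "imatinib"), end="")
```

The rendered text could then be written to a per-molecule script and submitted with `sbatch`; that step is left out so the sketch runs without a Slurm installation.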
