Replies: 4 comments
-
LAPACK is dealing with those through SciPy. I guess it is due to the diagonalizer. Currently, we have an effort to build a GPU-based eigenvalue decomposition for large molecules with MAGMA. It will be online with 1.9.
-
This is still in progress. There's a GPU implementation that is pretty much complete but not integrated yet.
-
You could maybe try using Dask. This package handles much larger arrays and splits the computations into parallel units. It has a singular value decomposition (https://docs.dask.org/en/latest/array-api.html?highlight=linalg#dask.array.linalg.svd) that you could perhaps use for the matrix decomposition. I have also thought about incorporating it, but never got a chance.
Alternatively, you could try coarse-graining further. One option could be to take every second or third residue, or even larger gaps (see https://pubmed.ncbi.nlm.nih.gov/11913377/ where this seems to work up to 1/40 of the residues using a shifting scheme), or to take the average position of every few residues. Another option is more rigorous Markov methods for hierarchical coarse-graining (https://pubmed.ncbi.nlm.nih.gov/17691893/), although this might be more challenging. Another alternative could be to use our new interface CryoDy to obtain a coarse-grained elastic network directly from a cryo-EM map (see http://prody.csb.pitt.edu/tutorials/cryoem_tutorial/).
Best wishes
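A minimal sketch of the Dask route, not part of ProDy: the matrix size, chunking, and the random symmetric matrix standing in for the real Hessian are all placeholders for illustration.

import numpy as np
import dask.array as da

# Stand-in for the dense Hessian (in practice this would come from anm.getHessian()).
n = 3000
rng = np.random.default_rng(0)
m = rng.standard_normal((n, n))
hessian = (m + m.T) / 2

# Chunk along a single dimension only, which is what dask's exact SVD expects;
# for arrays chunked in both dimensions, da.linalg.svd_compressed is the alternative.
H = da.from_array(hessian, chunks=(500, n))

u, s, vt = da.linalg.svd(H)   # builds a lazy task graph
s = s.compute()               # triggers the parallel computation of the singular values

# For a symmetric Hessian, the singular values equal the absolute eigenvalues,
# so s can stand in for the ANM eigenvalues (up to ordering and vector signs).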
-
I was just trying this yesterday. I think the best option is to use buildHessian with sparse=True and calcModes with turbo=False. It may also help to limit the number of CPUs. What I did was along the lines of the sketch below.
I've found that limiting the number of threads can also speed up the calculations sometimes.
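A minimal sketch of that approach (the exact snippet is not reproduced here); the file name, thread count, and mode count are placeholders.

import os

# Cap BLAS/OpenMP threads; these variables must be set before numpy/scipy are imported.
os.environ["OMP_NUM_THREADS"] = "4"
os.environ["MKL_NUM_THREADS"] = "4"

from prody import parsePDB, ANM

atoms = parsePDB('structure.pdb').select('protein and name CA')

anm = ANM('large system')
anm.buildHessian(atoms, sparse=True)    # scipy.sparse Hessian: much smaller memory footprint
anm.calcModes(n_modes=20, turbo=False)  # turbo=False as suggested above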
-
Dear all,
I am trying to perform ANM calculations on a tetrameric membrane protein (PDB ID 3J8H, chains A, C, E, G) using only CA atoms, which amount to a total of 14640 atoms. At our university we have a computer cluster with up to 1 TB of RAM, so for testing purposes I am booking up to 800 GB, but the job crashes with a segmentation fault (and core dump). Here is the script:
from prody import *
import matplotlib as mpl
mpl.use('Agg')
import matplotlib.pylab as plab
import matplotlib.pyplot as plt
import numpy as np
RyR_closed = parsePDB('3J8H_ACEG.pdb')
RyR_closed = RyR_closed.select('protein and name CA')
# compute ANM modes
anm = ANM('RyR closed ANM analysis')
anm.buildHessian(RyR_closed)
anm.calcModes()
# write modes to be represented by VMD
# to be read by parseNMD subroutine
writeNMD('RyR_closed.ANM.nmd', anm, RyR_closed)
quit()
And here is the error:
@> 107796 atoms and 1 coordinate set(s) were parsed in 1.23s.
@> Hessian was built in 65.85s.
/netscr/EX/lsf/killdevil/lsbatch/1420483745.214510: line 8: 12861 Segmentation fault (core dumped) ./prody_anm.CA.py
However, I know that the maximum memory used was only about 29 GB:
Exited with exit code 139.
Resource usage summary:
I see two possible problems (not being a Python programmer):
1. Python's memory allocation for arrays is predetermined to a size below the one needed in the diagonalization process (the Hessian computation itself is extremely fast), or
2. perhaps it is known that the LAPACK/BPACK subroutines used as the diagonalizer (the ones in C) cannot handle an array of this size.
Can anyone help me perform ANM on large systems?
Thank you so much,
Raul