For ~200 bacterial genomes of ~6 Mb each, the neighborhood-based paralog splitting step alone is taking over 24 hours on a c5.2xlarge EC2 instance, while the previous steps finished in a timely fashion. Notably, CPU usage for the entire period is very low (less than 1%), while memory usage remains fairly constant at 40%, indicating that the process is bottlenecked on something other than the CPU.
Hi, thank you for the report. This is certainly much slower than in my tests. Based on your description, the bottleneck is most likely in the I/O.
PEPPA reads and writes a lot of data on the file system. This was not an issue in my tests, even when I used a mounted network drive, but I have not yet tested it on an AWS instance. I have updated PEPPA slightly to optimize its I/O performance. However, please do not expect too much.
Thanks, I appreciate the prompt support. Perhaps you could add some sort of debugging capability so that the issue can be isolated? I'm not eager to run something for hours and not get an answer.
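In the meantime, one rough way to isolate this kind of stall is to compare wall-clock time against CPU time for each step: if a step runs for a long time but accumulates almost no CPU time, it is waiting on something (disk, network, a lock) rather than computing. The sketch below is not part of PEPPA; `step_timer` and the step labels are hypothetical names used for illustration, and `time.process_time()` only counts the current Python process, so work done in child processes would not show up in the CPU figure.

```python
import time
from contextlib import contextmanager


@contextmanager
def step_timer(label, report):
    """Record wall-clock and CPU seconds spent inside the `with` block.

    Hypothetical helper, not a PEPPA API. CPU time covers only the
    current process (child processes are not included).
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield
    finally:
        wall = time.perf_counter() - wall_start
        cpu = time.process_time() - cpu_start
        report[label] = {
            "wall_s": wall,
            "cpu_s": cpu,
            # Near 0: mostly waiting. Near 1: mostly computing.
            "cpu_fraction": cpu / wall if wall > 0 else 0.0,
        }


report = {}

# Stand-in for a step that mostly waits (e.g. on slow storage).
with step_timer("waiting_step", report):
    time.sleep(0.1)

# Stand-in for a step that mostly computes.
with step_timer("compute_step", report):
    sum(i * i for i in range(500_000))

for label, stats in report.items():
    print(f"{label}: wall={stats['wall_s']:.3f}s "
          f"cpu={stats['cpu_s']:.3f}s "
          f"cpu_fraction={stats['cpu_fraction']:.2f}")
```

Wrapping the paralog splitting step this way and logging the result periodically would at least confirm whether the time is going to waiting rather than computation, without having to run for hours first.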