conda version of hifiasm slower than manually built version #392
Comments
I'm not sure what is causing this. Could you please share the 10Mb dataset with us? I will try to reproduce the issue.
Just sent you an email with a link to a 10Mb region of HG005.
Thanks a lot!
quay.io/biocontainers/hifiasm:0.18.5--h5b5514e_0 is 5-fold slower than version 0.15.2 compiled outside the container when both are used to assemble a 2.5 Gb plant genome.
@xzhoubayer Is it a public dataset? Can I reproduce this issue on my side?
It is not a public dataset, so I cannot share it. I think there are some public datasets of corn genomes.
0.18.5 should not be slower than 0.15.2, right?
It might be a conda issue rather than a problem with hifiasm itself.
The conda version of hifiasm is significantly slower than a manually built version; see chhylp123/hifiasm#392.
Hi @gconcepcion, I'm wondering whether previous conda versions of hifiasm are also several times slower?
I switched from building my own docker with hifiasm to using the conda hifiasm sometime between July and October 2020 (I think you added the bioconda recipe around this time), and I've been using the conda version regularly since then. Based on the ~500 human samples I've assembled with the conda build, my expectation has always been roughly 18 to 36 hours on 48 threads for 20-30x depth, and I haven't really seen that change.
I personally never noticed the issue because I always build hifiasm manually and it's always very fast. @williamrowell and I discovered this issue recently because he has been telling me for the past two years how slow it was in his human-wgs pipeline (which was contrary to my experience), so I finally decided to dig in to figure out what was going on and realized there is some unknown discrepancy between the conda build and a manually built version.
I ran a small test sample using the conda-based and a manually built version of hifiasm (version 0.15.5).
Using a manually built version of hifiasm:
The manually built version ran ~2.5x faster.
I see. Thanks a lot. It is probably an issue with the conda recipe. I will fix it as soon as possible.
How do you all compile hifiasm? The build script doesn't look too fancy to me: https://github.com/bioconda/bioconda-recipes/blob/master/recipes/hifiasm/build.sh and I'm wondering what could go wrong here.
We're seeing this across a lot of different build environments, but we're all just running something like the basic instructions in @chhylp123's repo under Getting Started:
or something equivalent like the following from @hkeward's Dockerfile:
@bgruening The
Do you have any idea about this issue?
This means zlib is not found, and I guess this is because you are overwriting CXXFLAGS in your Makefile. I guess you can make it work by using something like
Thanks a lot. If I understand correctly, writing the bioconda recipe in this way will not overwrite CXXFLAGS within the Makefile:
@chhylp123 have you seen this: #402? Please feel free to open a bioconda PR, then I can mess around with it.
@gconcepcion @williamrowell can you maybe try version 0.18.8 from bioconda?
Yes, I can confirm that 0.18.8 from bioconda now performs as expected:
Thanks for the fix everyone!
Very cool, thanks for testing!
Thanks all for the great help! But I still don't understand why my commit didn't work. Could you please explain more (see: #402 (comment))? Thank you in advance.
@bgruening My question is about the behavior mentioned in weidai11/cryptopp#525: if users set CXXFLAGS on the command line, will GNU make override CXXFLAGS regardless of whether it is hardcoded in the Makefile? I guess GNU make will override CXXFLAGS either way?
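For reference, here is a minimal sketch of the GNU make precedence rules behind this question. It assumes GNU make is available on PATH; the toy Makefile and the `-O3` / `-O0` values are hypothetical and are not hifiasm's real Makefile or the bioconda recipe. In GNU make, a plain `CXXFLAGS = ...` assignment inside a Makefile takes precedence over a value inherited from the environment (unless `make -e` is used), while a `CXXFLAGS=...` argument on the make command line takes precedence over the Makefile assignment (unless the Makefile uses the `override` directive).

```python
import os
import shutil
import subprocess
import tempfile

# Hypothetical toy Makefile: it hardcodes CXXFLAGS and prints the final value.
MAKEFILE = 'CXXFLAGS = -O3\nshow:\n\t@echo "$(CXXFLAGS)"\n'


def show_cxxflags(extra_args=(), extra_env=None):
    """Run the toy Makefile and return the CXXFLAGS value make ends up with.

    Returns None when no `make` executable is found on PATH.
    """
    if shutil.which("make") is None:
        return None
    # Start from a clean slate so inherited make settings cannot skew the demo.
    skip = {"MAKEFLAGS", "MFLAGS", "MAKEFILES", "CXXFLAGS"}
    env = {k: v for k, v in os.environ.items() if k not in skip}
    env.update(extra_env or {})
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "Makefile")
        with open(path, "w") as fh:
            fh.write(MAKEFILE)
        out = subprocess.run(
            ["make", "-s", "-f", path, "show", *extra_args],
            env=env, capture_output=True, text=True, check=True,
        )
    return out.stdout.strip()


# CXXFLAGS exported in the environment (roughly what a build environment does).
from_env = show_cxxflags(extra_env={"CXXFLAGS": "-O0"})
# CXXFLAGS passed as a command-line argument to make.
from_cli = show_cxxflags(extra_args=["CXXFLAGS=-O0"])

print("environment CXXFLAGS=-O0  ->", from_env)
print("command-line CXXFLAGS=-O0 ->", from_cli)
```

With GNU make, the first call should report the Makefile's own value and the second the command-line value, so whether a hardcoded assignment survives depends on how the build script passes the flags.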
- Older hifiasm versions _built by conda_ are slower than those same versions built outside of conda (chhylp123/hifiasm#392).
- This is a problem with the conda build process.
- Fixed recently (chhylp123/hifiasm#392).
Hi Haoyu,
We've noticed internally that running a version of hifiasm compiled outside of conda with gcc/11.1.0 or gcc/11.3.0 runs significantly faster in terms of CPU time than a version of hifiasm installed / compiled using conda.
For a small 10Mb test dataset, the manually built version is roughly 3-4 times faster than the conda-built version.
hifiasm compiled w/ gcc/11.3.0 :
[M::main] Real time: 121.909 sec; CPU: 4144.494 sec; Peak RSS: 16.353 GB
hifiasm installed / compiled using conda:
[M::main] Real time: 393.322 sec; CPU: 15529.166 sec; Peak RSS: 16.353 GB
The time differential is even more significant when running a full human sized dataset.
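To put numbers on the two runs above, here is a small sketch that parses the `[M::main]` summary lines quoted in this post and computes the slowdown. The regular expression and the helper name `parse_summary` are my own assumptions based only on the two lines shown here, not an official hifiasm log format specification.

```python
import re

# Pattern inferred from the two summary lines quoted above (an assumption).
SUMMARY = re.compile(
    r"Real time: ([\d.]+) sec; CPU: ([\d.]+) sec; Peak RSS: ([\d.]+) GB"
)


def parse_summary(line):
    """Extract wall-clock seconds, CPU seconds, and peak RSS (GB) from a summary line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError(f"not a recognised summary line: {line!r}")
    real, cpu, rss = (float(value) for value in match.groups())
    return {"real_sec": real, "cpu_sec": cpu, "peak_rss_gb": rss}


manual = parse_summary(
    "[M::main] Real time: 121.909 sec; CPU: 4144.494 sec; Peak RSS: 16.353 GB"
)
conda = parse_summary(
    "[M::main] Real time: 393.322 sec; CPU: 15529.166 sec; Peak RSS: 16.353 GB"
)

# Ratio > 1 means the conda build is slower than the manual build.
wall_ratio = conda["real_sec"] / manual["real_sec"]
cpu_ratio = conda["cpu_sec"] / manual["cpu_sec"]

print(f"wall-clock slowdown: {wall_ratio:.2f}x")
print(f"CPU-time slowdown:   {cpu_ratio:.2f}x")
print(f"same peak RSS: {conda['peak_rss_gb'] == manual['peak_rss_gb']}")
```

For these two lines the ratios come out at roughly 3.2x for wall-clock time and 3.7x for CPU time, with identical peak memory, which matches the 3-4x figure above.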
Can you think of anything about the conda installed version that would result in a binary that takes longer to compute the assembly graph? Could this have something to do with compiler flags?
Thanks!