fefgGet (line 224) running out of memory #94
@soichih is this inside a docker container?
What is the size of 'val'? And that of fg.fibers? (See line 224 in c6d514a.)
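For reference, one way to check both from the MATLAB prompt (a minimal sketch; it assumes `fg` is the fiber group loaded with fgRead, with fg.fibers a cell array of 3xN node matrices):

```matlab
% Minimal sketch for checking the sizes asked about above.
dbstop in fefgGet at 224     % pause right where the allocation fails, then inspect 'val'
whos fg                      % memory footprint of the whole structure
fprintf('number of streamlines: %d\n', numel(fg.fibers));
fprintf('nodes in first streamline: %d\n', size(fg.fibers{1}, 2));
```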
Hi @soichih
We are running inside singularity, and singularity itself doesn't impose any memory limit. If we run it on Karst, it should have access to all 32GB available. For app-life, however, we only tell #PBS that we use 24GB max, so the scheduler will kill the job if we are using more than that, but the error message we are seeing is from MATLAB running out of memory.
@soichih I just realized this is an mrtrix3 file. We never tested those files in MATLAB. Is it the case that it is read correctly? There could be something in the header that makes them incompatible with fgRead. I would look into fg.fibers and see how the fibers look and what their size is.
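As a hedged sanity check (assuming the vistasoft convention that fg.fibers is a cell array with one 3xN matrix of node coordinates per streamline), something like this can reveal a header mismatch:

```matlab
% Sanity-check a .tck tractogram after fgRead; the file name is a placeholder.
fg = fgRead('track.tck');
f1 = fg.fibers{1};                    % expected: 3 x (number of nodes)
disp(size(f1));
fprintf('coordinate range: [%.1f, %.1f]\n', min(f1(:)), max(f1(:)));
% Coordinates far outside the brain's bounding box would point to a header problem.
```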
Hi @francopestilli, do you have any update on this issue by any chance? I was trying to run LiFE on a .tck tractogram with 5M streamlines to test the new version
@daducci let me look into this. Which branch of the repo are you using? master?
Yes, master branch in this repo (https://github.com/brain-life/encode). Then we used the
Hi @daducci Take a look at this code: https://github.com/brain-life/encode/blob/master/life/fe/feConnectomeEncoding.m#L29 and: https://github.com/brain-life/encode/blob/master/life/fe/feConnectomeEncoding.m#L58 We provide some educated guesses for the size of the memory that can be allocated and for the size of the batches used to process the streamlines given as input. If the guess is not appropriate for your OS and/or the number of streamlines you are using (much bigger than we normally use), the initialization is likely to fail. You should be able to change the batch size so that each batch fits in memory. Does that make sense?
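For illustration only, the batching Franco describes boils down to something like the sketch below; the actual constants and variable names live in feConnectomeEncoding.m (see the links above), and the ones here are hypothetical:

```matlab
% Hypothetical sketch of the batched encoding loop; lower batchSize if a
% single batch does not fit in RAM.
nFibers   = numel(fg.fibers);
batchSize = 50000;                               % educated guess, tune per machine
nBatches  = ceil(nFibers / batchSize);
for b = 1:nBatches
    idx = (b-1)*batchSize + 1 : min(b*batchSize, nFibers);
    % ... encode streamlines fg.fibers(idx) into the sparse representation ...
end
```

Reducing batchSize only adds loop iterations; the peak allocation per iteration shrinks proportionally.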
Thanks @francopestilli for the clarification.
Thank you both, I really appreciate your help! Indeed, the fitting does not fail; it simply requires about 100GB of RAM and 55h per brain to complete (probably because of swapping, I suspect). The coefficients then look fine. So, if I understood correctly, you suggest reducing the size of the batches so that the amount of memory needed is reduced, am I right?
Hi @daducci correct, it should. But the RAM usage you measure worries me a little; we have never seen such large memory usage. Maybe your streamlines are also sampled at very high resolution (we use 1mm node spacing)? Anyway, reducing the batch size should work, I am just confused by the 100GB! If things are still so crazy (55h), would you feel comfortable sharing the dataset for a local test on our side (dwi + bvecs/bvals and the .tck file)?
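To check the node-spacing point, a quick hedged snippet (again assuming 3xN coordinate matrices in mm):

```matlab
% Estimate the mean node spacing of one streamline (coordinates assumed in mm).
nodes   = fg.fibers{1};
spacing = sqrt(sum(diff(nodes, 1, 2).^2, 1));    % distances between consecutive nodes
fprintf('mean spacing: %.3f mm over %d nodes\n', mean(spacing), size(nodes, 2));
```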
Hi Franco and Cesar, of course, there's no problem sharing the data! We are just running a few more tests to provide you with the correct timings and memory requirements we observe: we fit LiFE to all 3 shells of the HCP dataset, but then saw that in your paper you only use the b=2000 shell, so we are performing this test and will come back to you asap. We also take care of the b0=5 issue. Meanwhile, we have run some other experiments on a smaller tractogram to match the one in your paper (500k streamlines, iFOD2 algorithm, default step size 0.625mm, file size 770MB). The aim was to play with the
As you can see, fitting time and memory requirements do not change, only the construction time does. What are we doing wrong?
Hi @daducci,
Dear Cesar, the test on the 4M streamlines using only the b=2000 shell just finished; here are the outcomes for HCP subject 119833. Do these numbers make sense? If so, that's ok, we are just worried we are doing something wrong. Here is the link to the data in case you want to give it a try.
Dear Alessandro,
Hi @daducci @ccaiafa I have been testing the code with the data Alessandro provided. The code seems to work, but:
@daducci can you please check whether this version of the code runs for you now? Thanks for the feedback; the more people use the code, the better it gets.
We've been seeing a lot of out-of-memory issues with encode lately.
This might be happening because we started feeding in the track.tck output from the mrtrix3 ACT app, which outputs somewhat more fibers, and the track.tck file size is slightly larger (1.69GB).
I will benchmark / profile the memory usage, but if @ccaiafa @francopestilli have any hunch as to what might be causing this issue, please let me know!
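As a starting point for that benchmark, a hedged sketch (the profiler's '-memory' option is undocumented but long-standing in MATLAB; file names are placeholders):

```matlab
% Coarse memory profiling around reading the 1.69GB track.tck file.
profile('-memory', 'on');
fg = fgRead('track.tck');
w  = whos('fg');
fprintf('fg in memory: %.2f GB for %d streamlines\n', w.bytes/1e9, numel(fg.fibers));
% ... run the encoding step here with the usual arguments ...
profile('viewer');                   % per-function time and allocation report
```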