Error when trying to setBackend from MsBackendMzR to MsBackendDataFrame #4

Open
gmhhope opened this issue Apr 26, 2021 · 3 comments

@gmhhope
gmhhope commented Apr 26, 2021

Hi Johannes,

I also want to report an issue that I hit when I loaded my own data and called setBackend to switch from MsBackendMzR to MsBackendDataFrame. It produces the following error.

#' Change backend to a MsBackendDataFrame: load data into memory
sps_all <- setBackend(sps_all, MsBackendDataFrame())

Error messages:

3 parallel jobs did not deliver results
1 parallel job did not deliver a result
Error in result[[njob]] <- value : 
  attempt to select less than one element in OneIndex

I am not sure whether the size matters. I have 19 DDA runs, and the total size using the MsBackendMzR backend is as follows:

print(object.size(sps_all), units = "MB")
13.5 Mb

Thanks,
Minghao Gong

@jorainer
Owner

Are you running the code in Docker? If possible, I would suggest running this in a normal R session, which avoids any problems related to file paths etc. that differ between the Docker container and your local file system. Also, depending on your Docker configuration, the Docker process may run out of memory. If you want to debug this error, it may help to call register(SerialParam()) before the call above. That disables parallel processing, and the error message might be a little more self-explanatory. The error message you got is from BiocParallel and is not really helpful. It just means that something went wrong, which could be out-of-memory or any other error.
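A minimal sketch of that debugging step, under some assumptions: the example mzML files shipped with the msdata package stand in for the 19 DDA runs from the report (that package and the `fls` variable are not part of the original question), and Spectra, BiocParallel and msdata are installed.

```r
library(Spectra)
library(BiocParallel)

# Register serial processing first, so that everything below runs in the
# main R session and errors are reported directly rather than through a
# parallel worker.
register(SerialParam())

# Assumption: example files from the msdata package replace the user's own
# data files.
fls <- dir(system.file("sciex", package = "msdata"), full.names = TRUE)

# On-disk representation: peak data stay in the mzML files.
sps_all <- Spectra(fls, source = MsBackendMzR())

# Change backend to a MsBackendDataFrame: load data into memory.
sps_all <- setBackend(sps_all, MsBackendDataFrame())
```

If the conversion still fails with serial processing, the error shown should come from the failing step itself, which is usually easier to trace back to a file path or memory problem.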

Also, if you have many (large) files it might be better to keep everything on-disk instead of in-memory - performance should still be OK, so there is no actual need to load the data into memory by using an MsBackendDataFrame.
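As a rough illustration of working on-disk, the sketch below keeps the MsBackendMzR backend and only reads peak data for the spectra that are actually requested. It assumes the example files from the msdata package in place of the user's own runs; `fls` and `sps_ms1` are names chosen for this example.

```r
library(Spectra)

# Assumption: example mzML files from the msdata package stand in for the
# user's DDA runs.
fls <- dir(system.file("sciex", package = "msdata"), full.names = TRUE)
sps_all <- Spectra(fls, source = MsBackendMzR())

# Only general spectra variables are held in memory, so the object stays
# small regardless of how many peaks the files contain.
print(object.size(sps_all), units = "MB")

# Spectra variables can be queried without touching the peak data.
table(msLevel(sps_all))

# Filtering and subsetting work on the on-disk object; m/z values are read
# from the files only for the three spectra requested here.
sps_ms1 <- filterMsLevel(sps_all, 1L)
mzs <- mz(sps_ms1[1:3])
lengths(mzs)
```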

@gmhhope
Author

gmhhope commented Apr 26, 2021

> Are you running the code in Docker? If possible, I would suggest running this in a normal R session, which avoids any problems related to file paths etc. that differ between the Docker container and your local file system. Also, depending on your Docker configuration, the Docker process may run out of memory. If you want to debug this error, it may help to call register(SerialParam()) before the call above. That disables parallel processing, and the error message might be a little more self-explanatory. The error message you got is from BiocParallel and is not really helpful. It just means that something went wrong, which could be out-of-memory or any other error.

I am quite new to Docker, so I will learn from the hints you provided. But I do see the benefit of using Docker to run the process, since I don't need to do any manual configuration.

So I think it would still be great if this can be run in Docker. If not, I will try to set up Spectra in my local R.

> Also, if you have many (large) files it might be better to keep everything on-disk instead of in-memory - performance should still be OK, so there is no actual need to load the data into memory by using an MsBackendDataFrame.

Thank you for the QA!

Best,
Minghao

@jorainer
Owner

Yes, Docker is very nice. Especially on Linux, the performance is about the same as if you ran R natively. On Windows and macOS it is different because Docker uses a virtual machine, so in essence Docker runs in a Linux system within a virtual machine on these operating systems. That's why Docker is not ideal on Windows and macOS. Note that on those operating systems you can also specify the memory and number of CPUs etc. that Docker (or actually the virtual machine) will be able to use. You may increase these if you experience memory issues.
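To see which resources the R session inside the container can actually use, something like the following sketch may help. It uses base R only; the variable names are chosen for this example, and reading /proc/meminfo is an assumption that holds on Linux (including Linux containers). Note that a memory limit set on an individual container on a Linux host may not be reflected in /proc/meminfo, so the value is only a first indication.

```r
# Number of CPU cores visible to this R session (can be NA on some platforms).
n_cores <- parallel::detectCores()

# Total memory visible to the session, read from /proc/meminfo if available.
total_mem_gb <- NA_real_
if (file.exists("/proc/meminfo")) {
    meminfo <- readLines("/proc/meminfo")
    mem_line <- grep("^MemTotal:", meminfo, value = TRUE)[1]
    # The line has the form "MemTotal:   <number> kB".
    mem_kb <- as.numeric(sub("^MemTotal:[[:space:]]+([0-9]+) kB$", "\\1",
                             mem_line))
    total_mem_gb <- mem_kb / 1024^2
}

message("Cores: ", n_cores, ", total memory: ", round(total_mem_gb, 1), " GB")
```

If the reported memory is small compared to the size of the data files, raising the limits of the Docker virtual machine, or keeping the data on-disk, would be the options mentioned above.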
