BUG: Re-evaluate `BatchSimulate` backends #955

asoplata · 2024-11-26T21:17:46Z

Update:

It turns out that BatchSimulate does not currently work in its entirety, for most of its listed available backends. This situation needs to be addressed before we improve BatchSimulate's testing (the original point of this issue). On my machine, if I execute our BatchSimulate example ( https://github.com/jonescompneurolab/hnn-core/blob/master/examples/howto/plot_batch_simulate.py ) but change the backends, I get the following:

backend='loky': Succeeds. The default, and the only backend that works out-of-the-box.
backend='dask': Partially succeeds, but needs considerable code changes in order to do so (which are currently undocumented). The usage in this PR ( https://github.com/jonescompneurolab/hnn-core/pull/1009/files#diff-257f4367bd17cccc84c603209300a970cc3f42f629fb2e3048e7d88e4afaf6fbR357-R384 ) works for me, but:
1. it requires a pip install "dask[distributed]" because we have no setup.py recipe for installing Dask,
2. it requires a different usage of the API (calling different functions) than in our Example script, and
3. it requires wrapping the actual simulation call with a setup Dask.Client object.
backend='multiprocessing': Fails. I get an endless series of RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase and it does not appear to start the simulations at all. You can easily test this one by running ( https://github.com/jonescompneurolab/hnn-core/blob/master/examples/howto/plot_batch_simulate.py ) and merely changing the backend.
backend='threading': Fails. At least one simulation is successfully begun, and I'm able to see some Trial 1: 30.0 ms... etc. output, but the program quickly crashes with a segfault. I'm unfamiliar with the threading module of the standard-library, including whether it depends on hardware-threading, which I would assume it does not. I think use of this library should still be usable on my Apple M4 chip (which does not support simultaneous multi-threading aka hardware-threading, as opposed to software-threading), but I could be wrong about that.

In the meantime, it may be prudent to change the documentation for BatchSimulate such that it indicates that only loky is currently fully supported. We could then re-add documentation about the other backends in future versions once we know they work and test. This is relevant for our public API, which BatchSimulate also needs to be added to, which it wasn't when it was added: it is listed in our Examples ( https://jonescompneurolab.github.io/hnn-core/stable/auto_examples/howto/plot_batch_simulate.html#sphx-glr-auto-examples-howto-plot-batch-simulate-py ) but currently NOT listed in our public API ( https://jonescompneurolab.github.io/hnn-core/stable/api.html ).

Original Issue follows:

Currently, testing of BatchSimulate is very incomplete.

Importantly, as @ntolley pointed out here ENH: Refactor BatchSimulate Example and Improve Documentation #857 (comment) , we need a way to test that the loky backend not only creates multiple workers, but actually uses the multiple parallel workers, in parallel. I think we may have to actually use loky directly in our tests for this (see ENH: Refactor BatchSimulate Example and Improve Documentation #857 (comment) ).
The threading backend only has a single test, the multiprocessing and dask backends are not tested at all.
Much of the testing is parameter TypeError testing, which is not testing BatchSimulate's functionality so much as it is testing _validate_type and _check_option. I hate to say it, but I think many of these tests can be removed. Relatedly, I will make an issue for introducing typing and type-checking for the codebase as a whole. I'm fine with leaving these tests in until we get more formal type-checking implemented, however.
Relatedly, we should consider dropping the 'multiprocessing' backend as an option, since it has quirks in Joblib and, judging from joblib's documentation, is effectively replaced by loky (see https://joblib.readthedocs.io/en/stable/parallel.html#old-multiprocessing-backend ).

The text was updated successfully, but these errors were encountered:

samadpls · 2024-11-26T21:33:14Z

Hi @asoplata , I agree with everything you've mentioned regarding the testing and using loky. I didn't add the Dask backend for testing, as this repo doesn't currently include it as a dependency, but if needed, it can be installed separately

asoplata · 2024-11-26T21:34:43Z

It's arguably hacky, but we could also try using psutil to track processes with specific names in our testing, ensuring that the appropriate number are created, etc.

asoplata · 2024-11-26T21:36:22Z

Hi @asoplata , I agree with everything you've mentioned regarding the testing and using loky. I didn't add the Dask backend for testing, as this repo doesn't currently include it as a dependency, but if needed, it can be installed separately

Thanks @samadpls , yes most of these BatchSimulation test refactors are TODO tasks that existed before your PR. We definitely appreciate your contributions!

samadpls · 2024-11-26T21:41:37Z

Thank you for the clarification! I will ensure it is useful and maintainable for future development. If you need any changes, please let me know. I'm happy to improve the structure and functionality.

Freedisch · 2025-03-05T05:24:51Z

This sound like a good issue to start with, I'll look into

asoplata · 2025-03-14T20:37:10Z

@ntolley @dylansdaniels I've updated the body of this issue to reflect that there are currently undocumented issues and bugs with the BatchSimulate backends. We should discuss this on Monday, including changing the current BatchSimulate documentation such that, for the time being, only the loky backend is mentioned, since it's the only one that works out of the box and is tested.

asoplata added parallelization Parallel processing testing labels Nov 26, 2024

asoplata added this to the 0.5 milestone Nov 26, 2024

asoplata mentioned this issue Nov 26, 2024

ENH: Refactor BatchSimulate Example and Improve Documentation #857

Merged

github-actions bot mentioned this issue Dec 1, 2024

Monthly issue metrics report #958

Closed

asoplata added the good first issue Good for newcomers label Jan 14, 2025

Freedisch linked a pull request Mar 12, 2025 that will close this issue

[WIP]: test: improve BatchSimulate test cases #1009

Open

asoplata removed the good first issue Good for newcomers label Mar 14, 2025

asoplata changed the title ~~TEST: Improve BatchSimulate testing~~ BUG: Re-evaluate BatchSimulate backends Mar 14, 2025

asoplata mentioned this issue Mar 18, 2025

[MRG]: Update and include BatchSimulate documentation #1019

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: Re-evaluate `BatchSimulate` backends #955

BUG: Re-evaluate `BatchSimulate` backends #955

asoplata commented Nov 26, 2024 •

edited

Loading

samadpls commented Nov 26, 2024

asoplata commented Nov 26, 2024

asoplata commented Nov 26, 2024

samadpls commented Nov 26, 2024

Freedisch commented Mar 5, 2025 •

edited

Loading

asoplata commented Mar 14, 2025 •

edited

Loading

BUG: Re-evaluate BatchSimulate backends #955

BUG: Re-evaluate BatchSimulate backends #955

Comments

asoplata commented Nov 26, 2024 • edited Loading

samadpls commented Nov 26, 2024

asoplata commented Nov 26, 2024

asoplata commented Nov 26, 2024

samadpls commented Nov 26, 2024

Freedisch commented Mar 5, 2025 • edited Loading

asoplata commented Mar 14, 2025 • edited Loading

BUG: Re-evaluate `BatchSimulate` backends #955

BUG: Re-evaluate `BatchSimulate` backends #955

asoplata commented Nov 26, 2024 •

edited

Loading

Freedisch commented Mar 5, 2025 •

edited

Loading

asoplata commented Mar 14, 2025 •

edited

Loading