Adapting to the system hwloc situation #73
Comments
I just hit this again. @matthiasdiener, what do you think of the detection solution I proposed?
See illinois-ceesd/mirgecom#169 for another workaround.
Hm. I just saw this again, but I realized that my system does not benefit from illinois-ceesd/mirgecom#169. Among other things, this breaks running the meshmode tests. I don't think it's feasible to apply this workaround universally. This makes me like the hwloc version picking hackery idea more. @matthiasdiener What do you think?
Based on @majosm's experience, this has the potential to cause spurious breakage on Macs and Linux, except the Mac failures are even worse: all you get is a segfault, not even an error message you can search. I don't think I'd consider this fixed, even if the import order can be used to work around it. (@majosm, could you IMO, the emirge install script should make an effort to set things up so as to avoid this. The main wrinkle with the script snippet above is that we need some way of finding the hwloc shared library.
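For illustration, here is a minimal sketch of one way the shared library might be located, using only the standard library's `ctypes.util.find_library`. The function name `candidate_hwloc_paths` and its `extra_dirs` parameter are hypothetical and not part of emirge; which directories to search first (for instance an MPI or spack prefix) is an assumption left to the caller.

```python
import ctypes.util
import os
import sys


def candidate_hwloc_paths(extra_dirs=()):
    """Return possible hwloc library names/paths, most preferred first.

    extra_dirs is a hypothetical hook for directories that should win over
    the default search (e.g. the lib directory of the MPI installation).
    """
    # Unversioned library file name differs between macOS and Linux.
    file_name = "libhwloc.dylib" if sys.platform == "darwin" else "libhwloc.so"

    candidates = []
    for directory in extra_dirs:
        path = os.path.join(directory, file_name)
        if os.path.exists(path):
            candidates.append(path)

    # Fall back to the platform's normal lookup. This returns None when
    # nothing is found; otherwise a name or path usable with
    # ctypes.cdll.LoadLibrary.
    found = ctypes.util.find_library("hwloc")
    if found is not None:
        candidates.append(found)

    return candidates
```

Note that the default lookup is affected by the active environment, so the result can differ before and after a conda env is activated.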
With
Thanks for checking! So it's probably the same bug, just different crashes. I don't think it's plausible that we'll catch all the code where things get installed in the "wrong" order. And I can't reasonably add a warning to PyOpenCL either... @majosm Does the above snippet (or some version of it) successfully detect the "ambient" (MPI) version of hwloc?
Seems like it detects the conda hwloc. When I run:

    import ctypes
    hwloc = ctypes.cdll.LoadLibrary("libhwloc.dylib")
    # https://github.com/open-mpi/hwloc/blob/master/include/hwloc.h
    hwloc.hwloc_get_api_version.restype = ctypes.c_uint
    hwloc.hwloc_get_api_version.argtypes = []
    print(hwloc.hwloc_get_api_version() >> 16)

it prints
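As a hedged sketch of what the shift in that snippet is doing: hwloc's header documents the API version as `(X<<16)+(Y<<8)+Z` for release X.Y.Z, so the major version sits in the upper bits and the minor version in the next byte. The helper names below (`decode_hwloc_api_version`, `query_hwloc_api_version`) are made up for illustration, and the library path is supplied by the caller rather than assumed.

```python
import ctypes


def decode_hwloc_api_version(api_version):
    """Split an hwloc API version number into (major, minor)."""
    major = api_version >> 16
    minor = (api_version >> 8) & 0xFF
    return major, minor


def query_hwloc_api_version(library_path):
    """Load the hwloc library at library_path and return (major, minor).

    Raises OSError if the library cannot be loaded.
    """
    hwloc = ctypes.cdll.LoadLibrary(library_path)
    hwloc.hwloc_get_api_version.restype = ctypes.c_uint
    hwloc.hwloc_get_api_version.argtypes = []
    return decode_hwloc_api_version(hwloc.hwloc_get_api_version())
```

For example, `decode_hwloc_api_version(0x00020000)` gives `(2, 0)`.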
But what does it do prior to any conda env being active? The whole point would be for it to detect what hwloc exists in the environment and adapt to that, to avoid conflicts.
After activating conda and prior to
Huh. I'm guessing that's the system Python not allowing you to load dynamic libraries? That's annoying.
I guess that's not a big issue: We could conceivably just run that code in the conda base environment (right after miniconda completes). FWIW, "2" is the answer we want in that case, right?
Unfortunately not. 🙁 System hwloc is v1. Edit: It's actually spack hwloc. Let me try the last experiment again, I might not have had my spack packages loaded up when I tried. Edit 2:
Despite the spack hwloc being v2?
Spack hwloc is v1 (my MPI is installed via spack and uses that version). conda is v2.
As another (easier?) workaround, could we default to installing
AFAIK, we're not installing any MPI implementation ATM, and that seems like the right approach. (i.e. use whatever
I mean, recommending people to install Edit:
I think this issue only happens if
I'm not sure we have reliable control over what gets loaded first though.
It seems that we have two possible situations that can give us the annoying:
As of #71, we default to installing libhwloc 1, but that's also not a safe default. Is there something we can do to automate installing the correct libhwloc?
This snippet will get the current hwloc version, using either Python 2 or 3:
I'm just not sure that using this is a great idea...
cc @matthiasdiener
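For illustration only (this is not the snippet referred to above): once the ambient hwloc major version is known, choosing what to install could be as small as the sketch below. The function name `conda_hwloc_spec` is hypothetical, the `libhwloc` package name is taken from the discussion in this issue, and falling back to version 1 just mirrors the current default from #71 rather than being a recommendation.

```python
def conda_hwloc_spec(ambient_major, default_major=1):
    """Return a conda spec string matching the ambient hwloc major version.

    ambient_major is the detected major version, or None when detection
    failed (for example because the library could not be loaded).
    """
    major = default_major if ambient_major is None else ambient_major
    # str.format keeps this usable from both Python 2 and Python 3.
    return "libhwloc={}".format(major)
```

An install script could pass the result to `conda install`, with detection run before any conda env that ships its own hwloc is activated.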