Fix ScaLAPACK not linked in MPI builds with meson#717
Open
jameskermode wants to merge 3 commits intopublicfrom
Open
Fix ScaLAPACK not linked in MPI builds with meson#717jameskermode wants to merge 3 commits intopublicfrom
jameskermode wants to merge 3 commits intopublicfrom
Conversation
Two issues caused gap_fit to fail with "LA_Matrix_Factorise: cannot factorise" when built with meson and MPI enabled: 1. gap_fit_lib was missing mpi_dep in its dependencies, so libgap_fit.so didn't record ScaLAPACK/MPI as needed libraries. On Linux where --as-needed is the default, the linker dropped ScaLAPACK from the final executable since gap_fit.o doesn't directly reference ScaLAPACK symbols. 2. ScaLAPACK detection had no find_library fallback, unlike OpenBLAS. Environments with broken or missing pkg-config files (conda, HPC modules) would silently produce wrong linker flags, and b_lundef=false masked the resulting undefined symbols. Fixes #715, fixes #716 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This was referenced Mar 30, 2026
The old Makefile build enabled HAVE_QR by default (make config defaulted to 'y'). Without this flag, gap_fit uses the Cholesky factorisation code path (LA_Matrix_Factorise/dpotrf) instead of QR decomposition. The Cholesky path is both numerically less stable (requires positive definiteness) and lacks MPI/ScaLAPACK support, causing failures in both serial and parallel gap_fit runs. Fixes #715, fixes #716 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The missing -DHAVE_QR flag wasn't caught because: 1. Serial tests never checked which solve method was used 2. CI never built with MPI, so all ScaLAPACK tests were skipped Fix both gaps: - Add check_qr_path() assertion to all serial gap_fit tests that perform a fit, verifying "Using LAPACK to solve QR" appears in the log. This would have immediately caught the missing -DHAVE_QR. - Add a build-mpi CI job on ubuntu-latest that installs OpenMPI + ScaLAPACK, builds with -Dmpi=true, and runs the full test suite with HAVE_SCALAPACK=1, enabling the 6 existing ScaLAPACK tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Member
Author
|
@albapa do you want to check this before I merge? |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
mpi_deptogap_fit_libdependencies solibgap_fit.sorecords ScaLAPACK/MPI as needed libraries. On Linux where--as-neededis the default linker behaviour, these were being silently dropped.find_library('scalapack')fallback for ScaLAPACK detection, matching the existing OpenBLAS pattern. Environments with broken or missing pkg-config files (conda, HPC modules) now fall back to a direct library search.Fixes #715
Fixes #716
Test plan
meson setup builddir -Dmpi=true) configures and compiles successfullylibgap_fitlink command inbuild.ninjanow includes ScaLAPACK and MPI flagsldd builddir/src/Programs/gap_fit | grep scalapackshould show the library🤖 Generated with Claude Code