Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new build system #724

Open
wants to merge 24 commits into
base: main
Choose a base branch
from

Conversation

VincentVanlaer
Copy link
Collaborator

Main commit message

Reasons for replacing the previous build system were:

  • the previous build system had a lot of duplication between the modules, both in terms of make files and shell scripts.
  • parallelisation was limited. The new build system parallelises better within a module, and between modules.
  • build scripts and output from the build ended up in the same folder. With the new system, all build output files are stored in the folder build in the repository root, making cleaning build files equivalent to deleting this folder

For the details of the implementation, see the newly added documentation (Developing -> Build system).

Some further notable changes:

  • makedepf90 has been replaced by a perl script that scans the fortran source files. This allows for better flexibility in construction the make files.
  • Currently, pymesa does not work with the new build system, as the new system does not provide a way to build shared libraries for every module.
  • Dependencies are being discovered using pkg-config. This change makes it easier to build MESA without the SDK, as linux distributions typically ship with pkg-config files.

Draft status

Draft status of this PR will be removed once a decision has been mode about #712 and #723, after which I will rebase.

Reviewing this PR

I recommend to first take a test drive of the new system, especially if you use some special setup (e.g. a non-sdk build). Install instructions haven't changed. I have written relatively detailed documentation on how the new system works, which I recommend reviewers of this PR to read before diving into the Makefiles and scripts. Let me know if something is unclear or missing in the documentation! The first few commits are just #712 and #723, so those can be skipped (if #712 and #723 don't end up being merged, it is an easy change to add the files back). The other commits are truly part of this PR, with the first two introducing the new build system. The other commits are porting of individual modules.

Remarks

  • About half of the modules have been ported.
  • There is a lot of output on each run for installed files to $MESA_DIR/lib and $MESA_DIR/include. Once all modules have been ported, this will no longer be needed for a normal install, except for the data files. Internally, the different modules that have been ported don't use those files.
  • I generate version numbers using git describe --tags which generates (in my opinion) nicer version numbers than just using the commit hash. However, this requires the version tags to be on the main branch (currently running this command on main gives r15140-2084-ge29ed74d rather than something like r24.08.1-NNNN-ge29ed74d).
  • It would be beneficial to add GitHub actions for the following cases
    • Building without SDK
    • Building all helper programs which are not built by default
  • As mentioned in the main commit message, pymesa currently doesn't work. This is because I think it would make more sense to build one single shared library after the main build as a separate step, rather than a bunch of little libraries which need to be able to refer to each other. I intend to do this anyway for the output of the star/binary module, such that building the actually star binary can proceed faster, and allow for things such as experimenting with link-time optimization.

Future iterations

I have thought about some future iterations of the build system after this, but I have not implemented those things since otherwise I would keep iterating forever on this build system trying to design the perfect build system. Hence, writing this down to at least have it documented somewhere.

  1. Use a single make invocation to build and test all modules. Currently, each module is built in a separate make invocation, which limits parallelisation and true dependencies between modules (once the mod files have been generated, the dependent MESA modules could start building). Furthermore, if you change something in an include file which is included in another module, make might not immediately pick it up. Combining all modules into one make invocation would prevent such surprises in the development. In order to not cause excessive rebuilds, it would be necessary to use Fortran submodules, such that no public modules are dependent on their implementation. All of this requires a full set of generated make files, as to not overlap the variables and targets used by the different modules.
  2. With such a system in place, it may be beneficial to replace the generated make files by ninja. The main issue with the system from 1. (and in fact the current system as well) this tries to solve, is that if you remove an include file, make will error out. This is because the include file is in the dependencies of the module. From what I understand, by using dynamic dependencies in ninja, it will not error out in such a case. However, this very much untested. In general, I think it is possible to make a build system in this way that exactly compiles what is needed, and doesn't error if cleaning and building wouldn't cause an error as well. Maybe there are easier ways to fix that in the end than using a different command runner.

I have probably forgotten some things I wanted to mention, but this should be most of it.

@VincentVanlaer VincentVanlaer force-pushed the build-system branch 2 times, most recently from c3d879d to 3369f7a Compare August 26, 2024 17:15
@pmocz pmocz self-requested a review August 26, 2024 18:06
@warrickball
Copy link
Contributor

I've just started chipping away at this and started off with the error

mkdir -p build
make/gen-folder-deps "make" const utils math mtx auto_diff num interp_1d interp_2d chem eos forum > build/depend
fatal: No names found, cannot describe anything.
make: *** [make/version.mk:2: data/version_number] Error 128
make: *** Deleting file 'data/version_number'

for both make and ./install, which emanates from git tag. I couldn't for the life of me fetch any tags so I just tagged the HEAD commit on your build-system branch and that got the build going.

I'll see how far I get with the SDK (the build feels faster) and then give it a try with my system libraries, which includes GCC 14.2.1, and then the ifort build on the local cluster. It'll take me at least 2-3 weeks to work through but is my MESA priority.

@VincentVanlaer VincentVanlaer force-pushed the build-system branch 2 times, most recently from 7465ce1 to c483a59 Compare October 21, 2024 21:58
@VincentVanlaer
Copy link
Collaborator Author

I've just started chipping away at this and started off with the error

mkdir -p build
make/gen-folder-deps "make" const utils math mtx auto_diff num interp_1d interp_2d chem eos forum > build/depend
fatal: No names found, cannot describe anything.
make: *** [make/version.mk:2: data/version_number] Error 128
make: *** Deleting file 'data/version_number'

for both make and ./install, which emanates from git tag. I couldn't for the life of me fetch any tags so I just tagged the HEAD commit on your build-system branch and that got the build going.

I noticed this was an issue with CI, but worked locally. I think at some point tags for the releases got moved and renamed, and the main branch therefore contains no tags when you do a clean checkout. It should be fixed now (still have to cleanup the mess I made of the commits for testing)

This commit introduces a new build system (still based on make files).
Reasons for replacing the previous build system were:

- the previous build system had a lot of duplication between the
  modules, both in terms of make files and shell scripts.
- parallelisation was limited. The new build system parallelises better
  within a module, and between modules.
- build scripts and output from the build ended up in the same folder.
  With the new system, all build output files are stored in the folder
  ``build`` in the repository root, making cleaning build files
  equivalent to deleting this folder

For the details of the implementation, see the newly added documentation
(Developing -> Build system).

Some further notable changes:

- makedepf90 has been replaced by a perl script that scans the fortran
  source files. This allows for better flexibility in construction the
  make files.
- Currently, `pymesa` does not work with the new build system, as the
  new system does not provide a way to build shared libraries for every
  module.
- Dependencies are being discovered using pkg-config. This change
  makes it easier to build MESA without the SDK, as linux distributions
  typically ship with pkg-config files.
This commit adds the necessary pkg-config files until the SDK ships
them.
@pmocz
Copy link
Member

pmocz commented Oct 22, 2024

I've just started chipping away at this and started off with the error

mkdir -p build
make/gen-folder-deps "make" const utils math mtx auto_diff num interp_1d interp_2d chem eos forum > build/depend
fatal: No names found, cannot describe anything.
make: *** [make/version.mk:2: data/version_number] Error 128
make: *** Deleting file 'data/version_number'

for both make and ./install, which emanates from git tag. I couldn't for the life of me fetch any tags so I just tagged the HEAD commit on your build-system branch and that got the build going.

I noticed this was an issue with CI, but worked locally. I think at some point tags for the releases got moved and renamed, and the main branch therefore contains no tags when you do a clean checkout. It should be fixed now (still have to cleanup the mess I made of the commits for testing)

I apologize, I should have recognized that error above was due to missing tags. Yes, we unfortunately had to rename tags and move them around to tips of branches get readthedocs to auto-detect the versions, render the right docs, and create banners. Sounds like this caused a headache, but I am glad it is sorted out. At some point I'd like to go back to branch-based docs builds but that will require renaming branches to follow PEP440 standards to get readthedocs to auto-detect versions

@pmocz
Copy link
Member

pmocz commented Oct 22, 2024

Looking good! I am able to install without problems. What is the status and plan for this PR, @VincentVanlaer ?

@VincentVanlaer
Copy link
Collaborator Author

This can be merged as is in principle, since it builds. However, I would not yet merge this to main (maybe to a separate branch), but first:

  • port the remaining modules (I am happy to do that, but if you want have a go at it, please do so 😃 )
  • fix the pymesa issue (this will be easier once all of the modules have been ported)
  • ask for some volunteers on the users list to test on their machines to see if something breaks

I would like to have a review of the new documentation and the system in general. What are show stoppers, things that should be improved, ...

@warrickball
Copy link
Contributor

I'm hopefully getting back on top of enough other things that I've had a look back at some MESA things, and this is my MESA priority. I just checked out out your mesa fork afresh, switched to the build-system branch and ran ./install as usual, which finished successfully. I was then able to run a work folder. Nice! 🎉

For anyone else who tries all this, note that a top-level make doesn't build everything (yet). You still need to run ./install.sh as before.

I'd like to push to get the whole of MESA under the new build before we merge. I don't think it makes sense to have some modules build one way and other modules build another.

The first thing I want to try is to update another module, to see how that process works. I'm not sure if the modules have to be updated in order because of their dependencies but looks like colors is next in install.sh and is a reasonable place to start anyway because it's small. I plan to mimic how chem appears to have been ported (which is mostly deleting scripts 👌).

A lower-priority goal is to see how easy it is to switch away from the SDK as the build backend but I don't think that has to be in this PR.

@VincentVanlaer
Copy link
Collaborator Author

I'm hopefully getting back on top of enough other things that I've had a look back at some MESA things, and this is my MESA priority. I just checked out out your mesa fork afresh, switched to the build-system branch and ran ./install as usual, which finished successfully. I was then able to run a work folder. Nice! 🎉

Awesome 😄

For anyone else who tries all this, note that a top-level make doesn't build everything (yet). You still need to run ./install.sh as before.

I'd like to push to get the whole of MESA under the new build before we merge. I don't think it makes sense to have some modules build one way and other modules build another.

👍

The first thing I want to try is to update another module, to see how that process works. I'm not sure if the modules have to be updated in order because of their dependencies but looks like colors is next in install.sh and is a reasonable place to start anyway because it's small. I plan to mimic how chem appears to have been ported (which is mostly deleting scripts 👌).

Yes, chem should be very similar.

A lower-priority goal is to see how easy it is to switch away from the SDK as the build backend but I don't think that has to be in this PR.

I am doing my local development without the SDK. There are some dependencies which are not readily available in my package manager, but otherwise it works well.

I have also been messing around with the gyre and kap modules, since they are more complicated.

@warrickball
Copy link
Contributor

warrickball commented Nov 13, 2024

I just pulled your branch again and now seem to be hitting an issue in the GYRE build when installing (with ./install or make):

...
make[1]: Leaving directory '/home/wball/mesa/vv-build-system/colors'
make -C gyre BUILD_DIR=build
make[1]: Entering directory '/home/wball/mesa/vv-build-system/gyre'
Package lapack95 was not found in the pkg-config search path.
Perhaps you should add the directory containing `lapack95.pc'
to the PKG_CONFIG_PATH environment variable
Package 'lapack95', required by 'virtual:world', not found
../make/compile-settings-gnu.mk:38: PKG_CONFIG_PATH is /home/wball/mesa/vv-build-system/build/kap/lib/pkgconfig:/home/wball/mesa/vv-build-system/build/gyre/lib/pkgconfig:/home/wball/mesa/vv-build-system/build/colors/lib/pkgconfig:/home/wball/mesa/vv-build-system/build/forum/lib/pkgconfig:/home/wball/mesa/vv-build-system/build/eos/lib/pkgconfig:/home/wball/mesa/vv-build-system/build/chem/lib/pkgconfig:/home/wball/mesa/vv-build-system/build/interp_2d/lib/pkgconfig:/home/wball/mesa/vv-build-system/build/interp_1d/lib/pkgconfig:/home/wball/mesa/vv-build-system/build/num/lib/pkgconfig:/home/wball/mesa/vv-build-system/build/auto_diff/lib/pkgconfig:/home/wball/mesa/vv-build-system/build/mtx/lib/pkgconfig:/home/wball/mesa/vv-build-system/build/math/lib/pkgconfig:/home/wball/mesa/vv-build-system/build/utils/lib/pkgconfig:/home/wball/mesa/vv-build-system/build/const/lib/pkgconfig:../make/pkgconfig:
../make/compile-settings-gnu.mk:38: *** pkg-config failed to find some of hdf5_fortran lapack95 lapack crlibm-fortran mesa-forum mesa-const, check PKG_CONFIG_PATH is correct.  Stop.
make[1]: Leaving directory '/home/wball/mesa/vv-build-system/gyre'
make: *** [build/depend:98: gyre] Error 2

I'm building with MESA SDK 24.7.1 on Fedora 40. I can look into this when I next get a chance, which might be over the weekend. (I compiled on one thread to isolate the error properly.)

@Debraheem
Copy link
Member

🚀

@VincentVanlaer
Copy link
Collaborator Author

I just pulled your branch again and now seem to be hitting an issue in the GYRE build when installing (with ./install or make):

Ah right, that's because I forgot to add a pkg-config file to make/pkgconfig for lapack95 (fortran bindings for lapack) which gyre requires.

@pmocz
Copy link
Member

pmocz commented Nov 13, 2024

For building on Mac, realpath does not support the -s option (in helpers.mk). I can do a brew install coreutils and use grealpath instead. But I am wondering if there is a simpler solution that introduces less dependency / more portable. Maybe we need to write our own shell function?

@VincentVanlaer
Copy link
Collaborator Author

For building on Mac, realpath does not support the -s option (in helpers.mk). I can do a brew install coreutils and use grealpath instead. But I am wondering if there is a simpler solution that introduces less dependency / more portable. Maybe we need to write our own shell function?

It is worth noting that this transformation is not required, it just makes the resulting path a bit cleaner to look at in log messages and such.

@warrickball
Copy link
Contributor

warrickball commented Nov 14, 2024

I now seem to quite consistently run into this error when I run ./install:

...
LIB_TOOL libstar.a
MAKEDEPF90
TEST_COMPILE ../src/run_star_extras.f90
TEST_COMPILE ../../../star/job/run_star.f90
TEST_COMPILE ../src/run.f90
make: *** No rule to make target '../../../lib/libgyre_mesa.a', needed by '../star'.  Stop.

FAILED
...

@VincentVanlaer
Copy link
Collaborator Author

I merged the mesa frontend for GYRE with GYRE itself into a single static library. This also seems to depend on features of ar that are not consistent across linux distros (let alone with another OS). I will probably revert that change and just allow for multiple outputted libraries.

@rhdtownsend
Copy link
Contributor

rhdtownsend commented Nov 14, 2024 via email

@VincentVanlaer
Copy link
Collaborator Author

If you’re doing this sort of thing, be careful that you don’t end up including libforum into whatever libraries you create. There’s some reference counting that goes on in libforum, relating to the HDF5 library, and this breaks if there are multiple copies of libforum that end up in the final executable.

Thanks for the tip! I will double check that, but I think it should be fine. What I intend to do is the same thing MESA does currently, namely building GYRE with FORUM=no and then copying over libgyre_mesa.a and libgyre.a.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants