Unable to build GDASApp on Cactus following the system upgrade #3100

Open
RussTreadon-NOAA opened this issue Nov 14, 2024 · 12 comments
Labels
bug Something isn't working

Comments

@RussTreadon-NOAA
Contributor

What is wrong?

Attempts to build GDASApp develop on Cactus fail during the configuration step

What should have happened?

Before the system upgrade, GDASApp develop built without error on Cactus. GDASApp develop still builds on Dogwood without error. Dogwood is scheduled to be upgraded 11/19-21/2024.

What machines are impacted?

WCOSS2

What global-workflow hash are you using?

57c8aa3

Steps to reproduce

  1. make a work directory, $WRKDIR
  2. cd $WRKDIR
  3. git clone --recursive https://github.com/NOAA-EMC/global-workflow .
  4. cd sorc
  5. ./build_all.sh -uk
  6. build_all.sh will fail with a build_gdas error
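
The steps above can be collected into a short script. This is only a sketch, not part of the original report: the `WRKDIR` default, the `DRY_RUN` switch, and the `run` helper are assumptions added for illustration. With `DRY_RUN=1` (the default here) the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch of the reproduction steps listed above.
# Assumptions (not in the original report): the WRKDIR default, DRY_RUN, run().
set -eu

WRKDIR="${WRKDIR:-$PWD/gw-repro}"   # hypothetical work directory
DRY_RUN="${DRY_RUN:-1}"             # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

reproduce() {
  run mkdir -p "$WRKDIR"
  run cd "$WRKDIR"
  run git clone --recursive https://github.com/NOAA-EMC/global-workflow .
  run cd sorc
  run ./build_all.sh -uk   # reported to fail with a build_gdas error on Cactus
}

reproduce
```

Setting `DRY_RUN=0` would run the clone and build for real, which is only meaningful on the affected WCOSS2 machine.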

Additional information

  1. Ticket #2024111410000051 has been submitted to the WCOSS2 helpdesk to report this problem.
  2. cloning the current head of GDASApp develop encounters the same build problem - see GDASApp issue #1376.

Do you have a proposed solution?

For an example of the steps to reproduce, see /lfs/h2/emc/da/noscrub/russ.treadon/git/global-workflow/develop/sorc. Running sorc/build_all.sh -gk there returns

Creating logs folder
Creating /lfs/h2/emc/da/noscrub/russ.treadon/git/global-workflow/develop/exec folder
Resetting modules to system default. Reseting $MODULEPATH back to system default. All extra directories will be removed from $MODULEPATH.
Building ufs, gfs_utils, gdas, ww3prepost, ufs_utils, gsi_utils, gsi_monitor, upp
Starting build_ufs.sh
Starting build_gfs_utils.sh
Starting build_gdas.sh
Starting build_ww3prepost.sh
Starting build_ufs_utils.sh
Starting build_gsi_utils.sh
Starting build_gsi_monitor.sh
Starting build_upp.sh
build_gfs_utils.sh completed successfully!
build_gsi_monitor.sh completed successfully!
build_gdas.sh failed!  Exiting!
Check logs/build_gdas.log for details.

build_gdas.log ends with

-- ---------------------------------------------------------
-- Adding bundle project oops
-- ---------------------------------------------------------
-- [oops] (1.10.0)
-- Feature TESTS enabled
CMake Error at /apps/spack/cmake/3.20.2/intel/19.1.3.304/utnbptm3hrf7gppztidueu4jogfgemut/share/cmake-3.20/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find OpenMP_CXX (missing: OpenMP_CXX_FLAGS OpenMP_CXX_LIB_NAMES)
Call Stack (most recent call first):
  /apps/spack/cmake/3.20.2/intel/19.1.3.304/utnbptm3hrf7gppztidueu4jogfgemut/share/cmake-3.20/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
  /apps/spack/cmake/3.20.2/intel/19.1.3.304/utnbptm3hrf7gppztidueu4jogfgemut/share/cmake-3.20/Modules/FindOpenMP.cmake:542 (find_package_handle_standard_args)
  oops/CMakeLists.txt:56 (find_package)


-- Configuring incomplete, errors occurred!
See also "/lfs/h2/emc/da/noscrub/russ.treadon/git/global-workflow/develop/sorc/gdas.cd/build/CMakeFiles/CMakeOutput.log".
See also "/lfs/h2/emc/da/noscrub/russ.treadon/git/global-workflow/develop/sorc/gdas.cd/build/CMakeFiles/CMakeError.log".
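
When the build log is long, the failing CMake message can be pulled out with `grep`. The helper below is a hypothetical convenience, not something from the workflow itself; it assumes a `grep` that supports `-A` (GNU and BSD grep both do) and that the message text sits on the line after `CMake Error`, as in the excerpt above.

```shell
#!/bin/sh
# Sketch: show each "CMake Error" line and the line that follows it.
# show_cmake_errors is a hypothetical helper name, not part of global-workflow.
show_cmake_errors() {
  log="$1"
  # -n prints line numbers; -A 1 adds the next line, which holds the message.
  grep -n -A 1 '^CMake Error' "$log"
}

# Example call, only attempted if the log named in the build output exists.
if [ -f logs/build_gdas.log ]; then
  show_cmake_errors logs/build_gdas.log
fi
```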
@RussTreadon-NOAA added the bug and triage labels Nov 14, 2024
@RussTreadon-NOAA
Contributor Author

Attention @aerorahul , @WalterKolczynski-NOAA , and @KateFriedman-NOAA .

If you want to build g-w on Cactus do not include GDASApp in the build. build_all.sh -uk fails.

@AndrewEichmann-NOAA
Contributor

AndrewEichmann-NOAA commented Nov 15, 2024

@RussTreadon-NOAA Apologies if this confuses things, but yesterday on Cactus I successfully built and semi-successfully ran gw-ci with a freshly pulled, submodule-updated develop/develop. I think I did have problems earlier, so maybe I stumbled on some secret sauce.

@RussTreadon-NOAA
Contributor Author

This is a useful data point @AndrewEichmann-NOAA . Where may I find your install and build on Cactus?

@AndrewEichmann-NOAA
Contributor

This is a useful data point @AndrewEichmann-NOAA . Where may I find your install and build on Cactus?

Cactus has switched to production. I'll try to get an instance of GDASApp running before the upgrade on Dogwood.

@RussTreadon-NOAA
Contributor Author

I'm currently running g-w DA CI on Dogwood for PR #2992 with GDASApp feature/resume_nightly for sorc/gdas.cd. The build successfully completed. I expect some jobs to fail based on past experience. Certain GDASApp executables need to be built with spack-stack.

@AndrewEichmann-NOAA
Contributor

I built on Dogwood and it failed on the bmat task, same as before.

@RussTreadon-NOAA
Contributor Author

Yes, same failure for me. Building GDASApp with spack-stack gets us past this failure. Aerosol and Snow DA also fail with the hpc-stack build but run with spack-stack. WCOSS2 spack-stack is not yet ready for general use. It's still being tested and evaluated.

@AndrewEichmann-NOAA
Contributor

@RussTreadon-NOAA It seems develop will be broken everywhere until some PRs get merged. Is there anything I can pick up with this for now?

@RussTreadon-NOAA
Contributor Author

@RussTreadon-NOAA It seems develop will be broken everywhere until some PRs get merged. Is there anything I can pick up with this for now?

We need to wait for the official installation of spack-stack on WCOSS2. Given issues with the ongoing upgrade I don't think spack-stack installation on WCOSS2 will happen soon.

@RussTreadon-NOAA
Contributor Author

We use hpc-stack to build GDASApp on WCOSS2. I replaced hpc-stack with spack-stack/1.6.0: the Cactus build of GDASApp develop using spack-stack/1.6.0 runs to completion with all executables created. I am unable to run the executables because Cactus is currently the production machine.

WCOSS2 Ticket #2024111410000051 updated with this information.

@WalterKolczynski-NOAA removed the triage label Nov 19, 2024
@AndrewEichmann-NOAA self-assigned this Dec 10, 2024
@RussTreadon-NOAA
Contributor Author

Given the reminder from @WalterKolczynski-NOAA in g-w PR #3186, I confirmed that GDASApp does NOT build on Dogwood following the system upgrade. Whether we build from the command line or via a batch script, the build fails with

-- ---------------------------------------------------------
-- Adding bundle project oops
-- ---------------------------------------------------------
-- [oops] (1.10.0)
-- Feature TESTS enabled
CMake Error at /apps/spack/cmake/3.20.2/intel/19.1.3.304/utnbptm3hrf7gppztidueu4jogfgemut/share/cmake-3.20/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find OpenMP_CXX (missing: OpenMP_CXX_FLAGS OpenMP_CXX_LIB_NAMES)
Call Stack (most recent call first):
  /apps/spack/cmake/3.20.2/intel/19.1.3.304/utnbptm3hrf7gppztidueu4jogfgemut/share/cmake-3.20/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
  /apps/spack/cmake/3.20.2/intel/19.1.3.304/utnbptm3hrf7gppztidueu4jogfgemut/share/cmake-3.20/Modules/FindOpenMP.cmake:542 (find_package_handle_standard_args)
  oops/CMakeLists.txt:56 (find_package)

Updating sorc/gdas.cd/modulefiles/GDAS/wcoss2.intel.lua to use /apps/ops/test/spack-stack-1.6.0-nco/envs/nco-intel-19.1.3.304/ allows the GDASApp build to complete successfully on Dogwood.

@dkokron

dkokron commented Dec 27, 2024

The issue with not finding OpenMP_CXX is very likely due to using the wrong compiler. After the upgrade, the default craype changed to 2.7.31. That PE points the "cc" and "CC" compiler wrappers to the LLVM-based Intel compilers from release 2021.1 (2020.8.0.0827). The previous PEs point to revision 19.1.3.304 20200925 of the 'classic' compilers. A workaround is to specify a PE version in gdas.cd/modulefiles/GDAS/wcoss2.intel.lua

from
load("craype")
to
load("craype/2.7.17")

Tested on dogwood.
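
The one-line change above could also be applied with `sed`. This is a sketch under the assumption that the modulefile contains the exact text `load("craype")` quoted in this comment; `pin_craype` is a hypothetical helper name, and the default version is the one reported as tested on Dogwood.

```shell
#!/bin/sh
# Sketch: pin the craype version in an Lmod modulefile.
# Assumes the file contains the exact text load("craype"), as quoted above.
# pin_craype is a hypothetical helper, not part of GDASApp or global-workflow.
pin_craype() {
  modulefile="$1"
  version="${2:-2.7.17}"   # version reported as tested in the comment above
  # Rewrite only the unversioned load; a backup is kept as <modulefile>.bak.
  sed -i.bak 's|load("craype")|load("craype/'"$version"'")|' "$modulefile"
}

# Example (path taken from the comment above, relative to sorc/):
# pin_craype gdas.cd/modulefiles/GDAS/wcoss2.intel.lua
```

Because the pattern matches only the unversioned form, running the helper a second time leaves an already pinned file unchanged. After reloading the modules, `CC --version` should show which underlying compiler the wrapper resolves to.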
