Add marine hybrid envar #3041

Merged

Conversation

guillaumevernieres
Contributor

@guillaumevernieres guillaumevernieres commented Oct 29, 2024

Description

What the title says.
Main features added:

  • A new possible CI test that runs 1.5 cycles of the hybrid envar with the coupled UFS
  • YAMLs to allow running the hybrid envar GFSv17 prototype at C384/0.25 for the deterministic run and C192/0.25 for the ensemble members
  • A few bug and dependency fixes to allow cycling with an ensemble
  • An option to turn off the direct insertion of the sea-ice ensemble member analysis/recentering

Issues addressed:

Dependencies:

Type of change

  • Bug fix (fixes something broken)
  • New feature (adds functionality)
  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? YES :
    • GDAS NOT SUBMITTED YET
  • Other changes:
    • Requires staging the low-res ensemble ICs (issue not submitted yet)

How has this been tested?

  • Tested a subset of the global-workflow CI on Hercules and Hera at various stages of the development.
  • Tested the hybrid ens. var. at C48/5.00 and C384/0.25

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added
  • Any new scripts have been added to the .github/CODEOWNERS file with owners
  • I have made corresponding changes to the system documentation if necessary

Co-authored-by: AndrewEichmann-NOAA <58948505+AndrewEichmann-NOAA@users.noreply.github.com>
@RussTreadon-NOAA
Contributor

WCOSS2 (Cactus) C48mx500_3DVarAOWCDA g-w CI

Install guillaumevernieres:feature/marineenvar at 5c1148a on Cactus using GDASApp feature/marineenvar at 71e9e5f for sorc/gdas.cd. Use test version of spack-stack/1.60 to build sorc/gdas.cd

Prior to launching C48mx500_3DVarAOWCDA g-w CI make the following change in working copy of env/WCOSS2.env

@@ -109,17 +109,17 @@ elif [[ "${step}" = "marinebmat" ]]; then
     export APRUNCFP="${launcher} -n \$ncmd --multi-prog"
     export APRUN_MARINEBMAT="${APRUN_default}"
 
-elif [[ "${step}" = "ocnanalrun" ]]; then
+elif [[ "${step}" = "marineanlvar" ]]; then
 
     export APRUNCFP="${launcher} -n \$ncmd --multi-prog"
 
-    export APRUN_OCNANAL="${APRUN_default}"
+    export APRUN_MARINEANLVAR="${APRUN_default}"
 
-elif [[ "${step}" = "ocnanalchkpt" ]]; then
+elif [[ "${step}" = "marineanlchkpt" ]]; then
 
     export APRUNCFP="${launcher} -n \$ncmd --multi-prog"
 
-    export APRUN_OCNANAL="${APRUN_default}"
+    export APRUN_MARINEANLCHKPT="${APRUN_default}"
 
 elif [[ "${step}" = "ocnanalecen" ]]; then

With these changes in place, the C48mx500_3DVarAOWCDA g-w CI ran successfully.

/lfs/h2/emc/ptmp/russ.treadon/EXPDIR/prwcda_pr3041
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202103241200        Done    Oct 31 2024 16:30:16    Oct 31 2024 16:50:22
202103241800        Done    Oct 31 2024 16:30:16    Oct 31 2024 19:23:12

The following marine DA jobs successfully ran

202103241800       gdas_prepoceanobs                   159911150           SUCCEEDED                   0         1         182.0
202103241800      gdas_marineanlinit                   159919020           SUCCEEDED                   0         1          25.0
202103241800         gdas_marinebmat                   159918814           SUCCEEDED                   0         1          38.0
202103241800       gdas_marineanlvar                   159919534           SUCCEEDED                   0         1          70.0
202103241800     gdas_marineanlchkpt                   159920276           SUCCEEDED                   0         1          37.0
202103241800     gdas_marineanlfinal                   159920925           SUCCEEDED                   0         1          32.0

Contributor

@RussTreadon-NOAA RussTreadon-NOAA left a comment

Thank you @guillaumevernieres !

@guillaumevernieres
Contributor Author

@AndrewEichmann-NOAA feel free to push your wcoss updates here if there are more (as a PR into my branch).

aerorahul
aerorahul previously approved these changes Nov 7, 2024
Contributor

@aerorahul aerorahul left a comment

looks good to me.

@emcbot
emcbot commented Dec 7, 2024

Experiment C96_S2SWA_gefs_replay_ics FAILED on Hera in Build# 3 with error logs:

/scratch1/NCEPDEV/global/CI/3041/RUNTESTS/COMROOT/C96_S2SWA_gefs_replay_ics_28290e26/logs/2020110100/gefs_stage_ic.log

Follow link here to view the contents of the above file(s): (link)

@emcbot
emcbot commented Dec 7, 2024

Experiment C96_S2SWA_gefs_replay_ics FAILED on Hera in Build# 3 in
/scratch1/NCEPDEV/global/CI/3041/RUNTESTS/EXPDIR/C96_S2SWA_gefs_replay_ics_28290e26

@RussTreadon-NOAA
Contributor

gefs_stage_ic.log failed in C48_S2SWA_gefs and C96_S2SWA_gefs_replay_ics with the following traceback

Traceback (most recent call last):
  File "/scratch1/NCEPDEV/global/CI/3041/gefs/scripts/exglobal_stage_ic.py", line 46, in <module>
    main()
  File "/scratch1/NCEPDEV/global/CI/3041/gefs/ush/python/wxflow/logger.py", line 266, in wrapper
    retval = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/scratch1/NCEPDEV/global/CI/3041/gefs/scripts/exglobal_stage_ic.py", line 34, in main
    stage_dict[key] = stage.task_config[key]
                      ~~~~~~~~~~~~~~~~~^^^^^
  File "/scratch1/NCEPDEV/global/CI/3041/gefs/ush/python/wxflow/attrdict.py", line 84, in __missing__
    raise KeyError(name)
KeyError: 'DO_STARTMEM_FROM_JEDIICE'
+ JGLOBAL_STAGE_IC[1]: postamble JGLOBAL_STAGE_IC 1733601701 1

This PR adds DO_STARTMEM_FROM_JEDIICE to parm/config/gfs/config.base.

Looks like we also need to add DO_STARTMEM_FROM_JEDIICE to parm/config/gefs/config.base.

Does this make sense @guillaumevernieres , @WalterKolczynski-NOAA , and @aerorahul ?
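
The traceback above comes down to a strict key lookup: the staging script copies a list of keys out of the task configuration, and any key the gefs configuration does not define raises KeyError. A minimal sketch of that failure mode, assuming a plain dict in place of wxflow's AttrDict. The key list, `build_stage_dict`, and both sample configs are hypothetical and only illustrate the pattern:

```python
# Hypothetical key list; the real list lives in the staging script and is not shown in this PR.
REQUIRED_KEYS = ["DO_JEDIOCNVAR", "DO_MERGENSST", "DO_STARTMEM_FROM_JEDIICE"]


def build_stage_dict(task_config, keys=REQUIRED_KEYS):
    """Copy the requested keys, reporting every missing key in one error."""
    missing = [key for key in keys if key not in task_config]
    if missing:
        raise KeyError(f"missing from config: {', '.join(missing)}")
    return {key: task_config[key] for key in keys}


# Illustrative configs: the gfs one defines the new key, the gefs one does not.
gfs_config = {
    "DO_JEDIOCNVAR": "NO",
    "DO_MERGENSST": "NO",
    "DO_STARTMEM_FROM_JEDIICE": "NO",
}
gefs_config = {"DO_JEDIOCNVAR": "NO", "DO_MERGENSST": "NO"}

print(build_stage_dict(gfs_config))
try:
    build_stage_dict(gefs_config)
except KeyError as err:
    print(f"KeyError: {err}")
```

Checking all keys up front, rather than failing on the first lookup, would surface every missing variable in a single log message.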

@RussTreadon-NOAA RussTreadon-NOAA added CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera and removed CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed labels Dec 8, 2024
@emcbot emcbot added CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed and removed CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera labels Dec 8, 2024
@emcbot
emcbot commented Dec 8, 2024

Experiment C96C48_hybatmaerosnowDA FAILED on Hera in Build# 3 in
/scratch1/NCEPDEV/global/CI/3041/RUNTESTS/EXPDIR/C96C48_hybatmaerosnowDA_28290e26

@emcbot
emcbot commented Dec 8, 2024

Experiment C96C48_ufs_hybatmDA FAILED on Hera in Build# 3 in
/scratch1/NCEPDEV/global/CI/3041/RUNTESTS/EXPDIR/C96C48_ufs_hybatmDA_28290e26

@emcbot emcbot added CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed and removed CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera labels Dec 8, 2024
@emcbot
emcbot commented Dec 8, 2024

CI Failed on Hera in Build# 3
Built and ran in directory /scratch1/NCEPDEV/global/CI/3041


Experiment C96C48_ufs_hybatmDA_28290e26 Terminated with  tasks failed and  dead at Sun Dec  8 01:18:24 UTC 2024
Experiment C96C48_ufs_hybatmDA_28290e26 Terminated: **
Experiment C96C48_hybatmaerosnowDA_28290e26 Terminated with  tasks failed and  dead at Sun Dec  8 01:18:24 UTC 2024
Experiment C96C48_hybatmaerosnowDA_28290e26 Terminated: **

@emcbot emcbot added the CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress label Dec 8, 2024
@RussTreadon-NOAA
Contributor

As a test, I copied @TerrenceMcGuinness-NOAA's EXPDIR and ROTDIR for the GEFS CI. Files were edited to run under my account. export DO_STARTMEM_FROM_JEDIICE="NO" was added to config.base.

# Shared parameters
# DA engine
export DO_JEDIATMVAR="NO"
export DO_JEDIATMENS="NO"
export DO_JEDIOCNVAR="NO"
export DO_JEDISNOWDA="NO"
export DO_MERGENSST="NO"
export DO_STARTMEM_FROM_JEDIICE="NO"

The failed stage job was rewound and rebooted. The rerun was successful. Both CI streams ran to completion on Hera.

/scratch1/NCEPDEV/stmp2/Russ.Treadon/EXPDIR/C48_S2SWA_gefs_pr3041
   CYCLE         STATE           ACTIVATED              DEACTIVATED
202103231200        Done    Dec 07 2024 20:01:14    Dec 08 2024 03:05:17

/scratch1/NCEPDEV/stmp2/Russ.Treadon/EXPDIR/C96_S2SWA_gefs_replay_ics_pr3041
   CYCLE         STATE           ACTIVATED              DEACTIVATED
202011010000        Done    Dec 07 2024 20:01:10    Dec 08 2024 01:35:21

Note: When ./workflow/create_experiment.py was run for C48_S2SWA_gefs using dba159c the resulting config.base contained

export DO_MERGENSST="NO"
export DO_STARTMEM_FROM_JEDIICE="@DO_STARTMEM_FROM_JEDIICE@"

229e791 explicitly set DO_STARTMEM_FROM_JEDIICE="NO" following the example of other DO_JEDI variables in gefs/config.base.

This may not be the correct way to set DO_STARTMEM_FROM_JEDIICE in gefs/config.base. The better way may be to add DO_STARTMEM_FROM_JEDIICE: "NO" to the base: section of parm/config/gefs/yaml/defaults.yaml.

I am not familiar with the gefs jobs, so I simply followed the example in parm/config/gefs/config.base.
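
The two options above differ in where the value comes from: hard-coding it in config.base, or leaving the `@DO_STARTMEM_FROM_JEDIICE@` token in the template and supplying a default. A minimal sketch of the second approach, assuming `@NAME@` tokens are filled from a dictionary of defaults. `render` and `unresolved` are hypothetical helpers, not functions from create_experiment.py:

```python
import re

# Matches tokens such as @DO_STARTMEM_FROM_JEDIICE@.
PLACEHOLDER = re.compile(r"@([A-Za-z_][A-Za-z0-9_]*)@")


def render(template, values):
    """Replace @NAME@ with values[NAME]; leave unknown names untouched."""
    return PLACEHOLDER.sub(
        lambda m: str(values.get(m.group(1), m.group(0))), template
    )


def unresolved(text):
    """Return the sorted names of placeholders still present in text."""
    return sorted(set(PLACEHOLDER.findall(text)))


template = (
    'export DO_MERGENSST="NO"\n'
    'export DO_STARTMEM_FROM_JEDIICE="@DO_STARTMEM_FROM_JEDIICE@"\n'
)

# Without a default the token survives rendering, as seen with dba159c.
without_default = render(template, {})
# With a default (here a stand-in for an entry in defaults.yaml) it is filled in.
with_default = render(template, {"DO_STARTMEM_FROM_JEDIICE": "NO"})

print(unresolved(without_default))
print(with_default)
```

A check like `unresolved` run on the generated config.base would flag a leftover token before any job reads it.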

@emcbot emcbot added CI-Hera-Passed **Bot use only** CI testing on Hera for this PR has completed successfully and removed CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress labels Dec 8, 2024
@emcbot
emcbot commented Dec 8, 2024

CI Passed on Hera in Build# 4
Built and ran in directory /scratch1/NCEPDEV/global/CI/3041


CI Failed on Hera in Build# 3
Built and ran in directory /scratch1/NCEPDEV/global/CI/3041

Experiment C96C48_ufs_hybatmDA_28290e26 Terminated with tasks failed and dead at Sun Dec 8 01:18:24 UTC 2024
Experiment C96C48_ufs_hybatmDA_28290e26 Terminated: **
Experiment C96C48_hybatmaerosnowDA_28290e26 Terminated with tasks failed and dead at Sun Dec 8 01:18:24 UTC 2024
Experiment C96C48_hybatmaerosnowDA_28290e26 Terminated: **
Experiment C48mx500_3DVarAOWCDA_229e7916 Completed 2 Cycles: SUCCESS at Sun Dec 8 03:41:17 UTC 2024
Experiment C48_ATM_229e7916 Completed 2 Cycles: SUCCESS at Sun Dec 8 03:41:17 UTC 2024
Experiment C48mx500_hybAOWCDA_229e7916 Completed 2 Cycles: SUCCESS at Sun Dec 8 03:41:18 UTC 2024
Experiment C96_S2SWA_gefs_replay_ics_229e7916 Completed 1 Cycles: SUCCESS at Sun Dec 8 03:53:58 UTC 2024
Experiment C96C48_hybatmaerosnowDA_229e7916 Completed 3 Cycles: SUCCESS at Sun Dec 8 04:48:21 UTC 2024
Experiment C96_atm3DVar_229e7916 Completed 3 Cycles: SUCCESS at Sun Dec 8 04:54:25 UTC 2024
Experiment C96C48_hybatmDA_229e7916 Completed 3 Cycles: SUCCESS at Sun Dec 8 04:54:26 UTC 2024
Experiment C96C48_ufs_hybatmDA_229e7916 Completed 3 Cycles: SUCCESS at Sun Dec 8 05:37:06 UTC 2024
Experiment C48_S2SW_229e7916 Completed 2 Cycles: SUCCESS at Sun Dec 8 05:56:12 UTC 2024
Experiment C48_S2SWA_gefs_229e7916 Completed 1 Cycles: SUCCESS at Sun Dec 8 06:26:50 UTC 2024

@RussTreadon-NOAA
Contributor

@TerrenceMcGuinness-NOAA : emcbot added the CI-Hera-Passed label but the message above indicates that C96C48_ufs_hybatmDA_28290e26 and C96C48_hybatmaerosnowDA_28290e26 failed. When I check /scratch1/NCEPDEV/global/CI/3041 the directory is empty.

I'm glad to see the CI-Hera-Passed label. This agrees with what I find when I set up and run g-w CI. I'm confused by the Passed result and message with Failed streams.
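
One possible reading of the mixed message is that it lists results from two builds. A minimal sketch that groups the status lines by their eight-character suffix, assuming that suffix identifies the commit each build ran against (229e7916 would match the 229e791 commit mentioned earlier). The regular expression and `group_by_build` are illustrative, not part of the CI tooling:

```python
import re
from collections import defaultdict

# Experiment <case>_<8 hex chars> followed by the reported status word.
LINE = re.compile(
    r"^Experiment (?P<case>\S+)_(?P<sha>[0-9a-f]{8}) "
    r"(?P<status>Completed|Terminated)\b"
)


def group_by_build(report):
    """Map each hash suffix to {case: status} for the lines that match."""
    builds = defaultdict(dict)
    for line in report.splitlines():
        match = LINE.match(line.strip())
        if match:
            builds[match["sha"]][match["case"]] = match["status"]
    return dict(builds)


# A few lines copied from the bot message above.
report = """\
Experiment C96C48_ufs_hybatmDA_28290e26 Terminated with tasks failed and dead at Sun Dec 8 01:18:24 UTC 2024
Experiment C96C48_hybatmaerosnowDA_28290e26 Terminated with tasks failed and dead at Sun Dec 8 01:18:24 UTC 2024
Experiment C96C48_hybatmaerosnowDA_229e7916 Completed 3 Cycles: SUCCESS at Sun Dec 8 04:48:21 UTC 2024
Experiment C96C48_ufs_hybatmDA_229e7916 Completed 3 Cycles: SUCCESS at Sun Dec 8 05:37:06 UTC 2024
"""

for sha, cases in group_by_build(report).items():
    print(sha, cases)
```

Grouped this way, the terminated experiments carry the 28290e26 suffix and the completed ones carry 229e7916, which would be consistent with the failures belonging to the earlier build.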

Contributor

@RussTreadon-NOAA RussTreadon-NOAA left a comment

g-w CI passes on multiple platforms.

Approve.

@RussTreadon-NOAA
Contributor

@WalterKolczynski-NOAA and @aerorahul : What else is required in terms of reviews and/or g-w CI for this PR? We would like to merge this PR into g-w develop early this week.

@WalterKolczynski-NOAA WalterKolczynski-NOAA merged commit 6585798 into NOAA-EMC:develop Dec 9, 2024
5 checks passed
Labels
CI-Hera-Passed **Bot use only** CI testing on Hera for this PR has completed successfully
Development

Successfully merging this pull request may close these issues.

OCNRES is an integer in the config, not a string. Fix the marine EnVAR
8 participants