-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add logging support in SCREAM #1520
Conversation
feee089
to
e4c5bfa
Compare
There's still something to address, and I need some opinions here. The current implementation allows our C++ code to write to the atm log file, but all Fortran routines still write to screen. In particular, Homme, is still printing all its output to screen, and it's quite a bit of output. Unfortunately, I don't know if we can fix this, even if we managed to make C++ and Fortran use the same descriptor for the log file. Sharing descriptors between C and Fortran is not recommended, and I'm afraid we could mess up our logs. For instance, we may have C overwriting the last chunk written by F90, since it wasn't aware of another access to that same file. While I could modify the C++ logger, to make it "refresh" the file, to ensure it's not buffering the EOF position, it gets tricky with the Fortran side, where, even if I had a way to "refresh" the EOF position, I would have to modify tens/hundreds of calls to I'm not sure how we should address this. A mixed C/Fortran program that needs to write to the same file from both C and Fortran is a bit of a niche problem, so you don't find much help online. I could try to play around a bit with a toy program, and see what happens, but my knowledge here is not very strong, and I might just "get lucky" in my test, and leave no "standard-backed" guarantee that it will work. What do you guys think? Tagging some v1/AD folks: @jgfouca @jeff-cohere @tcclevenger @PeterCaldwell @AaronDonahue @brhillman |
Includes purging unused vars from a fcn signature in scream_rrtmgp_interface.cpp
Most of these are options that were moved to EKAT when we created it.
e4c5bfa
to
8bef725
Compare
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Blake
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Using Repos:
Pull Request Author: bartgol |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 4 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Blake
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Console Output (last 100 lines) : SCREAM_PullRequest_Autotester_Blake # 3027 (click to expand)
Console Output (last 100 lines) : SCREAM_PullRequest_Autotester_Weaver # 2918 (click to expand)
Console Output (last 100 lines) : SCREAM_PullRequest_Autotester_Mappy # 2268 (click to expand)
|
As far as I know, there's no standard way to share file descriptors between the two languages. There may be ways that work on certain platforms and/or compilers, but apart from that I think we're on our own. Does Fortran even offer access to the OS file descriptors it uses? A standard approach would involve replacing Fortran |
Yes, the ids of the descriptors are available. But replacing all calls to write with bridges to other functions is a mess, and not scalable. Besides, it requries to change Homme, on the premise that it is used inside a non-Fortran context. I would really like to avoid that path. I fear we have to accept having all Homme output either suppressed, or written to screen (hence, to e3sm.log) rather than to atm.log. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Blake
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Using Repos:
Pull Request Author: bartgol |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 4 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Blake
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Console Output (last 100 lines) : SCREAM_PullRequest_Autotester_Blake # 3028 (click to expand)
Console Output (last 100 lines) : SCREAM_PullRequest_Autotester_Weaver # 2919 (click to expand)
Console Output (last 100 lines) : SCREAM_PullRequest_Autotester_Mappy # 2269 (click to expand)
|
Status Flag 'Pull Request AutoTester' - Failure: Timed out waiting for job SCREAM_PullRequest_Autotester_Mappy to start: Total Wait = 1803
|
Oh man, that sucks. The fact that we're trying to write logs from 2 different languages didn't cross my mind. I think it's fine for F90 to write to e3sm.log and c++ to write to atm.log for the v1 release - we're trying to create something that works now, not something that is perfect. This isn't an acceptable long-term solution though. One question is: why doesn't HOMME write to e3sm.log in EAMf90 runs? At least I'm assuming it doesn't... It may be acceptable as a long-term solution to have both an atm.cxx.log and an atm.f90.log file... I'm also unclear why replacing all printf and write statements with wrappers wouldn't be scalable (though I agree it would be really annoying). Maybe we could search/replace all write and printf statements with my_write and my_printf which just call the intrinsic write and print for fortran builds and call C++ versions that take the same syntax for C++ builds? |
Your search/replace idea is practical for languages for which |
Homme writes to atm.log, since it has access to the F90 file descriptor. As I said, we could try to do the same when Homme runs through EAMxx (I would need a micro PR in E3SM, but that's doable). The issue is having 2 output streams open to the same file, one in F90 and one in CXX. While we do have some control on the CXX stream, I don't think we can guarantee anything regarding the F90 buffer status when we write. I should probably dig a bit on this, although as I mentioned, the online documentation for "a mixed C/F90 program, with a persistent buffer in both language open for write" is scarse (at best). And working on a toy problem might be misleading, since I suspect some sync issues are obvious only when running real jobs, with messages of variable size, which can fill up the F90 buffer quickly but the CXX one slowly (or viceversa). Of course, there's also the alternative of assigning Homme to a blackhole stream. Or even a separate log file (like |
Shared I/O between C++ and Fortran scares me a bit. But at least Fortran 2003 does seem to let you explicitly flush your writes. |
Yeah, I'm kind of creeped out by shared C++/Fortran writes too. I'd suggest just letting dyn write its own log file and being done with it for the moment. |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 4 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Blake
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Console Output (last 100 lines) : SCREAM_PullRequest_Autotester_Blake # 3037 (click to expand)
Console Output (last 100 lines) : SCREAM_PullRequest_Autotester_Weaver # 2927 (click to expand)
Console Output (last 100 lines) : SCREAM_PullRequest_Autotester_Mappy # 2275 (click to expand)
|
Blake is currently experiencing some issue, with commands taking a long time to complete on the login node, causing timeout during the git clone phase of our jenkins jobs. I manually tested this PR on a compute node, and all tests pass. Mappy and weaver are already passing. @PeterCaldwell @AaronDonahue If you are ok with the current status of the PR, feel free to manually merge. |
I'm happy with the PR from the discussions we've had, but don't have bandwidth to read through it in depth before merging it. If you want me to merge it without reading, I'm happy to do that. Otherwise I will let it sit until Jim or Aaron gets to it... |
No, merging without review is not appropriate, I think. I added JIm as reviewer. As soon as one between him and Aaron can take a look, we can move forward. |
Status Flag 'Pull Request AutoTester' - Failure: Timed out waiting for job SCREAM_PullRequest_Autotester_Weaver to start: Total Wait = 1803
|
Status Flag 'Pull Request AutoTester' - Failure: Timed out waiting for job SCREAM_PullRequest_Autotester_Mappy to start: Total Wait = 1803
|
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - GitHub reports Mergeable status = False |
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Blake
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Using Repos:
Pull Request Author: bartgol |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 4 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Blake
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Console Output (last 100 lines) : SCREAM_PullRequest_Autotester_Blake # 3061 (click to expand)
Console Output (last 100 lines) : SCREAM_PullRequest_Autotester_Weaver # 2944 (click to expand)
Console Output (last 100 lines) : SCREAM_PullRequest_Autotester_Mappy # 2290 (click to expand)
|
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Blake
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Using Repos:
Pull Request Author: bartgol |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED Pull Request Auto Testing has PASSED (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Blake
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
|
Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur... |
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ jgfouca ]! |
Status Flag 'Pull Request AutoTester' - Pull Request will be Automerged |
Merge on Pull Request# 1520: IS A SUCCESS - Pull Request successfully merged |
This PR adds support for logging to screen/file in SCREAM. For CIME runs, it ensures that the log ends up in the atm.log file.
Closes #1502.