@james-bruten-mo james-bruten-mo commented Jan 20, 2026

PR Summary

Sci/Tech Reviewer: @MatthewHambley
Code Reviewer: @t00sa

Since the git migration, incremental builds on the command line have not worked, because file timestamps in a git clone reflect the clone date, not the commit date. This PR re-enables incremental builds by fetching/pulling changes into the existing sources when the build is rerun. This has been tested on the command line for both a build with physics code (lfric_atm) and one without (gungho_model). As expected, the second build is much quicker.

This also enables building from the local GitHub mirrors, which will be useful once these are also available on the HPC. Again, this has been tested on the command line, including with incremental builds.

This change adds get_git_sources, which is mostly a copy of the same file from SimSys_Scripts. Having two copies isn't ideal, but I can't think of another way to make that SimSys_Scripts file available when building locally (short of installing it as a library). Hopefully this is something that will be improved by fab!
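The rerun behaviour described above might be sketched roughly as follows. This is an illustration only, not the code in this PR: `sync_commands` is a hypothetical helper, and the commands actually issued by get_git_sources may differ. The idea is that an existing working copy is updated in place, so files that did not change keep their modification times and the build only redoes what is out of date.

```python
from pathlib import Path


def sync_commands(source: str, ref: str, dest: Path) -> list[list[str]]:
    """Return the git commands needed to bring dest to ref.

    Hypothetical helper for illustration; not part of this PR.
    """
    if (dest / ".git").is_dir():
        # Rerun: update the existing working copy in place. Checking out
        # the fetched ref only rewrites files whose content changed.
        return [
            ["git", "-C", str(dest), "fetch", "origin", ref],
            ["git", "-C", str(dest), "checkout", "FETCH_HEAD"],
        ]
    # First run: create the working copy from scratch.
    return [
        ["git", "init", str(dest)],
        ["git", "-C", str(dest), "remote", "add", "origin", source],
        ["git", "-C", str(dest), "fetch", "origin", ref],
        ["git", "-C", str(dest), "checkout", "FETCH_HEAD"],
    ]
```

Each returned command is an argument list that could be handed to `subprocess.run`; keeping command construction separate from execution makes the clone-versus-update decision easy to test.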

Code Quality Checklist

  • I have performed a self-review of my own code
  • My code follows the project's style guidelines
  • Comments have been included that aid understanding and enhance the readability of the code
  • My changes generate no new warnings
  • All automated checks in the CI pipeline have completed successfully

Testing

  • I have tested this change locally, using the LFRic Core rose-stem suite
  • If required (e.g. API changes) I have also run the LFRic Apps test suite using this branch
  • If any tests fail (rose-stem or CI) the reason is understood and acceptable (e.g. kgo changes)
  • I have added tests to cover new functionality as appropriate (e.g. system tests, unit tests, etc.)
  • Any new tests have been assigned an appropriate amount of compute resource and have been allocated to an appropriate testing group (i.e. the developer tests are for jobs which use a small amount of compute resource and complete in a matter of minutes)

trac.log

Test Suite Results - lfric_apps - test_incremental_builds_change/run2

Suite Information

| Item | Value |
| :--- | :--- |
| Suite Name | test_incremental_builds_change/run2 |
| Suite User | james.bruten |
| Workflow Start | 2026-01-20T15:03:56 |
| Groups Run | developer |

| Dependency | Reference | Main Like |
| :--- | :--- | :--- |
| casim | MetOffice/casim@2025.12.1 | True |
| jules | MetOffice/jules@2025.12.1 | True |
| lfric_apps | james-bruten-mo/lfric_apps@improve_local_builds | False |
| lfric_core | MetOffice/lfric_core@aa32824 | True |
| moci | MetOffice/moci@2025.12.1 | True |
| SimSys_Scripts | MetOffice/SimSys_Scripts@2025.12.1 | True |
| socrates | MetOffice/socrates@2025.12.1 | True |
| socrates-spectral | MetOffice/socrates-spectral@2025.12.1 | True |
| ukca | MetOffice/ukca@2025.12.1 | True |

Task Information

✅ succeeded tasks - 1106

Security Considerations

  • I have reviewed my changes for potential security issues
  • Sensitive data is properly handled (if applicable)
  • Authentication and authorisation are properly implemented (if applicable)

Performance Impact

  • Performance of the code has been considered and, if applicable, suitable performance measurements have been conducted

AI Assistance and Attribution

  • Some of the content of this change has been produced with the assistance of Generative AI tool name (e.g., Met Office Github Copilot Enterprise, Github Copilot Personal, ChatGPT GPT-4, etc) and I have followed the Simulation Systems AI policy (including attribution labels)

Documentation

  • Where appropriate I have updated documentation related to this change and confirmed that it builds correctly

PSyclone Approval

  • If you have edited any PSyclone-related code (e.g. PSyKAl-lite, Kernel interface, optimisation scripts, LFRic data structure code) then please contact the TCD Team

Sci/Tech Review

  • I understand this area of code and the changes being added
  • The proposed changes correspond to the pull request description
  • Documentation is sufficient (do documentation papers need updating)
  • Sufficient testing has been completed

(Please alert the code reviewer via a tag when you have approved the SR)

Code Review

  • All dependencies have been resolved
  • Related Issues have been properly linked and addressed
  • CLA compliance has been confirmed
  • Code quality standards have been met
  • Tests are adequate and have passed
  • Documentation is complete and accurate
  • Security considerations have been addressed
  • Performance impact is acceptable

@github-actions github-actions bot added the cla-signed label ("This contributor has signed the CLA.") Jan 20, 2026
@james-bruten-mo james-bruten-mo removed the cla-signed label ("This contributor has signed the CLA.") Jan 20, 2026
@james-bruten-mo james-bruten-mo marked this pull request as ready for review January 20, 2026 20:59
@james-bruten-mo james-bruten-mo mentioned this pull request Jan 21, 2026
28 tasks
@github-actions github-actions bot added the cla-modified label ("The CLA has been modified as part of this PR - added by GA") Jan 21, 2026
@james-bruten-mo james-bruten-mo removed the cla-modified label ("The CLA has been modified as part of this PR - added by GA") Jan 21, 2026

@MatthewHambley MatthewHambley left a comment

Some improvements possible to structure and code. Also some lack of clarity regarding why certain things are done in certain ways.

"""

tempdir = Path(tempfile.mkdtemp())
use_mirrors: bool = (os.getenv('LOCAL_BUILD_MIRRORS', 'False') == 'True')

For something like this it might be best to test merely on the presence of the variable. This avoids the game of guess the magic value.

Collaborator Author

Changed to use presence of the path


tempdir = Path(tempfile.mkdtemp())
use_mirrors: bool = (os.getenv('LOCAL_BUILD_MIRRORS', 'False') == 'True')
mirror_loc: Path = os.getenv("MIRROR_LOC", "")

Or, in fact, the presence of the variable could signal the requirement while its content is the location. And how about whole words while we're at it? MIRROR_LOCATION

Collaborator Author

Sure, happy to use that here
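The suggestion agreed in this thread could look something like the sketch below, in which a single variable both switches mirrors on and says where they are. It is a minimal illustration under assumptions: `mirror_location` is a hypothetical helper name, and treating an empty value the same as an unset variable is a choice made here, not something stated in the review.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def mirror_location(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the mirror path if MIRROR_LOCATION is set, otherwise None.

    Hypothetical helper for illustration; not part of this PR.
    """
    value = env.get("MIRROR_LOCATION")
    # An unset (or empty) variable means "do not use mirrors", so no
    # separate boolean flag or magic value is needed.
    return Path(value) if value else None
```

Callers can then write `if (mirrors := mirror_location()) is not None:` and there is no longer a state in which mirrors are requested but no location is known.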

Comment on lines 52 to 57
if not mirror_loc and use_mirrors:
    raise KeyError(
        "Use Mirrors is set true, but the MIRROR_LOC environment variable hasn't"
        "been set"
    )

Given the above, there is no longer a need for this test.

Collaborator Author

Yep, removed

# which you should have received as part of this distribution.
# *****************************COPYRIGHT*******************************
"""
Clone sources for a rose-stem run for use with git bdiff module in scripts

Does it not also clone them for interactive builds? More generally I don't see any explanation as to why this complicated thing needs to be done.

Collaborator Author

A copy and paste error, changed

) -> None:

if ".git" in source:
    if use_mirrors:

For a future change, it may be possible to use a class to wrap repository access. That way the test can be performed once, the correct object instantiated and no further concerns had about which one it is.
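One way the class-based idea in this comment might look is sketched below. Everything here is hypothetical: the class names, the `make_repository` factory, and in particular the assumed mirror layout (a mirror directory named after the last component of the URL) are illustrations, not the project's actual structure.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Union


@dataclass(frozen=True)
class RemoteRepository:
    """A repository fetched directly from its remote URL."""

    url: str

    def fetch_url(self) -> str:
        return self.url


@dataclass(frozen=True)
class MirroredRepository:
    """A repository fetched from a local mirror directory."""

    url: str
    mirror_root: Path

    def fetch_url(self) -> str:
        # Assumed layout: <mirror_root>/<last component of the URL>.
        name = self.url.rstrip("/").split("/")[-1]
        return str(self.mirror_root / name)


def make_repository(
    url: str, mirror_root: Optional[Path]
) -> Union[RemoteRepository, MirroredRepository]:
    """Decide once which kind of repository to use."""
    if mirror_root is not None:
        return MirroredRepository(url, mirror_root)
    return RemoteRepository(url)
```

With this shape the mirror test happens only in `make_repository`; the rest of the code calls `fetch_url()` without needing to know which kind of object it holds.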

Comment on lines +117 to +123
commands = (
    f"git -C {loc} init",
    f"git -C {loc} remote add origin {repo_source}",
    f"git -C {loc} fetch origin {repo_ref}",
    f"git -C {loc} checkout FETCH_HEAD",
    f"git -C {loc} fetch origin main:main",
)

This seems a lot of effort. Doesn't it just recreate a clone?

Collaborator Author

Yes, it was initially written to see if we could save space while cloning sources for use in rose-stem. It turns out the difference is very small, but as this was marginally smaller, I left it in. Given it's copied from SimSys_Scripts I don't think it's worth modifying here

# Fetch the main branch from origin
# Ignore errors - these are likely because the main branch already exists
# Instead write them as warnings
command = f"git -C {loc} fetch origin main:main"

If you need to treat the copy as a repository, why not clone it? That's what distributed repositories are all about. You might take advantage of a sparse clone to get the filtering you currently get from the rsync exclusion list.
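The sparse clone mentioned here might be driven along the following lines. This is a sketch under assumptions: `sparse_clone_commands` is a hypothetical helper, the directory names in the usage note are placeholders, and the `--sparse` option and `sparse-checkout` subcommand need a reasonably recent git (2.25 or later).

```python
from pathlib import Path
from typing import Sequence


def sparse_clone_commands(
    source: str, dest: Path, keep: Sequence[str]
) -> list[list[str]]:
    """Return git commands for a clone limited to the directories in keep.

    Hypothetical helper for illustration; not part of this PR.
    """
    return [
        # Clone with sparse checkout enabled, so that initially only the
        # files in the top-level directory are checked out.
        ["git", "clone", "--sparse", source, str(dest)],
        # Widen the working tree to just the listed directories.
        ["git", "-C", str(dest), "sparse-checkout", "set", *keep],
    ]
```

For example, `sparse_clone_commands("/path/to/copy", Path("working/jules"), ["src"])` would produce a clone whose working tree holds only the top-level files and `src`, which plays a role similar to an exclusion list.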

"""

-    return os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+    return Path(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

From memory:

Suggested change
return Path(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
Path(__file__).absolute.parent.parent

Collaborator Author

Ta - absolute is a function but otherwise good
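For reference, the two spellings discussed in this thread can be compared side by side. The function names below are hypothetical and exist only for the comparison; the file argument stands in for `__file__`.

```python
import os
from pathlib import Path


def root_via_os_path(file: str) -> Path:
    """Two levels up from file, using os.path."""
    return Path(os.path.dirname(os.path.dirname(os.path.abspath(file))))


def root_via_pathlib(file: str) -> Path:
    """Two levels up from file, using pathlib."""
    # absolute() is a method, so it needs the call parentheses.
    return Path(file).absolute().parent.parent
```

For ordinary paths the two agree. One difference worth knowing is that `os.path.abspath` also collapses `..` components, whereas `Path.absolute()` leaves them in place.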

    args.working_dir = Path(project_path) / "working"
else:
    # If the working dir doesn't end in working, set that here
    if not args.working_dir.strip("/").endswith("working"):

This test is easier to do when working with a Path object: working_dir.name == 'working'. You can use type=Path with add_argument() to have argparse give it the correct type and reject anything which won't parse as a path.

Collaborator Author

Thanks, done
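The change agreed here might look something like the sketch below. The option spelling `--working-dir`, the helper name `resolve_working_dir`, and the choice to append `working` when the last path component is something else are all assumptions made for illustration; only `type=Path` and the `.name == "working"` test come from the review comment.

```python
import argparse
from pathlib import Path


def resolve_working_dir(argv: list[str], project_path: Path) -> Path:
    """Parse the working directory option and normalise it.

    Hypothetical helper for illustration; not part of this PR.
    """
    parser = argparse.ArgumentParser()
    # type=Path makes argparse hand back a Path rather than a str.
    parser.add_argument("--working-dir", type=Path, default=None)
    args = parser.parse_args(argv)

    if args.working_dir is None:
        return project_path / "working"
    if args.working_dir.name != "working":
        # Assumed behaviour: make sure the final component is "working".
        return args.working_dir / "working"
    return args.working_dir
```

Because `Path` drops a trailing slash, `/scratch/working/` and `/scratch/working` compare the same way, which removes the need for the `strip("/")` call in the string version.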

Comment on lines +70 to +75
| ``-m --mirrors``     | False                       | If True, this will attempt  |
| ``store_true``       |                             | to extract using local      |
|                      |                             | github mirrors              |
+----------------------+-----------------------------+-----------------------------+
| ``--mirror-loc``     | MetOffice Mirror Location   | The path to the github      |
|                      |                             | mirror location             |

As previously, why not collapse these into a single argument for mirror location? The need to use mirrors is then determined by the provision of a mirror location to use.

Collaborator Author

For the environment variables passed to extract_science.py then yes I agree, we can use the presence of the mirror location variable to show whether we're using the mirrors.

For local_build.py (which these docs are referring to) I think we want the --mirrors store_true argument as otherwise we'll need to always pass the location of the mirrors as an argument when using them, which isn't ideal
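The split described in this reply could be sketched as follows: the command line keeps a boolean switch plus a defaulted location, while the environment passed on to the extract step carries a single variable whose presence means "use mirrors". This is an illustration under assumptions. `DEFAULT_MIRROR_LOCATION` is a placeholder (the real default is not given in this conversation), `mirror_environment` is a hypothetical helper, and the variable name `MIRROR_LOCATION` follows the reviewer's suggestion.

```python
import argparse
from pathlib import Path

# Placeholder only: the real default mirror location is not stated here.
DEFAULT_MIRROR_LOCATION = Path("/path/to/mirrors")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-m", "--mirrors", action="store_true",
        help="extract using local GitHub mirrors",
    )
    parser.add_argument(
        "--mirror-loc", type=Path, default=DEFAULT_MIRROR_LOCATION,
        help="path to the GitHub mirror location",
    )
    return parser


def mirror_environment(args: argparse.Namespace) -> dict[str, str]:
    """Extra environment for the extract step (hypothetical helper).

    The variable is present only when mirrors were requested.
    """
    if args.mirrors:
        return {"MIRROR_LOCATION": str(args.mirror_loc)}
    return {}
```

With this arrangement `-m` on its own is enough in the common case, and `--mirror-loc` is only needed to override the default.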

Collaborator Author

@james-bruten-mo james-bruten-mo left a comment

Thanks Matthew - those comments are addressed, and for the most part changes done

sync_repo(source, ref, dest)


def run_command(
Collaborator Author

Potentially not - it's a function I tend to copy and paste around to let me set the defaults as I want them, e.g. I don't always have to set the text or timeout entries. Given it matches the version in SimSys_Scripts I'm happy to keep it here

Comment on lines 106 to 110
def clone_repo(repo_source: str, repo_ref: str, loc: Path) -> None:
    """
    Clone the repo and checkout the provided ref
    Only if a remote source
    """
Collaborator Author

Done
