Skip to content

Conversation

@rfhaque
Copy link
Collaborator

@rfhaque rfhaque commented Sep 5, 2025

This PR adds the Sparta application (docs: Adding a Benchmark)

@github-actions github-actions bot added experiment New or modified experiment application labels Sep 5, 2025
@rfhaque rfhaque requested a review from pearce8 September 5, 2025 20:38
@pearce8 pearce8 marked this pull request as ready for review September 6, 2025 15:07
@pearce8 pearce8 added the changes requested Changes requested label Sep 10, 2025
Copy link
Collaborator

@pearce8 pearce8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Sparta dry run should be passing
  • Add Sparta+openmp and Sparta+rocm to CI on Dane and Tuo or Tioga
  • Since this will likely go in after the Spack 1.0 support, please update the package to meet Spack 1.0 requirements.

Copy link
Collaborator

@scheibelp scheibelp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(this looks compatible with #953)

BENCHMARK: [amg2023, kripke, laghos, raja-perf]
VARIANT: [+rocm]
GPUMODE: SPX
# sparta needs specific variant
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelmckinsey1 What do you mean by "Sparta needs a different version of rocm"?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not what this is saying. This benchmark can not be defined the same as the other rocm benchmarks because it requires a specific variant when running with rocm, i.e. fft_kokkos=hipfft. That is why I was thinking this is bad practice to encode this variant in the experiment. I was thinking this is something you would set in the package.py, like depends_on("hipfft", when="rocm"), but after discussing with @scheibelp and @rfhaque yesterday I'm not sure it's so simple.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scheibelp we need help with this

if self.spec.satisfies("+openmp"):
kokkos_mode += "t {n_threads_per_proc}"
if self.spec.satisfies("+rocm") or self.spec.satisfies("+cuda"):
kokkos_mode += " g {n_gpus}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

g should probably be 1 -- or n_gpus/m_gpus_per_node

@rfhaque rfhaque mentioned this pull request Oct 6, 2025
21 tasks
Comment on lines 24 to 26
"sparta-snl+openmp aws-pcluster instance_type=c6g.xlarge",
"sparta-snl+openmp aws-pcluster instance_type=c4.xlarge",
"sparta-snl+openmp generic-x86",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI This is to fix dryrun error where this particular experiment is generated asking for too many cores for these particular systems.

@github-actions github-actions bot added the ci CI, unit tests, GitHub actions label Oct 6, 2025
pearce8
pearce8 previously approved these changes Oct 10, 2025
@rfhaque rfhaque added ready for review Ready for review and removed changes requested Changes requested labels Oct 17, 2025
@pearce8
Copy link
Collaborator

pearce8 commented Oct 20, 2025

@michaelmckinsey1 do you know why lint is failing?

@michaelmckinsey1
Copy link
Collaborator

It says [+] Would fix .gitlab/tests/shared_flux_clusters.yml

@michaelmckinsey1
Copy link
Collaborator

michaelmckinsey1 commented Oct 21, 2025

@pearce8 That lint failure requires yamlfix .gitlab/tests/shared_flux_clusters.yml. I have fixed it. I think #1116 will address when to run certain linter commands in the future

@codecov-commenter
Copy link

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 65.44%. Comparing base (d5e099c) to head (a36561a).

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1033      +/-   ##
===========================================
+ Coverage    65.31%   65.44%   +0.12%     
===========================================
  Files           44       44              
  Lines         3241     3241              
  Branches       256      256              
===========================================
+ Hits          2117     2121       +4     
+ Misses        1117     1113       -4     
  Partials         7        7              

see 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@pearce8 pearce8 merged commit 58e4c2b into develop Oct 22, 2025
58 checks passed
@pearce8 pearce8 deleted the experiment_sparta branch October 22, 2025 18:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

application ci CI, unit tests, GitHub actions experiment New or modified experiment ready for review Ready for review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants