Batching experiments together #858

scheibelp · 2025-06-30T20:22:35Z

Closes #485

Say you've created several experiments in several Benchpark/Ramble workspaces:

benchpark system init --dest elcap llnl-elcapitan cluster=elcapitan
benchpark experiment init --dest amg2023 amg2023+rocm
benchpark experiment init --dest kripke kripke+rocm
...
bin/benchpark setup amg2023/ elcap/ workspace1/
bin/benchpark setup kripke/ elcap/ workspace2/
...
ramble --workspace-dir `pwd`/workspace1/amg2023/elcap/workspace workspace setup

(i.e. you've run ramble workspace setup several times).

This adds a script called aggregate.py, which you run like:

$ python aggregate.py groups workspace1/ workspace2/
# workspace1 and workspace2 are benchpark workspaces
# "groups" is the path to a new directory that this command will create

This finds all execute_experiment scripts in each workspace and partitions them by batch request: all execute_experiment scripts with the same batch allocation are placed together in a single script inside the specified groups directory, like:

$ head groups/*
==> groups/0.sh <==
# flux: -N 2
.../workspace/amg2023/elcap/workspace1/experiments/amg2023/problem1/amg2023_problem1_single_node_rocm_caliper_none_2_2_2_80_80_80_8/execute_experiment

==> groups/1.sh <==
# flux: -N 1
.../workspace/kripke/elcap/workspace2/experiments/kripke/kripke/kripke_kripke_single_node_rocm_caliper_none_64_1_128_128_4_2_2_1_64_64_32_4/execute_experiment
/a/path/to/some/other/execute_experiment

==> groups/2.sh <==
# flux: -N 8
.../workspace/lammps/elcap/workspace1/experiments/lammps/hns-reaxff/lammps_hns-reaxff_single_node_rocm_20_40_32_8_8_64/execute_experiment

For flux, this generates one grouped script for each unique batch request (so e.g. all 1-node requests will go into one script).
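The grouping step described above can be sketched roughly as follows. This is a minimal illustration, not the contents of aggregate.py: the function names (`find_experiment_scripts`, `batch_request`, `group_by_request`, `write_groups`, `main`) are hypothetical, and the sketch assumes that each execute_experiment script carries its allocation request in comment lines beginning with `# flux:`. Other schedulers are not handled.

```python
import os
from collections import defaultdict


def find_experiment_scripts(workspace):
    """Yield the path of every file named execute_experiment under workspace."""
    for dirpath, _dirnames, filenames in os.walk(workspace):
        if "execute_experiment" in filenames:
            yield os.path.join(dirpath, "execute_experiment")


def batch_request(script_path):
    """Return the script's '# flux:' directive lines as a hashable tuple.

    Assumption: the allocation request appears as comment lines that start
    with '# flux:'. Scripts with no such lines all map to the empty tuple.
    """
    directives = []
    with open(script_path) as f:
        for line in f:
            stripped = line.strip()
            if stripped.startswith("# flux:"):
                directives.append(stripped)
    return tuple(directives)


def group_by_request(workspaces):
    """Map each distinct batch request to the list of scripts that share it."""
    groups = defaultdict(list)
    for workspace in workspaces:
        for script in find_experiment_scripts(workspace):
            groups[batch_request(script)].append(os.path.abspath(script))
    return groups


def write_groups(groups, dest):
    """Write one numbered .sh file per group: directives first, then scripts."""
    os.makedirs(dest)  # raises if dest already exists; a new directory is expected
    for i, (request, scripts) in enumerate(groups.items()):
        with open(os.path.join(dest, f"{i}.sh"), "w") as f:
            f.write("\n".join(list(request) + scripts) + "\n")


def main(argv):
    """argv is [dest, workspace, ...], mirroring the command shown above."""
    dest, *workspaces = argv
    write_groups(group_by_request(workspaces), dest)
```

Under these assumptions, calling `main(["groups", "workspace1/", "workspace2/"])` would produce files shaped like the `head groups/*` output above. Keying the groups on the full tuple of directive lines means two scripts only share a group when their requests match exactly.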

michaelmckinsey1 · 2025-07-25T18:11:31Z

@scheibelp Could you do python aggregate.py groups workspace1/ workspace1/ to run the same experiment in the same allocation like 2 trials?

scheibelp · 2025-07-25T18:39:27Z

> Could you do python aggregate.py groups workspace1/ workspace1/ to run the same experiment in the same allocation like 2 trials?

It will do that (although that wasn't my original intent). However, if you rerun the exact same experiment, I think it keeps writing its output (e.g. data including the FOM) to the same file, so results from all but the last run would be lost. The experiment template could potentially be rewritten to distinguish output from successive runs.

If workspace1 and workspace2 contain the same experiment, then aggregate.py would run both instances and the results would be distinct (e.g. if you had run benchpark setup experimenty systemx workspace1; benchpark setup experimenty systemx workspace2).
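One way to surface the overwrite case is to check for scripts that appear more than once before the grouped scripts are written. The following is a minimal sketch only: `repeated_scripts` is a hypothetical helper, not something aggregate.py provides, and it assumes that two entries collide exactly when they resolve to the same file on disk.

```python
import os
from collections import Counter


def repeated_scripts(script_paths):
    """Return {resolved path: count} for scripts listed more than once.

    Paths are resolved with os.path.realpath so that different spellings of
    the same file (e.g. 'ws/x' and './ws/x') are counted together.
    """
    counts = Counter(os.path.realpath(p) for p in script_paths)
    return {path: n for path, n in counts.items() if n > 1}
```

Passing workspace1/ twice would report each of its execute_experiment scripts with a count of 2, whereas the same experiment set up in workspace1 and workspace2 would report nothing, since those are separate files.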

pearce8 · 2025-07-28T21:34:21Z

@scheibelp we also need this described in the docs.

michaelmckinsey1 · 2025-11-11T03:27:15Z

superseded by #1036

scheibelp added 3 commits June 27, 2025 16:25

partial script for aggregation

22cc9df

bugfix and look for just experiment scripts

095d64b

completed script for flux: generates 1 grouped script for each unique batch request (so e.g. all 1-node requests will go into one script)

20f99cc

scheibelp marked this pull request as draft June 30, 2025 20:22

scheibelp added 2 commits June 30, 2025 13:28

rm debugging code

e5bd59b

add comment about handling schedulers besides flux

bbda84e

scheibelp requested a review from pearce8 June 30, 2025 20:37

pearce8 added the changes requested label Jul 28, 2025

michaelmckinsey1 mentioned this pull request Sep 5, 2025

Expose n_repeats as experiment variant #1035

Merged

3 tasks

Merge branch 'develop' into aggregate-batch

d5e05c5

michaelmckinsey1 mentioned this pull request Sep 5, 2025

Batching experiments together #1036

Open

7 tasks

michaelmckinsey1 closed this Nov 11, 2025
