tests: add cram wrapper parallel_cram.sh that runs cram tests in parallel #1667

Open
wants to merge 5 commits into
base: master

corneliusroemer (Member) commented Nov 9, 2024

Description of proposed changes

Cram tests are perfect for parallelization: each test is independent, and
we have 178 of them. The wrapper script allows running them in parallel.

In my experiments on an M1 Pro, total test time dropped
from 7m30s to 1m23s, a speedup of more than 5x. I got the best results
with `-j8` (rather than using all 10 cores).

The wrapper accepts many of cram's options, and one can still run the tests of
a single directory, for example `./parallel_cram.sh tests/functional/tree`.
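
As a rough illustration of the idea (not the actual `parallel_cram.sh`, which is a shell script), running each `.t` file as its own cram process from a small worker pool could be sketched in Python as follows. The names `run_one` and `run_in_parallel`, the `command` parameter, and the default of 8 workers are assumptions made for this example.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(command, test_file):
    """Run a single test file and return (file, exit code, combined output)."""
    result = subprocess.run(
        [*command, str(test_file)],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return test_file, result.returncode, result.stdout


def run_in_parallel(test_files, jobs=8, command=("cram",)):
    """Run every test file as its own process, at most `jobs` at a time.

    Returns the files whose process exited non-zero, in input order.
    """
    failed = []
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = pool.map(lambda f: run_one(command, f), test_files)
        for test_file, code, output in results:
            if code != 0:
                failed.append(test_file)
                # Only print output for failing tests to keep logs short.
                sys.stdout.write(output)
    return failed
```

Threads are enough here because each worker only waits on a child process. A caller would exit non-zero when the returned list is non-empty, so failures are still reported to CI.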

One caveat: it seems that iqtree creates files in the input file directory. Because multiple tests use the same input files, tests running in parallel might conflict, so the input files should be copied to a temporary directory before running. This is done in commit 687bd04.
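
The copy-before-run workaround can be sketched as below. This is a Python illustration under assumptions, not the change made in commit 687bd04: the helper name `isolated_copy` is hypothetical, and it simply gives each test run a private copy of a shared data directory so that any files a tool writes next to its input land in the copy.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def isolated_copy(data_dir):
    """Yield a private copy of `data_dir`; the copy is deleted on exit."""
    with tempfile.TemporaryDirectory() as tmp:
        copy = Path(tmp) / "data"
        # copytree creates `copy` itself and copies the whole tree into it.
        shutil.copytree(data_dir, copy)
        yield copy
```

Each test would then point the tool at files under the yielded path instead of the shared directory, leaving the originals untouched.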

We seem to be getting almost a 2x speedup in CI as well, as GitHub runners come with 2 cores by default. The bottleneck is now the RSV pathogen CI, which runs for around 8 minutes (previously the cram tests took 13 min; now they finish faster than RSV). Overall GitHub runner time is reduced from around 2h5min to 1h15min, a saving of around 40%.

I tested the script to ensure that it correctly reports test failures. I've also been using it in my regular work on various PRs and it's worked exactly as expected, saving me waiting time.

Checklist

  • Automated checks pass
  • Check if you need to add a changelog message
  • Check if you need to add tests
  • Check if you need to update docs

corneliusroemer marked this pull request as ready for review November 9, 2024 21:54

corneliusroemer (Member, Author) commented

TODO:

  • Handle the case where a file is passed rather than a directory. Currently this results in:
    ./parallel_cram.sh -j8 tests/functional/parse.t
    Error: Directory tests/functional/parse.t does not exist
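
One way the TODO above could be handled is to resolve the argument to a list of test files before dispatching, accepting either kind of path. The sketch below is in Python rather than the wrapper's shell, and the function name `collect_tests` and the error wording are assumptions for illustration.

```python
from pathlib import Path


def collect_tests(path):
    """Return the cram test files named by `path` (a .t file or a directory)."""
    path = Path(path)
    if path.is_file():
        # A single test file was passed: run just that one.
        return [path]
    if path.is_dir():
        # A directory was passed: run every .t file beneath it.
        return sorted(path.rglob("*.t"))
    raise FileNotFoundError(f"{path} is neither a file nor a directory")
```

With this shape, `tests/functional/parse.t` and `tests/functional/tree` would both be valid arguments.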
    

victorlin mentioned this pull request Nov 11, 2024
tsibley (Member) commented Nov 13, 2024

> One caveat: it seems that iqtree creates files in the input file directory. As multiple tests use the same input files, they might conflict. So we should copy the input files to a temporary directory before running. This is done in commit 687bd04

Our tests are revealing a real user interface issue here. Instead of working around it in tests, perhaps we can fix this input directory pollution for good by relocating the temporary alignment file we're already using

tmp_aln_file = str(Path(aln_file).with_name(Path(aln_file).stem + "-delim.fasta"))

into a temporary directory and thus also avoid having to do this junk

augur/augur/tree.py, lines 304 to 310 in 3f72c40:

    if clean_up:
        if os.path.isfile(tmp_aln_file):
            os.remove(tmp_aln_file)
        for ext in [".bionj",".ckp.gz",".iqtree",".mldist",".model.gz",".treefile",".uniqueseq.phy",".model"]:
            if os.path.isfile(tmp_aln_file + ext):
                os.remove(tmp_aln_file + ext)
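
A minimal sketch of that suggestion, assuming nothing about the rest of `tree.py`: the helper name `delimited_alignment_path` and the example input path are hypothetical. It only shows how the `-delim.fasta` file name could be placed inside a `tempfile.TemporaryDirectory`, so that the file and anything written beside it (as the clean-up loop above implies iqtree does) are removed together.

```python
import tempfile
from pathlib import Path


def delimited_alignment_path(aln_file, tmp_dir):
    """Return the path of the temporary '-delim.fasta' file inside `tmp_dir`."""
    return str(Path(tmp_dir) / (Path(aln_file).stem + "-delim.fasta"))


with tempfile.TemporaryDirectory() as tmp_dir:
    tmp_aln_file = delimited_alignment_path("data/aligned.fasta", tmp_dir)
    # Stand-in for a tree builder writing a side file next to its input;
    # the file lands in tmp_dir, not in the user's input directory.
    Path(tmp_aln_file + ".iqtree").write_text("placeholder output")

# Leaving the `with` block deletes tmp_dir and everything in it,
# so no per-extension clean-up loop is needed.
```

The trade-off is that any output the caller wants to keep, such as the tree file, would have to be copied out before the directory is removed.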
