
Conversation

@mikemhenry (Contributor)

Checklist

  • Added a news entry

  • Developer Certificate of Origin

codecov bot commented Feb 28, 2025

Codecov Report

❌ Patch coverage is 10.00000% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.51%. Comparing base (192b582) to head (8a8b718).
⚠️ Report is 239 commits behind head on main.

Files with missing lines Patch % Lines
openfe/tests/protocols/conftest.py 11.11% 16 Missing ⚠️
...enfe/tests/protocols/openmm_ahfe/test_ahfe_slow.py 0.00% 1 Missing ⚠️
...tests/protocols/openmm_rfe/test_hybrid_top_slow.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1170      +/-   ##
==========================================
- Coverage   94.66%   92.51%   -2.16%     
==========================================
  Files         143      143              
  Lines       10994    11012      +18     
==========================================
- Hits        10408    10188     -220     
- Misses        586      824     +238     
Flag Coverage Δ
fast-tests 92.51% <10.00%> (?)
slow-tests ?

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@mikemhenry (Contributor Author) commented Feb 28, 2025

"NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running."

Good to know! Re-running now

@mikemhenry (Contributor Author)

The "large" runner worked but timed out after 12 hours (which we can raise to up to 1 week) -- I will try non-integration tests since AFAIK that is what @IAlibay is trying to run -- just the slow tests.

@IAlibay (Member) commented Mar 5, 2025

The "large" runner worked but timed out after 12 hours (which we can raise to up to 1 week) -- I will try non-integration tests since AFAIK that is what @IAlibay is trying to run -- just the slow tests.

Yeah, running the "integration" tests is probably overkill without a GPU.

@mikemhenry (Contributor Author)

large:

============================= slowest 10 durations =============================
2655.53s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex_alchemwater_totcharge[benzoic_to_benzene_mapping-0-1-False-11-1-3]
2496.48s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex_alchemwater_totcharge[benzoic_to_benzene_mapping-0-0-True-14-1-3]
2480.21s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex_alchemwater_totcharge[benzene_to_aniline_mapping-0-1-False-11-4-1]
2453.59s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex_alchemwater_totcharge[benzene_to_benzoic_mapping-0--1-False-11-3-1]
2337.46s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex_alchemwater_totcharge[benzene_to_benzoic_mapping-0-0-True-14-3-1]
2298.25s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex_alchemwater_totcharge[aniline_to_benzene_mapping-0-0-True-14-1-4]
2239.40s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex[sams]
2214.30s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex_alchemwater_totcharge[aniline_to_benzene_mapping-0--1-False-11-1-4]
2173.35s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex[repex]
2111.31s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex[independent]
=========================== short test summary info ============================
FAILED openfe/tests/utils/test_system_probe.py::test_probe_system_smoke_test - subprocess.CalledProcessError: Command '['nvidia-smi', '--query-gpu=gpu_uuid,gpu_name,compute_mode,pstate,temperature.gpu,utilization.memory,memory.total,driver_version,', '--format=csv']' returned non-zero exit status 9.
FAILED openfe/tests/protocols/test_openmm_rfe_slow.py::test_openmm_run_engine[CUDA] - openmm.OpenMMException: Error initializing CUDA: CUDA_ERROR_NO_DEVICE (100) at /home/conda/feedstock_root/build_artifacts/openmm_1726255919104/work/platforms/cuda/src/CudaContext.cpp:91
= 2 failed, 912 passed, 31 skipped, 2 xfailed, 3 xpassed, 1913 warnings, 3 rerun in 24749.25s (6:52:29) =

@mikemhenry (Contributor Author)

xlarge:

============================= slowest 10 durations =============================
2509.67s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex[repex]
2237.81s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex[sams]
2151.15s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex[independent]
1884.45s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex_alchemwater_totcharge[aniline_to_benzene_mapping-0-0-True-14-1-4]
1808.82s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex_alchemwater_totcharge[benzene_to_aniline_mapping-0-1-False-11-4-1]
1451.05s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex_alchemwater_totcharge[benzene_to_benzoic_mapping-0-0-True-14-3-1]
1449.02s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex_alchemwater_totcharge[aniline_to_benzene_mapping-0--1-False-11-1-4]
1399.31s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_many_molecules_solvent
1388.60s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex_alchemwater_totcharge[benzoic_to_benzene_mapping-0-0-True-14-1-3]
1313.94s call     openfe/tests/protocols/test_openmm_equil_rfe_protocols.py::test_dry_run_complex_alchemwater_totcharge[benzene_to_benzoic_mapping-0--1-False-11-3-1]
=========================== short test summary info ============================
FAILED openfe/tests/utils/test_system_probe.py::test_probe_system_smoke_test - subprocess.CalledProcessError: Command '['nvidia-smi', '--query-gpu=gpu_uuid,gpu_name,compute_mode,pstate,temperature.gpu,utilization.memory,memory.total,driver_version,', '--format=csv']' returned non-zero exit status 9.
FAILED openfe/tests/protocols/test_openmm_rfe_slow.py::test_openmm_run_engine[CUDA] - openmm.OpenMMException: Error initializing CUDA: CUDA_ERROR_NO_DEVICE (100) at /home/conda/feedstock_root/build_artifacts/openmm_1726255919104/work/platforms/cuda/src/CudaContext.cpp:91
= 2 failed, 912 passed, 31 skipped, 2 xfailed, 3 xpassed, 1978 warnings, 3 rerun in 11132.77s (3:05:32) =

@mikemhenry (Contributor Author)

Better than a 2x improvement.
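
(That figure follows directly from the two wall-clock totals reported above; a quick check, using nothing beyond the reported numbers:)

```python
# Speedup from the wall-clock totals reported above
# ("large": 24749.25 s, "xlarge": 11132.77 s).
large_s = 24749.25   # 6:52:29
xlarge_s = 11132.77  # 3:05:32
print(f"speedup: {large_s / xlarge_s:.2f}x")  # ~2.22x
```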

@mikemhenry (Contributor Author)

Last check: going to see if the Intel flavor is any faster.

@IAlibay (Member) commented Mar 13, 2025

@mikemhenry what flags are you using for these CPU runners? --runslow or --integration too? 3h seems way too long for just the slow tests.

@mikemhenry (Contributor Author)

integration as well -- I wanted to get some benchmarking data on the integration tests without a GPU

@mikemhenry (Contributor Author) commented Mar 13, 2025

I actually turned off integration tests back in 98cec71

@mikemhenry (Contributor Author)

But you're right, that is kinda slow for just the slow tests.

@mikemhenry (Contributor Author)

Now the runners are running out of disk space when installing the env; I need to check whether new deps are making the env bigger or something else is going on. I can also increase the EBS volume size.

@mikemhenry (Contributor Author)

Sweet, getting:
FAILED openfe/tests/protocols/test_openmm_rfe_slow.py::test_openmm_run_engine[CUDA] - openmm.OpenMMException: Error initializing CUDA: CUDA_ERROR_NO_DEVICE (100) at /home/conda/feedstock_root/build_artifacts/openmm_1726255919104/work/platforms/cuda/src/CudaContext.cpp:91
But we expect that to fail. I am not sure why we are running this test, since we only have OFE_SLOW_TESTS: "true" with no integration tests turned on, and the test carries the @pytest.mark.integration mark.
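
(For context, env-var gating like this is usually wired up in conftest.py by skipping marked tests when the variable isn't set; a minimal sketch of that pattern, with the skip logic assumed rather than quoted from openfe's actual conftest:)

```python
# conftest.py -- sketch of env-var gating for marked tests. The variable names
# match the workflow (OFE_SLOW_TESTS / OFE_INTEGRATION_TESTS); the skip logic
# is an illustrative assumption, not openfe's real implementation.
import os

import pytest


def _enabled(var: str) -> bool:
    return os.environ.get(var, "false").lower() == "true"


def pytest_collection_modifyitems(config, items):
    skip_slow = pytest.mark.skip(reason="needs OFE_SLOW_TESTS=true")
    skip_integration = pytest.mark.skip(reason="needs OFE_INTEGRATION_TESTS=true")
    for item in items:
        if "slow" in item.keywords and not _enabled("OFE_SLOW_TESTS"):
            item.add_marker(skip_slow)
        if "integration" in item.keywords and not _enabled("OFE_INTEGRATION_TESTS"):
            item.add_marker(skip_integration)
```

If a test carrying @pytest.mark.integration still runs with only OFE_SLOW_TESTS set, a deselection step like the one above (or whatever openfe actually uses) would be the place to look.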

@mikemhenry (Contributor Author)

Timing info, btw:
= 5 failed, 936 passed, 28 skipped, 2 xfailed, 3 xpassed, 2010 warnings, 3 rerun in 11167.03s (3:06:07) =

@mikemhenry (Contributor Author)

@IAlibay how do you invoke the tests?
$ CUDA_VISIBLE_DEVICES="" pytest -n 2 -vv --durations=10 --runslow openfecli/tests/ openfe/tests/
This is taking more than minutes on my laptop.
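
(For anyone reproducing this: CUDA_VISIBLE_DEVICES="" hides the GPU from OpenMM, and -n 2 runs two pytest-xdist workers, so timings scale with core count. The --runslow flag is typically a custom pytest option; a minimal sketch of that common pattern, assumed rather than copied from openfe's conftest:)

```python
# conftest.py -- the usual pattern behind a --runslow flag (illustrative sketch).
import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--runslow", action="store_true", default=False,
        help="also run tests marked @pytest.mark.slow",
    )


def pytest_collection_modifyitems(config, items):
    if config.getoption("--runslow"):
        return  # slow tests explicitly requested; run everything collected
    skip_slow = pytest.mark.skip(reason="use --runslow to run this test")
    for item in items:
        if "slow" in item.keywords:
            item.add_marker(skip_slow)
```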

@IAlibay (Member) commented May 30, 2025

@IAlibay how do you invoke the tests?
$ CUDA_VISIBLE_DEVICES="" pytest -n 2 -vv --durations=10 --runslow openfecli/tests/ openfe/tests/
This is taking more than minutes on my laptop.

Testing right now with CUDA_VISIBLE_DEVICES set.

@IAlibay (Member) commented May 30, 2025

@mikemhenry it runs in 35 mins for me

@mikemhenry (Contributor Author)

While we wait for that, looking at the slowest runs:

 ============================= slowest 10 durations =============================
2383.65s call     openfe/tests/protocols/openmm_rfe/test_hybrid_top_protocol.py::test_dry_run_complex_alchemwater_totcharge[benzene_to_aniline_mapping-0-1-False-11-4-1]
1811.27s call     openfe/tests/protocols/openmm_rfe/test_hybrid_top_protocol.py::test_dry_run_complex_alchemwater_totcharge[aniline_to_benzene_mapping-0--1-False-11-1-4]
1800.93s call     openfe/tests/protocols/openmm_rfe/test_hybrid_top_protocol.py::test_dry_run_complex_alchemwater_totcharge[benzene_to_benzoic_mapping-0-0-True-14-3-1]
1755.66s call     openfe/tests/protocols/openmm_rfe/test_hybrid_top_protocol.py::test_dry_run_complex_alchemwater_totcharge[benzoic_to_benzene_mapping-0-0-True-14-1-3]
1676.54s call     openfe/tests/protocols/openmm_rfe/test_hybrid_top_protocol.py::test_dry_run_complex_alchemwater_totcharge[benzene_to_benzoic_mapping-0--1-False-11-3-1]
1659.19s call     openfe/tests/protocols/openmm_rfe/test_hybrid_top_protocol.py::test_dry_run_complex_alchemwater_totcharge[aniline_to_benzene_mapping-0-0-True-14-1-4]
1619.34s call     openfe/tests/protocols/openmm_rfe/test_hybrid_top_protocol.py::test_dry_run_complex_alchemwater_totcharge[benzoic_to_benzene_mapping-0-1-False-11-1-3]
1465.10s call     openfe/tests/protocols/openmm_rfe/test_hybrid_top_protocol.py::test_dry_many_molecules_solvent
1314.54s call     openfe/tests/protocols/openmm_rfe/test_hybrid_top_protocol.py::test_dry_run_complex[repex]
1313.68s call     openfe/tests/protocols/openmm_rfe/test_hybrid_top_protocol.py::test_dry_run_complex[independent]

Should we move some of these tests to integration tests? I was thinking we could print all the test durations and see what it looks like to figure out which ones we should move. We could also mark them as needing a GPU or something. This reminds me of #1133
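
(One hypothetical way to do the "needing a GPU" marking -- the requires_gpu name and the example test are made up here for illustration, and the nvidia-smi check simply mirrors the failure mode in the logs above:)

```python
# Hypothetical GPU gate for the slowest protocol tests. "requires_gpu" is an
# illustrative name, not an existing openfe marker; the check mirrors the
# nvidia-smi failure seen on the CPU-only runners above.
import shutil
import subprocess

import pytest


def _gpu_available() -> bool:
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    return subprocess.run([exe], capture_output=True).returncode == 0


requires_gpu = pytest.mark.skipif(not _gpu_available(), reason="no usable GPU found")


@requires_gpu
def test_dry_run_complex_on_gpu():
    ...
```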

@IAlibay (Member) commented May 30, 2025

While we wait for that, looking at the slowest runs: [...]

Should we move some of these tests to integration tests? [...] We could also mark them as needing a GPU or something. This reminds me of #1133

@mikemhenry a lot of these are significantly faster with #1131 - we were meant to use the output of this PR to test out that PR.

@IAlibay (Member) commented May 30, 2025

While we wait for that, looking at the slowest runs:

@mikemhenry see above - 35 mins locally.

@mikemhenry (Contributor Author)

Okay, so it sounds like we should merge this one in. @IAlibay @atravitz, can I get a review? And which AWS instance do we want to use?

@mikemhenry (Contributor Author)

While we wait for that, looking at the slowest runs:

@mikemhenry see above - 35 mins locally.

Was this with -n 2? We can profile more to figure it out, but maybe we merge it in, then use it to test #1131 and see what it looks like?

@IAlibay (Member) left a review comment


I'm going to block because it's Friday and I'm rather unsure as to what should be happening here.

@mikemhenry I think I just don't understand what we're trying to achieve here. The plan was for this to be a "fast" way to run slow tests. As it stands, this AWS CPU runner is slower than running the package-install tests (which also run the slow tests?) by nearly 3x. Unless I'm missing something, this seems kinda not worth it?

@github-actions

No API break detected ✅

@mikemhenry (Contributor Author)

I was using the wrong instance family for this. The T-series have "burst" CPUs but are not meant for sustained use. If we use a c7i.xlarge, it finishes in 1.32 hours (about twice as fast as a GitHub runner) and costs $0.1785 an hour, for a total cost of about $0.24.
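
(The total follows directly from the quoted runtime and on-demand rate:)

```python
# Cost check for the c7i.xlarge run quoted above.
runtime_h = 1.32          # reported wall-clock time in hours
rate_usd_per_h = 0.1785   # reported on-demand price
print(f"total: ${runtime_h * rate_usd_per_h:.2f}")  # -> total: $0.24
```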

@mikemhenry mentioned this pull request on Jun 5, 2025
@mikemhenry (Contributor Author)

@IAlibay are we happy with this now?

@mikemhenry (Contributor Author)

Offline we discussed that we will merge this in and then test Irfan's PR that speeds up CI, then figure out whether we want to optimize this further with a large instance.

@github-actions

No API break detected ✅


  OFE_SLOW_TESTS: "true"
  DUECREDIT_ENABLE: 'yes'
- OFE_INTEGRATION_TESTS: FALSE
+ OFE_INTEGRATION_TESTS: TRUE
@mikemhenry (Contributor Author) commented on the diff above:

Oh I think this should be false until we figure out the GPU stuff?

@github-actions

No API break detected ✅

@mikemhenry (Contributor Author)

https://github.com/OpenFreeEnergy/openfe/actions/runs/15592958046

testing it here before we merge it in

@mikemhenry merged commit 81cfa46 into main on Jun 12, 2025 (9 of 11 checks passed)
@github-actions

No API break detected ✅

@mikemhenry deleted the feat/test_larger_cpu_runner branch on June 12, 2025 at 17:40
