v0.23.5
What's New
1. Variable length dataloaders (#3416)
Adds support for dataloaders with rank-dependent lengths. Iteration stops on all ranks as soon as the dataloader on any one rank is exhausted, so no rank keeps stepping while another has run out of data.
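The stopping rule can be illustrated with a small single-process simulation. This is a sketch of the idea only, not Composer's implementation: the names lockstep_batches and all_ranks_have_data are hypothetical, and the all() call stands in for what would be a cross-rank reduction in a real distributed job.

```python
from typing import Iterator, List, Sequence, Tuple


def all_ranks_have_data(flags: Sequence[bool]) -> bool:
    """Stand-in for a cross-rank reduction over a per-rank 'has data' flag.

    In a real distributed job this would be a collective (for example a MIN
    all-reduce); here the simulated ranks live in one process.
    """
    return all(flags)


def lockstep_batches(per_rank_data: Sequence[Sequence[int]]) -> Iterator[Tuple[int, ...]]:
    """Yield one batch per simulated rank per step, stopping on every rank
    as soon as any single rank runs out of data."""
    iterators = [iter(data) for data in per_rank_data]
    while True:
        batches: List[int] = []
        flags: List[bool] = []
        for it in iterators:
            try:
                batches.append(next(it))
                flags.append(True)
            except StopIteration:
                flags.append(False)
        # If any rank is exhausted, every rank stops at this step.
        if not all_ranks_have_data(flags):
            return
        yield tuple(batches)


# Three simulated ranks with 4, 2, and 3 batches respectively.
steps = list(lockstep_batches([[0, 1, 2, 3], [10, 11], [20, 21, 22]]))
print(steps)
```

With these inputs, only two steps run, because the second simulated rank has only two batches; the remaining batches on the other ranks are skipped.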
Bug Fixes
1. Remove close flush for mosaicml logger (#3446)
Previously, the MosaicML Logger sporadically raised an error while the Python interpreter was shutting down, because it attempted to flush data on Event.CLOSE using futures, which cannot be scheduled at that time. Instead, we now only block on finishing existing data uploads on Event.CLOSE, avoiding scheduling new futures.
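The pattern behind this fix can be sketched with Python's standard concurrent.futures module. This is a minimal illustration under stated assumptions, not the MosaicML Logger's actual code: the BufferedUploader class and its methods are hypothetical, and the upload is simulated with a short sleep.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor, wait
from typing import Dict, List


class BufferedUploader:
    """Hypothetical logger-like class that uploads buffered data in the background."""

    def __init__(self) -> None:
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._futures: List[Future] = []
        self.buffer: Dict[str, float] = {}
        self.uploaded: List[Dict[str, float]] = []

    def _upload(self, payload: Dict[str, float]) -> None:
        time.sleep(0.01)  # simulated network call
        self.uploaded.append(payload)

    def log(self, key: str, value: float) -> None:
        self.buffer[key] = value

    def flush(self) -> None:
        """Schedule a new upload future for whatever is currently buffered."""
        if self.buffer:
            payload, self.buffer = self.buffer, {}
            self._futures.append(self._executor.submit(self._upload, payload))

    def close(self) -> None:
        """Block on uploads that were already scheduled.

        Deliberately does NOT call flush(): submitting new futures during
        interpreter shutdown is what can fail.
        """
        wait(self._futures)
        self._executor.shutdown(wait=True)


uploader = BufferedUploader()
uploader.log("loss", 0.5)
uploader.flush()           # scheduled while the interpreter is healthy
uploader.log("loss", 0.4)  # still buffered, never flushed
uploader.close()           # waits for the first upload only
print(uploader.uploaded)
```

In this sketch, data still sitting in the buffer at close time is not uploaded. That is a property of the simplified example; how the real logger treats unflushed data is not stated in these notes.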
What's Changed
- Update numpy requirement from <1.27.0,>=1.21.5 to >=1.21.5,<2.1.0 by @dependabot in #3406
- Restore dev version by @karan6181 in #3417
- Save checkpoint to disk for API with new save layout by @eracah in #3399
- Patch PyTorch 2.3.1 by @mvpatel2000 in #3419
- Fixes some typing issues by @dakinggg in #3418
- Fix style by @b-chu in #3420
- Bump coverage[toml] from 7.5.3 to 7.5.4 by @dependabot in #3422
- Update psutil requirement from <6,>=5.8.0 to >=5.8.0,<7 by @dependabot in #3424
- Add support for variable length dataloaders in DDP by @JAEarly in #3416
- Hsdp + MoE CI tests by @KuuCi in #3378
- Bumping MLflow version to 2.14.1 by @JackZ-db in #3425
- Skip HSDP + TP pytests that require torch 2.3 or above by @KuuCi in #3426
- Remove CodeQL workflow by @mvpatel2000 in #3429
- Remove save overwrite by @mvpatel2000 in #3431
- Fixes to TP Docs by @snarayan21 in #3430
- Lower the system metrics logging frequency to reduce MLflow server's load by @chenmoneygithub in #3436
- Update paramiko requirement from <3,>=2.11.0 to >=3.4.0,<4 by @dependabot in #3439
- Bump CI testing version by @mvpatel2000 in #3433
- Fix docstring for EVAL_AFTER_ALL/EVAL_BEFORE_ALL by @mvpatel2000 in #3445
- Remove close flush for mosaicml logger by @mvpatel2000 in #3446
- Remove MosaicMLLambdaEvalClient by @aspfohl in #3432
- Relax hf hub pin by @dakinggg in #3435
- Pytest skip 2 by @KuuCi in #3448
- bump version v0.23.5 by @XiaohanZhangCMU in #3450
Full Changelog: v0.23.4...v0.23.5