0.18.0
Changelog
- 3c00fc2 chore: bump version: 0.18.0-rc3 -> 0.18.0
- 797ceec docs: add release notes for 0.18.0 (#4102)
- 5051024 chore: bump version: 0.18.0-rc2 -> 0.18.0-rc3
- bd87faa perf: improve plan for proto_get_trial_plus.sql (#4073)
- c2a3ba9 fix: Support rendering rank of 0 (#4083)
- d46995f fix: consistent total slot calculation for cluster overview [DET-7182] (#4080)
- 1b94612 feat: add det.LOG_FORMAT constant (#4090)
- aacd23b feat: add wrap_rank helper script (#4086)
- 8fbf0a7 fix: dont show archived in column picker [DET-7187] (#4085)
- a190a1f chore: bump version: 0.18.0-rc1 -> 0.18.0-rc2
- 88fc38b chore: bump version: 0.18.0-rc0 -> 0.18.0-rc1
- 4d73cea feat: authenticate task proxies (#4071)
- 3e594e4 chore: filter out NaN, +/- Infinity metric values for charts for now. (#4076)
- 1038840 fix: wrap torch.distributed launch in pid server/client (#4077)
- 7df0f00 docs: release note for core api (#4069)
- 7721553 fix: add user column back to experiment list (#4070)
- db5f434 chore: bump version: 0.17.16-dev0 -> 0.18.0-rc0
- 02114cf chore: lock api state for backward compatibility check
- 7be795a docs: Core API reference and cookbook docs (#4054)
- 5e364d1 chore: ancient checkpoints for very old pytorch (#4068)
- 51e80d8 perf: minimize create/destroy of uPlots [DET-6972] [DET-6972] [DET-6796] [DET-6853] [DET-6672] (#3935)
- 531af28 chore: update removed reducer methods (#4067)
- 6f63f13 chore: remove det.pytorch.reset_parameters() (#4066)
- 4fb1886 feat: generic checkpoints and making Core API public (#3859)
- db0f8ef fix: correct the return type for readStream (#4063)
- 33a1a6c chore: remove PBT searcher (#4058)
- a35d49a feat: support for torch native dtrain (#3807)
- 29385bc chore: update k8s scheduler to run latest image (#4061)
- e7f3289 chore: remove remainder of native api (#4055)
- 87008ac chore: deprecate data layer (#4056)
- 75a6bf5 chore: remove deprecated experimental custom reducer methods (#4060)
- 948518c chore: remove unnecessary use of username in webui (DET-6922) (#4049)
- d2cc8d4 refactor: simplify apiConfig to reduce redundancy (#4043)
- 707dd26 chore: Refactor CheckpointModal to be hook based [DET-7136] (#4034)
- ea535b0 fix: make "Show full config" modal larger (#4053)
- 625070e chore: move webui codecov upload to use env var instead of hardcoded token (#4045)
- 0c1821e fix: remove dead shell start code [DET-7131] (#4016)
- 59e5b9b chore: Recommend git clone --recurse-submodules for submodules (#4036)
- ad06cd4 fix: move allocation resources migration to the top.
- a6bf58b refactor: rewrite ndjson streamer [DET-7121] (#4014)
- 6d018f0 fix: Alter Boolean arg default handling (#4038)
- b278ef0 chore: store allocation resources and agent RM containers in DB. (#3946)
- 7afc75f ci: Add build/coverage badge for webui/react [DET-6767] (#4028)
- 628f07c move up profile-pics migration to assure it happens
- 600d77d fix: put profile pic migrations in correct directory
- 3569a6f build: set up shared-web submodule [DET-6961] (#4006)
- 0686bef test: ensure that agent disabling doesn't count for experiment restarts [DET-5916] (#4029)
- b61a83d refactor: remove NewAllocationID. (#3959)
- b605da8 fix: add another missing key fix (#4033)
- 64e6f7d feat: allow position modification in k8s [DET-6967] [DET-6968] (#3938)
- 24bbec9 feat: Creating a table for user profile pictures
- b831dbd fix: Human-readable option for empty filters in logs [DET-6781, DET-6999] (#4017)
- 5d04a25 fix: Prevent archived models from appearing in the Register Checkpoint modal [DET-7132] (#4015)
- c1523da fix: Ensure table offset does not exceed pagination total [DET-6829] (#4011)
- 84401b2 fix: change timeout in e2e_tests/tests/cluster/test_logging.py/test_trial_logs (#4019)
- ad6f3d5 fix: add key for cancel operation (#4023)
- 9ba0f04 chore: remove deprecated type provider for moment-timezone (#4018)
- be3125b docs: update copyright year (#4005)
- ea4ab88 feat: add product feedback link [DET-5811]
- 6162418 chore: fix documentation comment for /ws/data-layer
- 482ecd6 docs: fix grammar in training-run index (#3988)
- 5480903 docs: fix indent in pytorch-porting-tutorial (#3989)
- 9a6d752 chore: warn about ambiguous enum params (#3997)
- 4de96b9 chore: give the latest-master deploy job a name (#3900)
- 64b4662 fix: pass task time not from logCtx (#3993)
- 59796b0 chore: update node version to active LTS (DET-7046) (#3932)
- 07de426 chore: show appropriate severity level on job launch failures (#4002)
- c68b7a4 test: add option to disable compare_stats. (#4008)
- 915184c chore: Removing CODEOWNERS entry for release notes (#3978)
- 93b98c2 fix: handle zeros correctly in HpTrialTable metricSorter (#4003)
- cd0e531 chore: bump version: 0.17.15-dev0 -> 0.17.16-dev0
- 7d1493d docs: add release notes for 0.17.15 (#3986)
- 63eb86c feat: add modal to explain why users cannot delete items [DET-6998] (#3994)
- f6fbc05 fix: guard trial and allocation exit logic correctness (#3983)
- 08218c1 fix: do not default to noverify for bindings sessions in CLI. (#3991)
- 4c5dad8 chore: add expconf environment.slurm (#3966)
- d2610c5 fix: handle infinite metrics in searcher snapshots [DET-7122] (#3999)
- 57d1b38 chore: handle rank id in log entries (#3995)
- f915a69 fix: handle infinite validation metric values in more cases (#3992)
- 8527e60 fix: Experiment columns filter is still applied after closing [DET-6837] (#3982)
- c7f4610 fix: avoid permanent filtered state in model registry [DET-6946] (#3984)
- 35ac5ff docs: update release note process (#3990)
- d565f84 ci: bump profiling test timeout back up (#3981)
- a43fcff chore: handle errors from starting allocations [DET-5862] (#3975)
- bd54845 feat: add det support bundle [DET-5886] (#3904)
- 899db19 feat: add drag and drop functionality to experiment list column table [DET-7044] (#3956)
- cf5c94f chore: document usage of /ws/data-layer [DET-6685] (#3971)
- 6f6e4a2 feat: Add overall allocation bar to new cluster page [DET-7074] (#3955)
- ff78aa9 chore: add error type for non retry-able resource manager errors (#3947)
- e9d26af chore: apply filtering to task logs (#3963)
- 2a981a1 fix: add test warmup command e2e [DET-5803] (#3965)
- a77b8c1 docs: fix tutorial link swap (#3979)
- 5d65f64 ci: remove old semantic PR app config. (#3980)
- 896725b fix: notebook logs filter by level (#3967)
- f3660c2 chore: remove old notebook logs endpoint (#3960)
- 8a99e03 fix: task log level parsing (#3973)
- 005d144 fix: forward job.DeleteJob through agentRM (#3968)
- 8e21bf0 docs: add release notes for PR #3914 (#3962)
- 28ab2fa chore: Update list of false alarms in Docker image scanning (#3933)
- 80a1dc7 ci: new semantic pull request check. (#3958)
- 189f148 docs: update package versions, add ROCm, edit to style guide (#3950)
- 0d180fe fix: update HPE logo sizes (#3953)
- 393de71 chore: add message to cleanup external RM resources on delete, for slurm (#3902)
Docker images
docker pull determinedai/determined-master:0.18.0
docker pull determinedai/determined-master:3c00fc281
docker pull determinedai/determined-master:3c00fc281542c272c1591c7d1c86eb53db8f230c
docker pull determinedai/determined-dev:determined-master-3c00fc281
docker pull determinedai/determined-dev:determined-master-3c00fc281542c272c1591c7d1c86eb53db8f230c
docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.18.0
docker pull nvcr.io/isv-ngc-partner/determined/determined-master:3c00fc281
docker pull nvcr.io/isv-ngc-partner/determined/determined-master:3c00fc281542c272c1591c7d1c86eb53db8f230c