0.26.4
Release Notes
Changelog
- bf665ae chore: bump version: 0.26.4-rc4 -> 0.26.4
- 2f86950 docs: add release notes for 0.26.4 (#8451)
- f2ef0fe chore: bump version: 0.26.4-rc3 -> 0.26.4-rc4
- f0a37a9 fix: Calculate allocation bar stats same as overview [WEB-1822] (#8431)
- 9acfbf2 chore: bump version: 0.26.4-rc2 -> 0.26.4-rc3
- 9dd0211 fix: k8s autoscaling nodes not counted towards RP (#8439)
- 3bf6647 chore: bump version: 0.26.4-rc1 -> 0.26.4-rc2
- 47397a4 fix: new experiment list tooltip styling (#8433)
- 680ac02 ci: fix linting with responses==0.24.1 (#8436)
- d4200d2 chore: add version dropdown url for previous release (#8437)
- 0a4b6bc test: fix model registry rbac wrong user regression (#8420)
- 242ff97 fix: Wrap older modals in theme class [WEB-1824] (#8432)
- e3c109a chore: bump version: 0.26.4-rc0 -> 0.26.4-rc1
- 22e18ae fix: replace antd select with hew select (#8424)
- 00d349a feat: add workspace/project creation/deletion (#8430)
- 9f727fe feat: client gets list_models, too. (#8425)
- b8c1be7 chore: update Column and Row from Hew (#8412)
- 4e6fd52 chore: bump version: 0.26.4-dev0 -> 0.26.4-rc0
- e9a457d chore: lock published urls to preserve redirects
- 2fae9ba chore: add docs dropdown link for new version
- 6c3bf84 chore: make insert-dropdown-url.sh executable (#8418)
- b5ca7f4 chore: fail deployment if launching part of the service fails (#8409)
- 8498674 fix: allow --json in det master config CLI command (#8413)
- d123932 fix: Place modal inside of ResourcePoolCard (#8414)
- ff19924 chore: Add eslint rule for ?? operator (#8410)
- d56b3ae chore: convert DOS line endings to Unix (#8411)
- c1219eb fix: Hide stats card when 0 on cluster page (#8359)
- da77efb fix: added permission check on GetAllocation (#8281)
- 3b0550c chore: Bumpenvs 0.26.4 (#8407)
- e48d03d fix: user flag to prompt for password during user requests (#8158)
- 513e6d7 fix: Project and Workspace cards wrap modal divs (#8378)
- 2497d84 chore: export AddUserTx (#8403)
- ad764f0 refactor: implement Glossary component from Hew (#8385)
- 52326d1 feat: change cli command for patch master log config DET[9720] (#8054)
- 1e9155d chore(type): stricter tsconfig (#8349)
- 16f18cc chore: revert task obfuscation lint failures (#8406)
- dde3156 chore: Implement Theming updates in Determined [WEB-1726] (#8388)
- 4edfc3c ci: move packaging test to test-e2e-longrunning (#8381)
- d3c208a ci: cache go modules deps and build cache (#8383)
- 8924996 chore: temporarily disable CI upload job (#8399)
- 356f651 Revert "chore: temporarily disable upload_test_results job step"
- 6dd9701 chore: temporarily disable upload_test_results job step
- ba49dbd ci: up parallelism for slowest test_e2e premerge tests (#8374)
- 5f3e556 ci: finish removing growforest (#8389)
- 62084e2 fix: NTSC task and slot viewing obscured for RBAC users with no Viewer Permissions (#8311)
- 0254f7d chore: fix nil ptr on allocation.Proto() (#8372)
- 119e759 chore: fix profiler test in CI (#8382)
- b428d5e feat: add hide column header menu item to explist (#8342)
- 7ae0501 chore: update the lore service port (#8375)
- 052cf8d feat: Cluster historical usage charts move to UI Kit LineChart [WEB-1786] [WEB-1764] (#8327)
- 819948d feat: clear filter from experiment table header (#8376)
- a590999 test: fix slow delete_checkpoint test (#8377)
- b0505db chore: Job/task displays Running instead of Scheduled (#8335)
- 1d64941 chore: short dsat e2e tests (#8288)
- 6afa836 chore: fix CI mnist_pytorch (#8364)
- 4d3eaab chore: Update Horovod Cycle Time (#8362)
- d3b01cb docs: Add det pach tutorial (#8082)
- 7cebc30 fix: adjust card size on workspaces page (#8370)
- 5c93cb0 chore: enable more Go linters (#8333)
- a279967 fix: aws deployment can deploy priority scheduler (#8345)
- 3d9293c fix: fixed bug in error handling in experiment.go (#8339)
- 194bfd5 fix: Cell can be undefined in experiment list table (#8360)
- 1da92aa chore: bump environment images to ubuntu 18.04 [MLG-1194] (#8356)
- 990c56f chore: add list_experiments to experimental.client (#8361)
- 3a7d9ea fix(tests): lower e2e_gpu_quarantine parallelism (#8363)
- 4c48458 fix: patched remote users were able to login with password (#8337)
- baf5c96 chore: port over PyTorch example to use Trainer API [MLG-1181] (#8292)
- 235bd8f feat: delete TB files from the SDK (#8329)
- 2fe3d99 chore: update Typography from UI kit (#8323)
- 2b23674 fix: prevent carriage return in env from crashing deepspeed launcher (#8321)
- 461c307 chore: Remove DesignKit since it's now maintained in Hew [WEB-1790] (#8338)
- 5ee87ec fix: Set group name and number columns to handle Safari [DET-9948] [DET-9949] (#8355)
- 10deef9 fix(experiments): transient errors shouldn't leave trial hung (#8352)
- 512b9f3 chore: remove accidental mock commit (#8354)
- 9d17dbf feat: Show "-" for null values in data cells for experiment list (#8343)
- ea50987 fix: properly interpret flag values (#8326)
- 8b6fc68 fix: Allow SAML and OIDC logins to work differently [WEB-1797] (#8308)
- 274288e docs: fix linting failure (#8351)
- 73bf0e8 docs: log policies (#8302)
- 8418029 chore: ft slot capacity check for each trial [DET-9897] (#8213)
- 494ca57 fix: replace TODO with ctx for deleteTensorboard (#8332)
- cfde2f6 docs: Docs Version Dropdown Automation (#8340)
- 8e69941 chore: Remove examples/legacy (#8153)
- af995ba fix: cli is not a library! (#7891)
- bf0a03d test: fix ray.air.session import. (#8344)
- 9bb10cc ci: mypy fix for responses>=0.24.0 (#8341)
- b924b25 fix: add pin icon in dropdown (#8324)
- 62b7f3b chore: remove fit-content from TimeAgoc (#8328)
- 86d6962 chore: update determined-ui to hew (#8334)
- f580385 fix: metric group charts have more than one color (#8304)
- 1966373 feat: Add tensorboard delete command to CLI (#8227)
- 656c8b2 chore: bump version: 0.26.3-dev0 -> 0.26.4-dev0
- af43248 docs: add release notes for 0.26.3 (#8322)
- b262a3d chore: Update lore.yaml to use the new version
- d64a0ac chore: use a single .golangci.yml file (#8320)
- ad94d20 chore: Add progress bar from UI Kit [WEB-1675] (#8181)
- b3b5be0 feat: implement CodeSample from UI Kit [WEB-1677] (#8270)
- d723b7f docs: fix typo in user edit release note (#8319)
- 6e5d840 chore: initial experiment actor refactor (#8229)
- 8a1ff58 chore: use a single root level go mod (#8285)
- 3511abf chore: delete dead code (#8313)
- d0e6375 chore: add a new deployment type for aws (#8279)
- 3929e8c chore(actors): remove ctx usage in agent_state.go (#8267)
- 5bf1b87 ci: delete broken wait_for helper (#8312)
- 50535f1 test: quarantine GPU execution of test_task_logs (#8261)
- d5b8e80 chore: deployment's --dry-run option doesn't print template (#8303)
- ac89d44 fix: allow experiments with directory checkpoint storage to parse (#8310)
- 306c0c3 fix: Project info not presists when forking (#8307)
- dc1b131 chore: sort out issues after bringing EE e2e_tests into OSS (#8084)
- d182abe chore: slurm support for blocklist (#1111)
- efdf62b fix: return correct location URL for /Users SCIM API endpoint (#1115)
- 37a84d1 fix: ruamel.yaml fixes for EE
- 0ce925a chore: Update nightly tests that use legacy cifar10_pytorch (#1102)
- 6cad296 fix: update for error message change in product (#1098)
- 1e302b5 chore: update e2e tests affected by examples_pruning (#1100)
- ad3dcda chore: cleanup model registry rbac test
- a3ffb5d test: enable command run tests for PBS (#1073)
- 9dd0e42 test: enable command and deepspeed tests run on slurm/pbs (#1044)
- ea4f4c4 chore(templates): ee fixes for template rbac
- c48e48d fix: Test test_slurm_verify_home fails with podman and it shouldn't [FE-136] (#1028)
- 760a738 test: Add pytorch2 distributed e2e tests on slurm [FE-168] (#1007)
- b5aee79 chore: use longer running no op experiment when seeding workspace (#994)
- facbda9 test: run test_hpc_job_pending_reason only on gcp vm (#996)
- 393d0b5 ci: FE-133 Configure non agent slurm/pbs tests to skip without explicitly listing test names in circleci. (#977)
- fd15535 ci: add ee-only files to the import-restrictions linter exclusions.
- 9cf2a26 test: slurm/pbs test for pending reason (FE-90) (#960)
- b3c2ca3 chore(actors): allocation.go, ee side
- eb7d1a1 test: [ALLGCP] Add e2e test for HPC that verifies that user HOME is preserved (#972)
- f3a8b0e test: fix test_slurm.py lint error (#949)
- 71896f3 chore: FE-91: Update base images (slurm/pbs) to include a populated singularity_image_cache (#943)
- 5891567 feat: add rbac to api/v1/master/config [DET-9633] (#931)
- 0a5c32e ci: FE-72: Add test-e2e-pbs-*-gcp tests (#941)
- 4c233c0 feat: add rbac for strict job queue control (#927)
- 6e23aa2 chore: removed admin dependency from delete model/version (#912)
- 7c6c59e feat: rbac for templates (#909)
- e89cc08 ci: DET 9622: (ee) test_slurm.py::test_cifar10_pytorch_distributed failures (#919)
- c6ee094 fix: test_rbac goes to wrong url (#918)
- 6b34e0d fix: DET-9483 successfully run e2e_slurm_preemption tests as part of nightly workflow (#903)
- 4f6277d ci: FE-14 Migrate test-e2e-slurm to GCP slurmcluster (#879)
- f4507f7 tests: fix a miss indentation leading to missing project err (#878)
- 5cad9bb chore: fix a missing check for global permissions in jq (#874)
- bca3848 feat: add rbac support for reading job queue (#871)
- 5c79474 chore: update how we wait for tasks to be ready (#863)
- b292862 test: fix test_master_host [DET-9482]. (#851)
- 5375a08 ci: quarantine flaky slurm tests (#850)
- 8e44c6c fix: Patch groups test [DET-9473] (#845)
- 49d2e08 fix: fix bug with launching tensorboards on trials (#842)
- d4dcbe5 test: Fix and add e2e_slurm_preemption tests to nightly workflow [DET-9237] (#806)
- 1529b9c style: update py binding references for ee
- 06f5554 feat: implement rbac for master logs and cluster usage (#745)
- a399707 chore: fix api usage after oss update.
- e47c22a ci: checkpoint loading is for unit tests (#754)
- 7b13cc7 feat: add rbac agents/slots enable/disable [DET-9156] (#751)
- 0546ae9 fix: rbac e2e test (#738)
- e44f464 chore: add e2e test for preemption on hpc cluster (FOUNDENG-497) (#726)
- 81b7bf3 fix: Fail attempts to mount under /run/determined on HPC [FOUNDENG-482] (#710)
- 483705c chore: use ported test code (#701)
- e588dba fix: user can only list models with correct permissions + small fixes in workspace filtering in get models (#681)
- c55777d chore: fix ntsc fix authz order tb [DET-8885] (#667)
- f111780 fix: websocket upgrade failed in tensorboard [DET-8903] (#672)
- 7adb55b fix: tensorboard list not showing tensorboards [DET-8904] (#669)
- d4ce809 fix: ntsc endpoints should return 404 on unauthorized workspace id [det-8911] (#671)
- ecd3d18 test: add dtrain test to e2e_slurm (#664)
- 72c9f93 chore: repair tensorboard e2e tests (#666)
- af77d76 feat: rbac ntsc ee (#662)
- 9a7c8b9 feat: RBAC Model Registry EE features [DET-8704] (#644)
- 42ef7d0 chore: integration tests to verify that slurm jobs are restored on master restart (FOUNDENG-216) (#600)
- 8ce0fbc chore: Fix lint errors in Slurm integration tests (#647)
- 7cb0cb3 chore: rbac ntsc supporting changes (#641)
- 44257db fix: FOUNDENG-336 test_noop_hpc continues to fail periodically (#632)
- cc0b450 chore: Add test-e2e-slurm-podman (#543)
- bf7be54 fix: Improve test_noop_hpc reliability [FOUNDENG-361] (#590)
- bc8c46f fix: Reduce verbosity of failure messages [FOUNDENG-370] (#583)
- 89e296a fix: stop printing incorrect (exit code 1) for failed command. (#588)
- 037ff4d test: adding tests for Oauth in EE (#582)
- ee60ddf chore: Add e2e slurm test for env var quoting [FOUNDENG-366] (#579)
- aebd7ed chore: Add test-e2e-slurm suite using enroot containers (#574)
- fe07d9a fix: FOUNDENG-310 test_noop_pause_hpc needs timeout increase to avoid random failures (#539)
- ab5fafe ci: Addition of znode runners (#417)
- b00a7f3 fix: FOUNDENG-303 Pausing, then resuming an experiment fails (#533)
- 80fabf0 test: disable restart on expected failure case. (#528)
- cd286c9 chore: Fix test_node_not_available test [FOUNDENG-304] (#517)
- 9f27bed test: update expected error messages. (#526)
- 71bd2c6 chore: Disable test_node_not_available [FOUNDENG-304] (#512)
- 51d4427 chore: Disable test_node_not_available [FOUNDENG-304] (#510)
- 109cf24 chore: consume experiment PBS & Slurm batch args (#472)
- b4a6223 chore: generalize message for Slurm/PBS. (#463)
- 21ffa89 test: enable test_launch for e2e_slurm. (#389)
- d0e8021 test: Enable test case on slurm (FOUNDENG-171) (#385)
- a8cbeca test: Enable logging tests on slurm (#367)
- 4a29f69 fix: e2e_test test_slurm.py test_node_not_available fails on CPU based cluster (Mosaic) due to different Error output (FOUNDENG-132) (#364)
- b71c672 ci: slurm ci (#342)
- 4fc1908 chore: Provide Slurm job submission failure test cases (FOUNDENG-86) (#321)
- 564a714 feat: add support for SCIM provisioning
- 143ff3c fix: adjust width size in group table (#8309)
- d8eb61b chore: fix job service panic when workspace does not exist (#8306)
- bf177da feat: new user management filters (#8002)
- 30c7681 chore: stop using root logger (#8294)
- 9d4c8e4 fix: check externalConfig is enabled before setting det_jwt as auth header (#8298)
- 79a324b fix: undefined handling in CreateGroupModal (#8301)
- d265530 feat: added a new cli command to recover hp search experiments (#8149)
- c5efc30 docs: quick fix for version dropdown (#8300)
- 5a8a912 fix: prevent settings store from triggering rerenders on poll (#8295)
- 7772fb3 ci: update wrapper config to run on tags (#8296)