Releases: determined-ai/determined
0.26.5
Release Notes
Changelog
- cfa7730 chore: bump version: 0.26.5-rc3 -> 0.26.5
- 2755e5e docs: add release notes for 0.26.5 (#8564)
- 617fb0c chore: bump version: 0.26.5-rc2 -> 0.26.5-rc3
- 5eff3b4 chore: bump version: 0.26.5-rc1 -> 0.26.5-rc2
- 54aa7a8 feat: add the slots property to the props (#8498)
- 52adafb fix: allow slots per trial to be 0 [WEB-1871] (#8521)
- 6eb6bf0 fix(api): delete experiment error handling corrections (#8510)
- 54f2cec fix: trial spinner (#8528)
- 5cfc4cc chore: bump version: 0.26.5-rc0 -> 0.26.5-rc1
- 5f4a7f4 fix: ensure filter columns are valid when selecting special columns (#8517)
- 189b4c1 fix(agentrm): resource pools must filter agents.list() by name (#8509)
- f47bde5 ci: add date to python cache, fix moto linting issue (#8493)
- 22d6d25 chore: bump version: 0.26.5-dev0 -> 0.26.5-rc0
- 275ea84 chore: lock api state for backward compatibility check
- cea9ff4 fix: Experiment state now is an ExperimentState (#8457)
- 06b7b79 fix: tqdm logs within wrap_rank [MLG-1236] (#8488)
- f50f7db fix: use explicit e.state in bulk experiment delete query (#8491)
- 0f65698 fix(rm): tasks shouldn't hang on restore failures (#8486)
- f847b26 chore: Revert "test: quarantine GPU execution of test_task_logs (#8261)" (#8484)
- f085e10 fix: failing custom searcher due to ExtraEnvVars being overwritten (#8490)
- 88f64f6 chore: update libraries (#8463)
- 2e48a6f chore: migrate detaileduser and experimentitem types to io-ts (#8477)
- d8ed945 chore: add aliases to det dev commads (#8156)
- a2730cb feat: adding PACHD_ADDRESS and DEX_TOKEN to task env (#8473)
- 837bc29 chore: Update Hew Version to 0.6.12 (#8481)
- 1ee9b81 chore: clean up version dropdown update script (#8415)
- 398f879 docs: Add requirement and known issue for singularity-suid (#8478)
- 64e299e fix: wrong skip experiment config regex for log policies (#8475)
- 0b4e1d2 chore: cleanup some spurious cluster logs (#8468)
- 8c9dfbf fix: add delete cascade to generic metrics (#8469)
- 0b28148 ci: register unit pytest marks (#8470)
- 815f5ae fix: Kill task permission on interactive page (#8358)
- e67807d chore: preserve CI logs when bringing an AWS cluster down (#8461)
- 50d40bd chore: update trial complete or early exit to always notify searcher (#8466)
- 6bdf061 Update k8s install info (#8465)
- e06b472 chore: export AddUserTx (#8458)
- 16ea5a0 chore: introduce and use observables with improved update checking (#8405)
- 5805df2 chore: set up ownership for .circleci [skip ci] (#8402)
- c2a211d fix(api): handle delete experiment failures correctly (#8459)
- dfb4dc5 chore(actors): remove pkg/actor (#8452)
- 167e237 chore: add error check to KillNTSC (#8441)
- 4edbc7f chore: log RestoreAllCommands error (#8454)
- 9d71abd Fix minor issues including hard coded reference (#8427)
- 756a79c chore: bump CI node version to 20.9.0 (#8455)
- 3090e42 chore: use ResourcePool info for consistent capacity calculation [WEB-1796] (#8447)
- 192a2b3 chore(actors): remove pkg/actor usage from agentrm (#8395)
- 5ded38d chore: bump version: 0.26.4-dev0 -> 0.26.5-dev0
- 4339f67 docs: add release notes for 0.26.4 (#8451)
- a25e4f5 fix: add back pin icon in experiment list header (#8429)
- adb5191 ci: store npm log artifacts (#8449)
- e29006b fix: det slot task name for no-permissions RBAC users (#8416)
- 695a648 fix: SDK list_checkpoints not defaulting to searcher metric sort (#8448)
- 1ddad7d feat: add Topology into the RP details page (#8276)
- 2f7dda6 ci: cache install Python (#8426)
- cde18df fix: Calculate allocation bar stats same as overview [WEB-1822] (#8431)
- 9a48ff1 docs: Update upgrade instructions (#8346)
- e9a199a fix: k8s autoscaling nodes not counted towards RP (#8439)
- bd19e7a chore: command actor refactor & add intg test [DET-9660] (#8136)
- 99c4cea test: create a test for delete-tensorboards via `det e delete` (#8336)
- 7b7d1eb feat: Add remote user settings to Users table [WEB-1798] (#8397)
- 8051039 ci: fix linting with responses==0.24.1 (#8436)
- b087c10 chore: add version dropdown url for previous release (#8437)
- 8e9c505 test: fix model registry rbac wrong user regression (#8420)
- 292c75d fix: new experiment list tooltip styling (#8433)
- 1b47bf0 ci: delete broken fixture (#8428)
- bf07e61 fix: Wrap older modals in theme class [WEB-1824] (#8432)
- 0c8fad9 chore: filterformstore comment re: change tracking (#8386)
- 1041e56 fix: replace antd select with hew select (#8424)
- aa34aa7 feat: add workspace/project creation/deletion (#8430)
- 2e0a5a2 feat: client gets list_models, too. (#8425)
- 69df80e Revert "feat: Client gets list_models, too."
- d1343ca feat: Client gets list_models, too.
- a1e660d chore: converting SearchGroupsWithoutPersonalGroups into tx (#8419)
- 6ba688d test: fix TestAddAndRemoveBindings flake (#8423)
- d4c4195 chore: update Column and Row from Hew (#8412)
- d1d09e7 docs: Update non root container instructions (#8273)
0.26.4
Release Notes
Changelog
- bf665ae chore: bump version: 0.26.4-rc4 -> 0.26.4
- 2f86950 docs: add release notes for 0.26.4 (#8451)
- f2ef0fe chore: bump version: 0.26.4-rc3 -> 0.26.4-rc4
- f0a37a9 fix: Calculate allocation bar stats same as overview [WEB-1822] (#8431)
- 9acfbf2 chore: bump version: 0.26.4-rc2 -> 0.26.4-rc3
- 9dd0211 fix: k8s autoscaling nodes not counted towards RP (#8439)
- 3bf6647 chore: bump version: 0.26.4-rc1 -> 0.26.4-rc2
- 47397a4 fix: new experiment list tooltip styling (#8433)
- 680ac02 ci: fix linting with responses==0.24.1 (#8436)
- d4200d2 chore: add version dropdown url for previous release (#8437)
- 0a4b6bc test: fix model registry rbac wrong user regression (#8420)
- 242ff97 fix: Wrap older modals in theme class [WEB-1824] (#8432)
- e3c109a chore: bump version: 0.26.4-rc0 -> 0.26.4-rc1
- 22e18ae fix: replace antd select with hew select (#8424)
- 00d349a feat: add workspace/project creation/deletion (#8430)
- 9f727fe feat: client gets list_models, too. (#8425)
- b8c1be7 chore: update Column and Row from Hew (#8412)
- 4e6fd52 chore: bump version: 0.26.4-dev0 -> 0.26.4-rc0
- e9a457d chore: lock published urls to preserve redirects
- 2fae9ba chore: add docs dropdown link for new version
- 6c3bf84 chore: make insert-dropdown-url.sh executable (#8418)
- b5ca7f4 chore: fail deployment if launching part of the service fails (#8409)
- 8498674 fix: allow --json in det master config CLI command (#8413)
- d123932 fix: Place modal inside of ResourcePoolCard (#8414)
- ff19924 chore: Add eslint rule for ?? operator (#8410)
- d56b3ae chore: convert DOS line endings to Unix (#8411)
- c1219eb fix: Hide stats card when 0 on cluster page (#8359)
- da77efb fix: added permission check on GetAllocation (#8281)
- 3b0550c chore: Bumpenvs 0.26.4 (#8407)
- e48d03d fix: user flag to prompt for password during user requests (#8158)
- 513e6d7 fix: Project and Workspace cards wrap modal divs (#8378)
- 2497d84 chore: export AddUserTx (#8403)
- ad764f0 refactor: implement Glossary component from Hew (#8385)
- 52326d1 feat: change cli command for patch master log config DET[9720] (#8054)
- 1e9155d chore(type): stricter tsconfig (#8349)
- 16f18cc chore: revert task obfuscation lint failures (#8406)
- dde3156 chore: Implement Theming updates in Determined [WEB-1726] (#8388)
- 4edfc3c ci: move packaging test to test-e2e-longrunning (#8381)
- d3c208a ci: cache go modules deps and build cache (#8383)
- 8924996 chore: temporarily disable CI upload job (#8399)
- 356f651 Revert "chore: temporarily disable upload_test_results job step"
- 6dd9701 chore: temporarily disable upload_test_results job step
- ba49dbd ci: up parallelism for slowest test_e2e premerge tests (#8374)
- 5f3e556 ci: finish removing growforest (#8389)
- 62084e2 fix: NTSC task and slot viewing obscured for RBAC users with no Viewer Permissions (#8311)
- 0254f7d chore: fix nil ptr on allocation.Proto() (#8372)
- 119e759 chore: fix profiler test in CI (#8382)
- b428d5e feat: add hide column header menu item to explist (#8342)
- 7ae0501 chore: update the lore service port (#8375)
- 052cf8d feat: Cluster historical usage charts move to UI Kit LineChart [WEB-1786] [WEB-1764] (#8327)
- 819948d feat: clear filter from experiment table header (#8376)
- a590999 test: fix slow delete_checkpoint test (#8377)
- b0505db chore: Job/task displays Running instead of Scheduled (#8335)
- 1d64941 chore: short dsat e2e tests (#8288)
- 6afa836 chore: fix CI mnist_pytorch (#8364)
- 4d3eaab chore: Update Horovod Cycle Time (#8362)
- d3b01cb docs: Add det pach tutorial (#8082)
- 7cebc30 fix: adjust card size on workspaces page (#8370)
- 5c93cb0 chore: enable more Go linters (#8333)
- a279967 fix: aws deployment can deploy priority scheduler (#8345)
- 3d9293c fix: fixed bug in error handling in experiment.go (#8339)
- 194bfd5 fix: Cell can be undefined in experiment list table (#8360)
- 1da92aa chore: bump environment images to ubuntu 18.04 [MLG-1194] (#8356)
- 990c56f chore: add list_experiments to experimental.client (#8361)
- 3a7d9ea fix(tests): lower e2e_gpu_quarantine parallelism (#8363)
- 4c48458 fix: patched remote users were able to login with password (#8337)
- baf5c96 chore: port over PyTorch example to use Trainer API [MLG-1181] (#8292)
- 235bd8f feat: delete TB files from the SDK (#8329)
- 2fe3d99 chore: update Typography from UI kit (#8323)
- 2b23674 fix: prevent carriage return in env from crashing deepspeed launcher (#8321)
- 461c307 chore: Remove DesignKit since it's now maintained in Hew [WEB-1790] (#8338)
- 5ee87ec fix: Set group name and number columns to handle Safari [DET-9948] [DET-9949] (#8355)
- 10deef9 fix(experiments): transient errors shouldn't leave trial hung (#8352)
- 512b9f3 chore: remove accidental mock commit (#8354)
- 9d17dbf feat: Show "-" for null values in data cells for experiment list (#8343)
- ea50987 fix: properly interpret flag values (#8326)
- 8b6fc68 fix: Allow SAML and OIDC logins to work differently [WEB-1797] (#8308)
- 274288e docs: fix linting failure (#8351)
- 73bf0e8 docs: log policies (#8302)
- 8418029 chore: ft slot capacity check for each trial [DET-9897] (#8213)
- 494ca57 fix: replace TODO with ctx for deleteTensorboard (#8332)
- cfde2f6 docs: Docs Version Dropdown Automation (#8340)
- 8e69941 chore: Remove examples/legacy (#8153)
- af995ba fix: cli is not a library! (#7891)
- bf0a03d test: fix `ray.air.session` import. (#8344)
- 9bb10cc ci: mypy fix for responses>=0.24.0 (#8341)
- b924b25 fix: add pin icon in dropdown (#8324)
- 62b7f3b chore: remove fit-content from `TimeAgoc` (#8328)
- 86d6962 chore: update determined-ui to hew (#8334)
- f580385 fix: metric group charts have more than one color (#8304)
- 1966373 feat: Add tensorboard delete command to CLI (#8227)
- 656c8b2 chore: bump version: 0.26.3-dev0 -> 0.26.4-dev0
- af43248 docs: add release notes for 0.26.3 (#8322)
- b262a3d chore: Update lore.yaml to use the new version
- d64a0ac chore: use a single .golangci.yml file (#8320)
- ad94d20 chore: Add progress bar from UI Kit [WEB-1675] (#8181)
- b3b5be0 feat: implement CodeSample from UI Kit [WEB-1677] (#8270)
- d723b7f docs: fix typo in user edit release note (#8319)
- 6e5d840 chore: initial experiment actor refactor (#8229)
- 8a1ff58 chore: use a single root level go mod (#8285)
- 3511abf chore: delete dead code (#8313)
- d0e6375 chore: add a new deployment type for aws (#8279)
- 3929e8c chore(actors): remove ctx usage in agent_state.go (#8267)
- 5bf1b87 ci: delete broken wait_for helper (#8312)
- 50535f1 test: quarantine GPU execution of test_task_logs (#8261)
- d5b8e80 chore: deployment's --dry-run option doesn't print template (#8303)
- ac89d44 fix: allow experiments with directory checkpoint storage to parse (#8310)
- 306c0c3 fix: Project info not presists when forking (#8307)
- dc1b131 chore: sort out issues after bringing EE e2e_tests into OSS (#8084)
- d182abe chore: slurm support for blocklist (#1111)
- efdf62b fix: return correct location URL for /Users SCIM API endpoint (#1115)
- 37a84d1 fix: ruamel.yaml fixes for EE
- 0ce925a chore: Update nightly tests that use legacy cifar10_pytorch (#1102)
- 6cad296 fix: update for error message change in product (#1098)
- 1e302b5 chore: update e2e tests affected by examples_pruning (#1100)
- ad3dcda chore: cleanup model registry rbac test
- a3ffb5d test: enable command run tests for PBS (#1073)
- 9dd0e42 test: enable command and deepspeed tests run on slurm/pbs (#1044)
- ea4f4c4 chore(templates): ee fixes for template rbac
- c48e48d fix: Test test_slurm_verify_home fails with podman and it shouldn't [FE-136] (#1028)
- 760a738 test: Add pytorch2 distributed e2e tests on slurm [FE-168] (#1007)
- b5aee79 chore: use longer running no op experiment when seeding workspace (#994)
- facbda9 test: run test_hpc_job_pending_reason only on gcp vm (#996)
- 393d0b5 ci: FE-133 Configure non agent slurm/pbs tests to skip without explicitly listing test names in circleci. (#977)
- fd15535 ci: add ee-only files to the import-restrictions linter exclusions.
- 9cf2a26 test: slurm/pbs test for pending reason (FE-90) (#960)
- b3c2ca3 chore(actors): allocation.go, ee side
- eb7d1a1 test: [ALLGCP] Add e2e test for HPC that verifies that user HOME is preserved (#972)
- f3a8b0e test: fix test_slurm.py lint error (#949)
- 71896f3 chore: FE-91: Update base images (slurm/pbs) to include a populated singularity_image_cache (#943)
- 5891567 feat: add rbac to `api/v1/master/config` [DET-9633] (#931)
- 0a5c32e ci: FE-72: Add test-e2e-pbs-*-gcp tests (#941)
- 4c233c0 feat: add rbac for strict job queue control (#927)
- 6e23aa2 chore: removed admin dependency from delete model/version (#912)
- 7c6c59e feat: rbac for templates (#909)
- e89cc08 ci: DET 9622: (ee) test_slurm.py::test_cifar10_pytorch_distributed failures (#919)
- c6ee094 fix: test_rbac goes to wrong url (#918)
- 6b34e0d fix: DET-9483 successfully run e2e_slurm_preemption tests as part of nightly workflow (#903)
- 4f6277d ci: FE-14 Migrate test-e2e-slurm to GCP slurmcluster (#879)
- f4507f7 tests: fix a miss indentation leading to missing project err (#878)
- 5cad9bb chore: fix a missing check for global permissions in jq (#874)
- bca3848 feat: add rbac support for reading job queue (#871)
- 5c79474 chore: update how we wait for tasks to be ready (#863)
- b292862 test: fix `test_master_host` [DET-9482]. (#851)
- 5375a08 ci: quarantine flaky slurm tests (#850)
- 8e44c6c fix: Patch groups test [DET-9473] (#845)
- 49d2e08 fix: fix bug with launching tensorboards on trials (#842)
- d4dcbe5 test: Fix and add e2e_slurm_preemption tests to nightly workflow [D...
0.26.3
Release Notes
Changelog
- bd74446 chore: bump version: 0.26.3-rc3 -> 0.26.3
- 162de31 docs: add release notes for 0.26.3 (#8322)
- a472745 chore: bump version: 0.26.3-rc2 -> 0.26.3-rc3
- bab1dad fix: allow experiments with directory checkpoint storage to parse (#8310)
- ec438f2 fix: adjust width size in group table (#8309)
- 0d49b53 chore: fix job service panic when workspace does not exist (#8306)
- 13525e8 fix: check externalConfig is enabled before setting det_jwt as auth header (#8298)
- 101f279 fix: undefined handling in `CreateGroupModal` (#8301)
- b9e64ab docs: quick fix for version dropdown (#8300)
- 48e915f chore: bump version: 0.26.3-rc1 -> 0.26.3-rc2
- 3c2cb53 ci: update wrapper config to always run
- 55c8a3a chore: bump version: 0.26.3-rc0 -> 0.26.3-rc1
- c138065 chore: bump version: 0.26.3-dev0 -> 0.26.3-rc0
- 862b41e chore: bump version: 0.26.2-dev0 -> 0.26.3-dev0
- b2e4b02 docs: add release notes for 0.26.2 (#8245)
- 989a0e3 fix: update bumpversion cfg for new CircleCI config (#8293)
- 54a12b8 fix: fix docs linting (#8291)
- 108cca7 docs: add documentation for Keras and PyTorch profilers [MLG-1094] (#8253)
- a7dfd47 chore: lock published urls to preserve redirects
- 8ae316c chore: lock api state for backward compatibility check
- d0c8273 chore: refactor filterformstore (#8239)
- 2992b82 feat: Updated LineChart in UI Kit [WEB-1700] (#8105)
- c6c3235 fix: login error message (#8240)
- 949ecf9 feat: support PyTorch Profiler in DeepSpeed trials [MLG-1095] (#8251)
- b2ea839 fix: use proper default cpu env image in the helm chart. (#8287)
- ccc78dd chore: install request as dev dep to fix proxy.js (#8286)
- af40f2c fix: only train for one batch in PyTorch Trainer test mode (#8260)
- cbab7e3 fix: dsat with all yaml formats (#8284)
- d17a2cc fix: migrate CLI from deprecated SDK methods (#8282)
- f43acd8 feat: directory checkpoint storage [DET-9594] (#8255)
- f375fbb feat: log policies (#8145)
- 862951f chore(actors): remove slots, slot proxy hacks from agentrm (#8266)
- 8d79c17 feat: webhook type task logs (#8175)
- b086982 test: fix delete experiments potential flake (#8283)
- fe06a0a fix: update copy for Agent UID/GID modal in the user mgmt UI. (#8278)
- 1fbb9f0 chore: restore set resource pool using job service (#8280)
- 0bea20b chore(actors): remove actors from resource aggregation (#8265)
- 76b4bfe feat: update add members to group (#8262)
- f24748c docs: Fix a 404 error (#8277)
- e4701b2 fix: remove Topology from the ResourcePoolDetail page (#8274)
- bec5edd chore(actors): refactor k8s rm without actors (#8264) [DET-9658]
- 46b9360 fix: icon size in TaskBar (#8263)
- 5c12463 test: don't commit Go mocks (#8258)
- a33d2b8 chore: ci should always setup python venv for caching (#8257)
- c09c529 ci: wart removal (#8147)
- 65219ca docs: fix inaccuracies in `bind_mounts` docs. (#8254)
- d3bd2cb feat: unify new/edit group modals [WEB-1741] (#8236)
- 3179505 build(deps): bump actions/setup-node from 3 to 4 (#8230)
- e0a7efc fix: support ruamel.yaml>=0.18.0 (#8237)
- 2365df6 docs: check for dropped urls in PRs (#8247)
- f9c6402 fix: deepspeed e2e_tests fail when environment variable contains a newline character [FE-256] (#8154)
- ec7e004 fix: prevent passing login r= param to relayState (#8244)
- ab4cf83 feat: Update "Add Members" to Workspace experience (#8195)
- 25adf7d fix: dont suggest registering or deleting a checkpoint that is already deleted (#8246)
- c52ef71 feat: add new CLI command to edit multiple fields at once (#8075)
- 1d86343 chore: move files from /Clusters into /Cluster [WEB-1730] (#8231)
- 15ee78f docs: Add article for using detached mode (#8217)
- 115d7c2 fix: GetTasks doesn't respect rbac (#8233)
- 47b37eb chore: update Avatar component in UI kit [WEB-1055, WEB-1734] (#8178)
- 4c59393 chore: indirect import for jobs to avoid import cycle (#8238)
- aa77386 chore: add back missing commit in Python SDK (#8206)
- 9a80d99 ci: alternate mechanism for running nightly tests (#8221)
- 27a279b refactor: use Nameplate component in NavigationSidebar [WEB-1057] (#8152)
- c0e3062 revert: fix: support ruamel.yaml==0.18.0 (#8235)
- ae839a8 fix: support ruamel.yaml==0.18.0 (#8228)
- bb7020a fix: inaccurate task queue time [DET-9912] (#8225)
- 12e23b0 fix: pin ruamel.yaml<0.18.0 (#8232)
- 164b920 feat: Update users and groups tab to reflect count (#8224)
- 2ac9661 chore: shorten distributed-quarantine name (#8234)
- c964987 docs: Apply minor edits (#8215)
- f61d250 chore: support redirect on auth failure for echo routes (#8196)
- f251add docs: update the status of TLS security for notebooks. (#8191)
- 54551c5 refactor: `CliError` does not need `e_stack` property. (#8179)
- 7c6b2d0 chore: deprecate `apex` support. (#7526)
- bfda78d docs: quick fix for version dropdown (#8223)
- 5929161 docs: Fix broken Slack links (#8220)
- b076179 chore: refactor actor system out of internal/job (#8174)
- facdc30 fix: remove 'contents' parameter from remove_notes (#8209)
- 448162d docs: Edit setup checklist (#8214)
- 164579f feat: update group table (#8194)
- 8f5e883 fix: Multi-trial visualizations switch from metricType to group string [DET-9896] (#8137)
- 834bd2f chore: remove WebSocket actor (#7552)
- b6e28d7 fix: prevent extra updates when observing settings store (#8212)
- 707f77d fix: Trigger function updates new user modified_at without error (#8210)
- f9c8a86 chore(actors): refactor k8s' resource_pool.go without actors (#8186) [DET-9657]
- ea3d6bf docs: sort articles by weight custom extension (#8208)
- 464fb54 docs: Create new advanced setup section (#8203)
- dd1c6f0 docs: toctree tile css (#8207)
- 53b9bdb fix: CLI uses default where pagination not included in args [DET-9908] (#8192)
- 079304a fix: filter agents/nodes by poolName (#8205)
- 3af965f chore: move ui kit to separate repo (#8104)
- 8735acf docs: remove references to PyTorchTrialContext.from_config() (#8187)
- 03383f2 fix: fix issue with db migrations (#8193)
- 2ff18bc fix: Display byte axis values using humanReadableBytes [DET-9906] (#8189)
- bd5e628 feat: add "Topology" section to the cluster UI (#8108)
- 6dfde99 fix: icons should appear in safari (#8190)
- 36ad3ca chore(actors): refactor pods.go without actors (#8170) [DET-9901]
0.26.2
Release Notes
Changelog
- 85b5135 chore: bump version: 0.26.2-rc4 -> 0.26.2
- 25e578d docs: add release notes for 0.26.2 (#8245)
- 1f7945e chore: bump version: 0.26.2-rc3 -> 0.26.2-rc4
- 87c11fb fix: inaccurate task queue time [DET-9912] (#8225)
- 07ea3b2 fix: pin ruamel.yaml<0.18.0 (#8232)
- 6e8a762 docs: quick fix for version dropdown (#8223)
- 8896d23 fix: remove 'contents' parameter from remove_notes (#8209)
- b3fdf9e chore: bump version: 0.26.2-rc2 -> 0.26.2-rc3
- 5585f38 fix: prevent extra updates when observing settings store (#8212)
- 826f633 chore: bump version: 0.26.2-rc1 -> 0.26.2-rc2
- 829c3bf fix: Trigger function updates new user modified_at without error (#8210)
- 0b0d268 fix: CLI uses default where pagination not included in args [DET-9908] (#8192)
- b0071c2 chore: bump version: 0.26.2-rc0 -> 0.26.2-rc1
- f049bd6 fix: fix issue with db migrations (#8193)
- 1f4bbe2 fix: icons should appear in safari (#8190)
- 5d3a80d chore: bump version: 0.26.2-dev0 -> 0.26.2-rc0
- 57fee58 chore: lock published urls to preserve redirects
- 883135a chore: various deprecations to standardize SDK get/list/iter (#8165)
- 0ab908e chore: add release notes for Python SDK (#8184)
- 2c09e10 docs: fix redirects (#8188)
- bfd20f1 docs: update experiment config reference for records_per_epoch in PyTorchTrials (#8185)
- 1fb50cd fix: pin icon should adapt to theme colors (#8183)
- aff81a5 chore: migrate SDK to a generic OrderBy [MLG-1056] (#8171)
- 88549ee Revert "feat: use max_results in SDK's list_trials (#8173)" (#8182)
- c29d086 fix: tensorboard sync for profiler data, Core API v2 managed mode [MLG-1063] (#8163)
- 8324bf2 fix: Change oicd client secret env var name to comply with naming convention (#8113)
- 3946185 docs: Restore path to singularity file (#8180)
- d12373f chore: reinstate core_api example e2e tests (#8148)
- 16ef0bb feat: use max_results in SDK's list_trials (#8173)
- ee2e633 refactor: add delete cascade to tables affected by experiment deletion (#8016)
- 404831b chore: final trial actor refactor (#8164)
- eb98aeb fix: case insensitive member search (#8166)
- b76dd90 fix: Learning curve point click (#8155)
- 95f2ad7 chore: task resources actor refactor (#8157)
- 1bff62e chore: configure a training port offset (#8125)
- e4d5ad1 docs: Modify the info architecture (#8100)
- 556022a chore: change CSS values (#8151)
- 88d9533 fix: Theme dropped after page refreshing (#8139)
- 515f128 refactor: update UI Kit Icon component [WEB-1699] (#8122)
- 82b4d31 chore: Post pruning hotfixes (#8141)
- 67f6dc1 Python SDK v1 (#8005)
- 3a24611 docs: fix python-sdk reference syntax (#8146)
- e7247a5 docs: Restore core api integer incrementing tut (#8115)
- 2a7f141 chore: add option to proxy requests to internal service (#8044)
- d40dee3 fix: Move setting of playwright browsers path (#8144)
- 31c2515 chore: update metrics documentation (#8118)
- b973356 chore: Add Message component to UI kit [WEB-1056] (#8133)
- 5df3d57 chore: remove actors from resource manager interface (#8126)
- f9a61b4 chore: upgrade to node v20 and npm audit [WEB-1662] (#8036)
- 38ff7b5 docs: update link in prometheus docs (#8138)
- ea8d760 chore: Examples pruning (#8140)
- 8b9ec03 fix: move couldn't-connect-to-master message into ship_logs.py (#8127)
- f52f7c6 docs: Apply style guide edits (#8134)
- 2f65eec chore: bump version: 0.26.1-dev0 -> 0.26.2-dev0
- ee3c478 docs: add release notes for 0.26.1 (#8131)
- 5d55da1 ci: dannys/reenable docs autoassign (#8012)
- 017e0f2 docs: Remove ref to fluent bit (#8128)
- 42535c2 chore: add hf context to all dist e2e [DET-9893] (#8119)
- 7e6689b chore: make agent.yaml readable by default (#8095)
- 8d2b5ce fix: hotfixes for ship_logs.py (#8116)
- 233cd95 feat: add allocation exit status to db and implementation for get allocation (#7897)
- 22f1f92 chore: remove task allocation group actor ref (#7853)
- 42bfb72 fix: Return ResourcePools with a fixed order (#8103)
- 3b42b91 chore: write log shipper in python [MLG-993] (#7974)
- ae7c415 fix: DET_CERT_MASTER_FILE=noverify det shell open (#8110)
- 3e1b83e docs: redirects.py moves subdirectories properly (#8111)
- 29aa772 fix: fix primary key for allocation_accelerators table (#8106)
- f9e9a45 build: print sphinx-build command on make build (#8109)
- a51ef7c fix: update WorkspaceMemberAddModal when new user or group is created [WEB-1111] (#8069)
- ff9b77a fix: trigger `autoupdate_users_modified_at` by `username` change (#8093)
- 4899df7 feat: Edit experiment from list (#8086)
- d638303 feat: Dont allow users to add deactivated users to a workspace (#8073)
- f559eee fix: single point different axis ranges (#8096)
- 0c54010 fix: Empty / NotEmpty operators for descriptions, tags [WEB-1751] (#8090)
- 469c2ae perf: improve task stats IMAGEPULL performance (#8067)
- e7751fe chore: auth task logs [DET-7554] (#8089)
- 588d817 chore: cleanup allocation exit logic (#8088)
- e2b51e7 feat: add implementation for get and set acceleration data api and intg test [DET-9748] (#7856)
- 5483d12 fix: ignore `last_auth_at` to update `modified_at` on users table (#8091)
- 0f8c1d4 fix: Trial data loading state (#8083)
- 6a91c3d fix: Metric type is blank in comparison chart (#8085)
- 1819810 chore: allow attaching select dropdown to select container (#7940)
- e5b2e35 refactor: removed reference removed agents/./slots/. endpoint (#7424)
- ecc0f6b chore: support "formData" in swagger bindings parsing (#8078)
- 9dfc0a3 feat: Hide deactivated user in ws members list (#8077)
- 89c8da2 feat: introduce batch actions in user management page (#8056)
- ca63335 docs(performance): add sections to help with getting started (#8079)
- 9ea05a1 chore: change master config to k8s secret (#8053)
- 58fab08 fix: Experiment name settable in fork config (#8081)
- 6006d42 fix: Show experiment loading state (#8066)
- 8073a2d fix: order of API paths in proto for AssignMultipleGroups (#8080)
- 5be764b feat(performance): implement slack report send [INFENG-234] (#8045)
- 18d7286 fix: shell open gives unfriendly message on terminated (#8074)
- fc2e8bf test: fix test_pytorch_parallel logs check (#8064)
- ca763a4 fix: `det deploy aws list` results were incomplete. (#8062)
- c8edfab chore: cleanup pod informer logging (#8072)
- 14bd162 docs: quick fix for version dropdown (#8070)
- bcfdeac chore(tests): workaround data races in uptrace/bun (#8065)
- 5d56e95 chore(tests): fix data races in webhook integrations (#8061)
- ce5bcf2 ci: backend owns e2e_tests folders cluster, command, and template (#8037)
- 3075820 chore(tests): fix data races in streaming API intg tests (#8059)
0.26.1
Release Notes
Changelog
- a6b26b0 chore: bump version: 0.26.1-rc3 -> 0.26.1
- 4bd3dcb docs: add release notes for 0.26.1 (#8131)
- de1526b chore: bump version: 0.26.1-rc2 -> 0.26.1-rc3
- 6e19285 fix: Return ResourcePools with a fixed order (#8103)
- 355cb62 chore: bump version: 0.26.1-rc1 -> 0.26.1-rc2
- 3f97397 fix: Trial data loading state (#8083)
- 740e730 fix: trigger `autoupdate_users_modified_at` by `username` change (#8093)
- 288db9f fix: single point different axis ranges (#8096)
- d972837 fix: Empty / NotEmpty operators for descriptions, tags [WEB-1751] (#8090)
- 3c5b281 fix: ignore `last_auth_at` to update `modified_at` on users table (#8091)
- b086b84 chore: bump version: 0.26.1-rc0 -> 0.26.1-rc1
- 2283723 fix: Metric type is blank in comparison chart (#8085)
- 90dfb54 fix: Experiment name settable in fork config (#8081)
- ce7549a fix: shell open gives unfriendly message on terminated (#8074)
- 663f4d8 docs: quick fix for version dropdown (#8070)
- 180d1b3 chore: bump version: 0.26.1-dev0 -> 0.26.1-rc0
- 1a96b8a chore: lock published urls to preserve redirects
- d1c3fa4 Revert bump version (#8063)
- 9b3b65a chore(tests): fix data race in grpclog init during integrations (#8058)
- ef0ef50 chore: lock published urls to preserve redirects
- b241b78 chore: bump version: 0.26.1-dev0 -> 0.27.0-dev0
- b14cc40 test: stable diffusion example tests [MLG-903] (#7855)
- 4cb3151 docs: Improve upgrade instructions (#8032)
- aa736a2 fix: detached mode tensorboard storage support [MLG-872] (#7992)
- d11f6c9 fix: fake cert gen.sh generated broken certs (#8055)
- a8a3399 fix: bring back logging for `core.train.report_*` calls. (#7975)
- f4524ac fix: pass final forked config to server for new experiment (#8051)
- 5ff6e9c feat: update InlineForm inputs (#8033)
- 9c05b00 feat: updating agent user group affects users.modified_at (#8052)
- 9414539 fix: fasterrcnn image not found [MLG-516] (#8047)
- 6aca426 chore(tests): fix data races in k8s tests (#8049)
- 764f407 chore(tests): fix data races in telemetry tests (#8048)
- 90fb051 chore: rename 'cancel' to 'stop' for experiments and trials [WEB-291] (#8038)
- a4c244e fix: command resolve pool defaulting to workspace 0 instead of 1 (#8050)
- ccefc6d chore: allow passing clusterID in master config (#8042)
- 1214a2b fix: Return ResourcePools rather than names (#7990)
- 7aa8541 fix: govcloud agent AMIs are out of sync with bumpenvs [MLG-986] (#7983)
- be1f734 chore: add DatePicker to UI Kit [WEB-1674] (#8040)
- 5bd7b8a feat: SDK can list workspaces. (#7765)
- c0467b0 ci(performance): create initial gha workflow [INFENG-224] (#7969)
- 9dcedad fix: k8s custom pod spec affinity would get ignored (#8043)
- f4d3c47 fix: Note UI respects project permissions (#8028)
- 338d2f6 fix: rename `last_login` to `last_auth_at` (#8022)
- 34373ac feat: Batch actions for multiple users into one request [WEB-1640] (#7971)
- 1d85304 test: fix shell open test flake (#8035)
- 827d9a6 feat: Filter user list by role id for EE (#7988)
- e57b8d6 fix: Hide checkpoint deletion btn when already deleted (#8039)
- 04609e0 fix: Remove the unexposed GetJobQStats from RM interface and all RMs (#8030)
- d00ddc9 chore: Add error case and tests to Loadable [WEB-1333][WEB-1711] (#8025)
- 78dfc83 fix: Support longer titles on HParam scatter plots (#8031)
- fd9b406 fix: nil ptr for Proto() on users who haven't logged in (#8029)
- 846fbc3 fix: add css formating for miltiple input errors (#8011)
- 42a4911 chore: adding an example for distributed batch inference for mnist (#7976)
- c83dfe6 docs: fair-share scheduling policy [skip ci] (#7981)
- ccfda89 fix: Add fetch to resource pool bindings page (#8023)
- 4b085ff chore: move Loadable to kit [WEB-1688] (#7973)
- 67fdacb fix: sort files for conflict resolution in sharded checkpoints (#8014)
- 10fcb10 feat: add tooltip linebreak to the project card (#7995)
- 4b4ad8d fix: add existing "non-setting" query parameter into the settingsToQuery function (#8013)
- f021ac2 test: fix master intg user flake (#8019)
- 8937f2b fix: align default markdown font-family to theme [WEB-617] (#8009)
- e03e469 feat: "det notebook|shell|tensorboard open" doesn't error when task not ready (#8008)
- 109e69f fix: tensorboard deletion when det e delete [DET-9844] (#7997)
- e3ffc4f docs: Fix minor issues (#7999)
- 2e77f52 fix: regular integer spacing of chart ticks [WEB-1714] (#8010)
- f1ae472 feat: add `last seen` column in user management table (#7991)
- d2a5505 refactor: remove external dependencies from UI kit [WEB-1689] (#7968)
- e86abd5 fix: fix sorting in GetUsers endpoint (#8001)
- 69921eb feat: commands download user files at startup so k8s can support larger context directories [DET-8830] (#7889)
- 86e6f47 fix: report searcher progress according to reporting period (#8006)
- 5b2f238 ci: more precisely select files for splitting in E2E tests (#7989)
- f918188 fix: show deep files in experiment code viewer (#7945)
- 1d2b60b fix: data fetch shouldn't interrupt editing model version description [WEB-1703] (#8000)
- 7dff1f2 chore: bun debug mode off (#7996)
- 0241795 fix: Version dropdown in docs is scrollable (#7994)
- 46da826 ci: disable docs review action [skip ci] (#7982)
- 390e0ac chore: add tests for postgres_users.go (#7875)
- a60c0d7 fix: Hide action menu on dashboard project cards (#7986)
- a2fb7ef chore: bump version: 0.26.0-dev0 -> 0.26.1-dev0
- 7eaf361 docs: add release notes for 0.26.0 (#7987)
- 25439fc chore: put grpc panics and other logs into 'master logs' (#7965)
- 8c373cf fix: Force GCP node name length to be less than maximum length (#7964)
- 7f21b4f chore(templates): refactor templates to their own package (#7876)
- b9fb3eb chore: Add toast to UI kit (#7950)
- d45e22b feat: Support filter by status and role for users (#7953)
- 5b5fb2f chore: fix det deploy aws requiring --db-size (#7984)
- e254bf3 docs: Update custom pod specs page (#7970)
- 3d1ec1f fix: Update Tasks Stats Causes Deadlock [DET-9853] (#7980)
- 5c5e0ec feat: single experiment continue [DET-9703] (#7764)
- 34c5bdb docs: document prometheus auth (#7957)
- 6e3cad3 chore(codeowners): map performance dir to web team (#7916)
- 9a5cfd7 fix: handle nil actor message and nil actor errors in agent RM (#7951)
- d25612b feat: Add instance flavor and size arguments for det deploy aws [INFENG-227] (#7931)
- e031a2f fix: update `modified_at` by insert in user table (#7949)
- 2d65e82 fix: Change measure of text lines in log containers [WEB-1664] (#7860)
- 14a779c chore: add last_login column to users table/model (#7948)
- f4e3638 chore: fix docs reference in cli [MLG-891] (#7926)
- a4fdd0a chore: Improve failure diagnostics in shell test [FE-216] (#7932)
- b576b6b chore: less verbose mockery output (#7822)
- 9df980c chore(performance): add initial Makefile and README (#7914)
- 47a4070 chore: handle case where steps completed is more than max length (#7816)
- 1193acd chore: log k8s nil event objects at trace level and ignore (#7962)
- 1c86762 chore: max_slots_per_pod can be per resource pool [DET-9771] (#7923)
- 3267b1f fix: return searcher_metric_value as-is (#7961)
- b5845b7 fix: singularity agent env variable (#7960)
- 19d703b fix: mitigate user settings race conditions (#7905)
- f4382a7 fix(db): handle erroneous nulls from the summary metric migration (#7958)
- 65cabae fix(experiments): don't transition experiment to "" state on crash (#7956)
- aa30e86 docs: quick fix for version dropdown (#7952)
- 18e2f1f fix(scheduler): tolerate missing groups in priority scheduling by skipping them (#7947)
- 4b19928 chore: bunify & tidy up internal/user (#7886)
- 3f067a3 fix(allocation): allocation lifetimes should contain resource lifetimes (#7944)
- b6b5a84 Remove references to --auto-bind-mount (#7910)
- 2ba2580 chore(deps): bump tibdex/github-app-token from 2.0.0 to 2.1.0 (#7938)
- 2baaf30 fix: clear selection after action (#7921)
- 4db1c08 refactor: css in docs (#7934)
- a392dc8 fix: button in 404 page (#7936)
- 0e3f4d0 fix: avoid hiding tabs in single trial experiment [WEB-1651] (#7941)
0.26.0
Release Notes
Changelog
- 29705a8 chore: bump version: 0.26.0-rc3 -> 0.26.0
- 084e485 docs: add release notes for 0.26.0 (#7987)
- 2882e78 chore: bump version: 0.26.0-rc2 -> 0.26.0-rc3
- 623774e fix: Update Tasks Stats Causes Deadlock [DET-9853] (#7980)
- c11c5e4 fix: handle nil actor message and nil actor errors in agent RM (#7951)
- df6c317 fix: update `modified_at` by insert in user table (#7949)
- 9a795c1 chore: bump version: 0.26.0-rc1 -> 0.26.0-rc2
- d330154 chore: log k8s nil event objects at trace level and ignore (#7962)
- a8465d8 fix(db): handle erroneous nulls from the summary metric migration (#7958)
- ab9d933 fix(experiments): don't transition experiment to "" state on crash (#7956)
- 7b506f9 fix(allocation): allocation lifetimes should contain resource lifetimes (#7944)
- 2b12176 chore: bump version: 0.26.0-rc0 -> 0.26.0-rc1
- 78482b2 docs: quick fix for version dropdown (#7952)
- 0893e30 fix: clear selection after action (#7921)
- a9a382f fix: button in 404 page (#7936)
- 7f85454 chore: bump version: 0.26.0-dev0 -> 0.26.0-rc0
- 9ac8f82 chore: lock published urls to preserve redirects
- 6ba27fe chore: lock api state for backward compatibility check
- b52b3a6 chore: bump version: 0.25.2-dev0 -> 0.26.0-dev0
- c2cea7d chore: include api op and param description in py bindings (#7798)
- c98cc07 feat: Allow passing in swagger json as an argument (#7843)
- 482285f docs: Add another top nav link (#7933)
- d9e1bb5 chore: track dead code [WEB-258] (#7924)
- 537cc3d docs: Update launcher version to 3.3.8 for consistency with docs (#7915)
- 5b859c6 fix(cli): det model describe should call GET /model not GET /models (#7912)
- 262c33a docs: Clarify weighted fair-share scheduling policy (#7913)
- 4a486bb feat: Add performance tests for endpoints used in the WebUI initial load [WEB-1459] (#7906)
- cc360ac feat: Add workspaces to the SDK client (#7883)
- c87ca94 fix: api_command.go does not merge map values when overrides TaskContainerDefaults [FE-114] (#7887)
- dc40688 chore: update docs ownership per discussion [INFENG-225] [skip ci] (#7907)
- ee79213 test: fix e2e_tests ray dependency. (#7925)
- 688ff63 fix: align items in task list (#7894)
- 694f44a feat: submit forms in modals by pressing enter [WEB-1130] (#7857)
- 376ea50 fix: Display data point in line chart when epoch is 0 (#7898)
- 8eff8ac chore: update user docs (#7902)
- 4441ac8 feat: Input should capture the Esc button and Clicks while focused [WEB-1251] (#7859)
- 0365ca7 revert: "chore(actors): remove pkg/actors usage from pods.go (#7658) [DET-9652]" (#7908)
- c1a0cf4 chore(actors): remove pkg/actors usage from pods.go (#7658) [DET-9652]
- ff2e16d fix: NTSC use workspace's agent group info (#7892)
- 8bef0d4 chore: no code owners for auto-generated files (#7896)
- 5278758 chore: increase Go's max line length to 120 (#7903)
- b09334d feat: Add display name to user list in cli [MLG-930] (#7901)
- 4301fc2cc feat: move UI related files to the UI kit. (#7852)
- 3f9a980 feat: Hide code related actions based on model definition size (#7854)
- dde10f1 Revert "ci: temporarily move e2e to only nightly [skip ci] (#7837)"
- 35aa028 feat: add an API to get an allocation's exit status (#7731)
- e56ed43 chore: prompt for docs in github question template (#7895)
- 4d827cd fix: remove unused go code (#7893)
- b93f0c9 feat: disable actions of unmanaged experiments/trials (#7874)
- 3b1bebf fix: Metadata deleting last row, cancelling delete [WEB-1655] (#7805)
- 9cddca9 Refactor: Use userSettings store in learning curve (#7783)
- 4a6afb1 feat: add config option to omit default resource pools (#7885)
- e5555ca fix: redefine user columns updated in postgres_users toUpdate (#7890)
- 29561a8 test: quarantine nightly cifar10-keras convergence test (#7780)
- 1cfc7f3 fix: Project move/delete updates UI state [WEB-1668] (#7870)
- 00dfcca test: enable command run tests for hpc (#7880)
- 90d66b8 chore: disable interactive matching for dev bindings (#7747)
- b6632d1 chore: postgres_users.go bun migration [DET-8238] (#7769)
- 826f2b4 feat: containerize performance tests [INFENG-222] (#7863)
- 66f6f4a ci: fix webui test results upload (#7877)
- 64ec5ed Revert "feat: add config option to omit default resource pools (#7696)" (#7878)
- b95d57f fix: use setPartial in experiment list setting (#7873)
- e18d5cf feat: add config option to omit default resource pools (#7696)
- 5e69b6c feat: k8s agent enable disable [DET-9750] (#7779)
- d2e5abb docs: Remove black borders on gif (#7872)
- 144dd0f test: remove deepspeed marks from dsat tests (#7871)
- 40b4341 docs: Add gif to the Readme (#7865)
- a5b29cf docs: Adjust diagrams replacing fluentbit icon (#7867)
- 096935f chore: update user docs (#7864)
- 6d5ad2d docs: Add page for using Determined Agent on Slurm/PBS (#7866)
- 79060ba chore: Remove imagenet (#7664)
- cdceeac feat: Display unmanaged experiments with label (#7861)
- 16a6262 fix: Make models list editing work via ModelActionDropdown [WEB-1603] (#7799)
- 84a6612 fix: doc url in jupyter config modal (#7862)
- 5e04699 fix: fix how we are calling the bert embedding example (#7851)
- d8c7bd2 docs: Clarify meaning of trial api (#7818)
- 813ed36 fix: error message for `det agent [enable|disable]`. (#7839)
- 9132dcb feat: expose `externalExperimentId` and `externalTrialId` (#7840)
- 9874951 chore: bump version: 0.25.1-dev0 -> 0.25.2-dev0
- a5bdfa7 docs: add release notes for 0.25.1 (#7850)
- 38dc440 chore: trial actor refactor (#7821)
- f3aaf4d fix: pass configString once to createexperimentmodal (#7849)
- 346d4aa chore(deps): bump tibdex/github-app-token from 1.8.2 to 2.0.0 (#7847)
- 93e5341 feat: helm ca.cert injection, cluster-wide non-namespaced res creation flag, password change and minor-fix (#7808)
- 442bac6 feat: backend support for inference metric tracking part 2 (#7592)
- 06080b9 feat: allow metrics with duplicate keys and the same value [MLG-890]. (#7820)
- e42c973 feat: enable display of metrics with floating point epoch [MLG-857] (#7829)
- 3c9e0e2 feat: add new API endpoint to get and post accelerator data (#7723)
- febbe18 fix: enable RP bindings management for workspace admins (#7834)
- e843173 ci: temporarily move e2e to only nightly [skip ci] (#7837)
- 4956673 fix: display `progress` value as it is (#7836)
- 4d74e95 refactor: flipped k8's enable reattach to always true [DET-9726] (#7692)
- 4010b74 chore: nil exception on GetResourcePoolsRequest error (#7835)
- 12d393f fix: dupe checkpoints (#7833)
- 6b12390 fix: log viewer not updating when page switched (#7823)
- 590ea21 ci: fix check-rebaseable syntax [ci skip] (#7826)
- a99385d ci: Add a newline to the output for pre-check (#7824)
- c2ce179 chore: support binary output via dev curl (#7778)
- d78bafe fix: SSO button text color (#7819)
- b4f6f0c fix: correct useResize hook to return proper element sizes [WEB-1656] (#7807)
- e31a077 chore: Split out partial updates into setPartial (#7815)
- 8bcee31 chore: make pre-commit dev setup opt-in. (#7774)
- 675de43 chore: minor copy change (#7810)
- dceb00c chore: agent device discovery too greedy (#7802)
- 1284914 chore(deps): bump actions/checkout from 3 to 4 (#7786)
- 9651c9b fix: progress filter in exp (#7811)
- c284b09 fix: lower severity of allocation log changed when debugging (#7803)
- ff830f9 fix: Learning curve will send falsey metricType (#7809)
- 3c2ab1e chore(deps): bump tibdex/github-app-token from 1.8.0 to 1.8.2 (#7772)
- da23134 docs: HPC launcher doc tweaks, add image scheme docker-archive:// (#7812)
- 1a89c56 docs: Add sections on HPC upgrade and package verificaiton (#7804)
- a51892e fix: Avoid dropdown repeating in ExpList fields dropdown [WEB-1598] (#7800)
- fab413b chore: tools/k8s doesn't use coscheduler (#7795)
- 2b95373 docs: Update the installation guide (#7762)
- 8291a18 ci: quarantine some flaky nightlies (#7725)
0.25.1
Release Notes
Changelog
- 39a421a chore: bump version: 0.25.1-rc2 -> 0.25.1
- 61c11df docs: add release notes for 0.25.1 (#7850)
- e0d0ed2 chore: bump version: 0.25.1-rc1 -> 0.25.1-rc2
- 74eeb77 fix: enable RP bindings management for workspace admins (#7834)
- 1d8e3d2 fix: display `progress` value as it is (#7836)
- 2c86593 chore: bump version: 0.25.1-rc0 -> 0.25.1-rc1
- 117b173 fix: log viewer not updating when page switched (#7823)
- b93bc72 fix: SSO button text color (#7819)
- cfdacb4 fix: correct useResize hook to return proper element sizes [WEB-1656] (#7807)
- 81b673d fix: progress filter in exp (#7811)
- 29ad1d1 fix: Learning curve will send falsey metricType (#7809)
- b0a7e4e fix: Avoid dropdown repeating in ExpList fields dropdown [WEB-1598] (#7800)
- ebd1906 docs: Update the installation guide (#7762)
- 1bf08e5 chore: bump version: 0.25.1-dev0 -> 0.25.1-rc0
- 7f7e89b chore: lock published urls to preserve redirects
- 59ebdf0 chore: lock api state for backward compatibility check
- 9b6c6c7 fix: Get distributed jobs working with devcluster [FE-181] (#7785)
- d786078 chore: revert trial actor refactor (#7797)
- f4ca02a docs: quick fix for version dropdown (#7796)
- 72d34d9 chore: reduce master log noise (#7794)
- 12a513c chore: Create/document a mechanism to run the nightly tests on a PR [FE-146] (#7750)
- af24954 Revert "chore: track dead code [WEB-258] (#7767)" (#7793)
- c815f76 fix: handles custom TLS certs in enrich_task_logs.py [DET-9803] (#7782)
- 5c83901 chore: remove empty `determined/common/api/checkpoint/`. (#7776)
- a2b873f chore: suppress the daemonize message on HPC jobs (#7775)
- 5f2f6b8 fix: glitchy width in code editor (#7771)
- bdeb0ea chore: track dead code [WEB-258] (#7767)
- 2167292 chore: trial actor refactor (#7559)
- 06e361e fix: correct date range for avg queued time charts [WEB-1621] (#7754)
- 7c765ae refactor: remove fluent bit & replace with slurm log shipper [DET-9704] (#7639)
- 214198d fix: include `unmanaged` field in `GetExperiment`. (#7768)
- f8caa0e feat: Create performance tests [WEB-1458] (#7741)
- b0badb2 fix: Handle chart x-axis with all points at x=0 [WEB-1622] (#7760)
- a6d0fba chore: rearrange log level constants (#7752)
- b22f652 chore: ignore flake8 import restrictions pre-commit check (#7759)
- ef8a295 chore: bump version: 0.25.0-dev0 -> 0.25.1-dev0
- 9333c9d docs: add release notes for 0.25.0 (#7756)
- 8af14ab chore(actors): refactor pod.go (#7617)
- 046e060 test: make error checking case insensitive fixing rbac test (#7749)
- 6c530d3 build: fix `go-version-check` command (#7751)
- 418931b refactor: make glide-table conform to standard event handler pattern and fix paginated row selection bug [WEB-1471, WEB-1561] (#7704)
- 93d861d chore: use Message for no data in ComparisonView (#7654)
- 4d25428 docs: tweak brew instructions (#7743)
- cf57ce4 chore: upgrade go 1.20 to 1.21 (#7657)
- 0553f19 fix: make rbac messages consistent (#7745)
- e1675b2 fix: not all resource pools should be labeled "default" [WEB-1600] (#7744)
- 2e907ce fix: resource pool card workspace tweaks (#7732)
- 5154c3b fix: proxy tunnel server should use `SO_REUSEADDR`. (#7735)
- e845836 fix: React build issue (#7742)
- 2fc5f57 fix: Faster polling for first experiment metrics [WEB-1576] (#7740)
- ad46c44 chore: add a new assertion method to check command exit status and report any errors (#7737)
- 418d5ae fix: allow deletion of workspaces when case-insensitive matches exist (#7738)
- 690d451 docs: Reorganize model dev guide sidenav (#7713)
- 9936984 fix: properly display group metrics in metrics tab charts [WEB-1604] (#7727)
- 15e150b fix: allow zeroes for user agent id and group agent id (#7730)
- 8c84750 fix: catch correct import error and set tensorboard logging to false for --test --local (#7715)
- b6f4f30 fix: allow NodeInformer to fail with permission error [DET-9772] (#7703)
- 00a0bc3 fix(cli): not found errors should retain useful context (#7733)
- 688ea88 fix: fix failing e2e_cpu tests (#7734)
- 613e0ce chore: New constructor for Determined objects using existing session. (#7663)
- 108ffea fix: backfilled tasks weren't seen as trial tasks (#7729)
- 1fcd2f9 docs: Add user guides to the Documentation section (#7721)
- d20f577 fix: changing x axis type should reset any current custom zoom (#7728)
- ef5ae83 chore: update determined cli to handle timestamp format for external jobs (#7668)
- 2eadef1 feat: show external jobs on the resource pool page (#7666)
- 2a570bb chore: crash cluster given RM crash (#7621)
- b66ff4a fix: correct GPU name for A100-80GB. (#7724)
- fdddcbf chore: Add nightly tests to release branches (#7720)
- ae6c927 fix: reset chart min/max when changing xaxisdomain (#7719)
- 5e6af2a fix: properly encode metric to keys for LineChart and ParallelCoordinates (#7714)
0.25.0
Release Notes
Changelog
- fea5014 chore: bump version: 0.25.0-rc7 -> 0.25.0
- 3201f27 docs: add release notes for 0.25.0 (#7756)
- 29fbea2 chore: bump version: 0.25.0-rc6 -> 0.25.0-rc7
- 16509f6 test: make error checking case insensitive fixing rbac test (#7749)
- 79a5faa chore: bump version: 0.25.0-rc5 -> 0.25.0-rc6
- 9fde7c7 fix: not all resource pools should be labeled "default" [WEB-1600] (#7744)
- 154168f fix: resource pool card workspace tweaks (#7732)
- d3a42d7 chore: bump version: 0.25.0-rc4 -> 0.25.0-rc5
- 41f9251 fix: React build issue (#7742)
- 1c81e4c chore: bump version: 0.25.0-rc3 -> 0.25.0-rc4
- c4443b3 fix: make rbac messages consistent (#7745)
- 1a83243 fix: allow deletion of workspaces when case-insensitive matches exist (#7738)
- f298ebf fix: properly display group metrics in metrics tab charts [WEB-1604] (#7727)
- 450d1b5 fix: allow zeroes for user agent id and group agent id (#7730)
- fd13b29 chore: bump version: 0.25.0-rc2 -> 0.25.0-rc3
- c121387 chore: bump version: 0.25.0-rc1 -> 0.25.0-rc2
- 1cea783 fix: allow NodeInformer to fail with permission error [DET-9772] (#7703)
- 5d5718f fix(cli): not found errors should retain useful context (#7733)
- 98e0621 fix: backfilled tasks weren't seen as trial tasks (#7729)
- 306baaa fix: changing x axis type should reset any current custom zoom (#7728)
- 406656f chore: bump version: 0.25.0-rc0 -> 0.25.0-rc1
- 00f3af9 fix: correct GPU name for A100-80GB. (#7724)
- 4dd33e4 fix: properly encode metric to keys for LineChart and ParallelCoordinates (#7714)
- a028cd2 chore: Add nightly tests to release branches (#7720)
- 17796e3 fix: reset chart min/max when changing xaxisdomain (#7719)
- 05f808b chore: bump version: 0.25.0-dev0 -> 0.25.0-rc0
- 1cb537e chore: lock published urls to preserve redirects
- 7196181 chore: lock api state for backward compatibility check
- 6e39429 chore: bump version: 0.24.0-dev0 -> 0.25.0-dev0
- 1c8ce3f fix: add missing workspace_id from get_templates (#7706)
- efdc70b fix: code cleanup for mapx unit test (#7710)
- 0a34529 fix: add unit test cases for mapx methods Values and Clear (#7699)
- 1a4bee4 feat: `det deploy gcp` support for a2-ultragpu and g2-standard. (#7702)
- 2fd2535 fix: users can see inaccessible RPs (#7707)
- e83660c fix: rp bindings intg test failure (#7701)
- f1e9b72 chore: Remove estimatortrial (#7700)
- e2f2173 feat: replace clone function with structuredClone and add polyfill (#7624)
- f528cc6 fix: botched rebase/rename in the detached mode. (#7695)
- 08de858 fix: Continue Trial modal does not reset mode [WEB-1566] (#7688)
- f9c5600 fix: error message in jupyter (#7693)
- 2607fd1 fix: patch workspace has duplicate update statements (#7697)
- bf1b87b fix: correct outstanding error in mapx (#7698)
- cdc41e9 fix: add type check to pod spec merge (#7691)
- e340d50 chore: add Values and Clear methods for mapx (#7669)
- 6bc3c68 docs: algolia scraper to scrape only xml (#7690)
- 0977986 docs: fix new release notes (#7694)
- 76134b2 chore: dev cli support for calling master apis (#7462)
- 43c715b docs: add release notes for 0.24.0 (#7680)
- c513e70 chore: add new RBAC permission view external jobs (#7671)
- 128b106 docs: work around bug causing version dropdown to fail (#7685)
- 26559ed feat: check if default resource pools are bound (#7687)
- 7d8fce5 docs: improve writing of the github readme (#7689)
- 857309f feat: add rp bindings permissions (#7673)
- 5c68568 chore: api intg tests [DET-9725] (#7589)
- 1ad812d docs: Improve the GitHub Readme (#7613)
- 8239b19 fix: default pools editable and submittable (#7647) (#7672)
- 4f60d64 chore: unpin click version (#7684)
- ec90842 chore(deps): bump arduino/setup-protoc from 1 to 2 (#7537)
- b4cbe9c chore: enable mask closable by default for drawers (#7676)
- b1f02ed chore: limit reported slots (#7683)
- 75c1f17 fix: tensorflow version for macos (#7679)
- 0a5b406 fix: allow special characters in user manangement filter (#7681)
- 1af32e3 feat: Update user.modified_at when user added or removed from groups (#7665)
- 23a8224 fix: Case-insensitive client-side username search [DET-9770] (#7677)
- a368a55 chore: limit reported slots (#7648)
- 60a07e4 fix: make -C master clean build [DET-9333] (#7660)
- fcf7807 chore: custom metrics group in new experiment list (#7518)
- dc006b1 docs: Fix formatting (#7670)
- 6ee0521 docs: Introduce users to pachyderm w det (#7661)
- 4ef8b20 feat: detached mode v1 / core api v2. (#7060)
- d580ecf fix: allow checkpoints to be GCed without validation metrics and add tests (#7653)
- 204caa5 fix: optional chaining in `extractMetricValue` (#7662)
- 0ead6ac chore: telemetry actor refactor [DET-9663] (#7585)
- e437bd8 docs: Point to pytorch distributed launcher (#7649)
- 08ff4be docs: fix epoch metrics article (#7643)
- 395aa40 docs: Update resource pool to workspace mapping (#7642)
- 28cbdc2 fix: properly show the pagination for experiment list paged view (#7638)
- 68624f6 chore: avoid creating new table columns for non-legacy metrics (#7656)
- e558063 chore: Add eslint rule for imports to take one line [WEB-1567] (#7650)
- 0bc5dce ci: bump everything to torch==1.11 (#7599)
- 4eb6ebe fix: Project delete/move triggers update of workspace projects list [WEB-1497] [WEB-1377] (#7646)
- 145dd63 ci: indicate GHA run URL when reporting a cherry-pick conflict (#7635)
- 8d2b531 chore: Show Tooltip instead of actions for Default Resource Pools [WEB-1554] (#7644)
- 0bab558 fix: select component width (#7640)
- 32fac33 chore: use eslint rule to avoid relative imports through parent [WEB-1496] (#7637)
- dfd5475 chore: remove unused parseFloat for decoding string metric values (#7641)
- e472ea7 fix: alphabetical binding workspaces and search copy change [WEB-1552, WEB-1553] (#7633)
- 7b96933 fix: properly clear out the settings from the database [WEB-1559] (#7636)
- e9e66b1 fix: fix incorrect return type for downsampled metrics (#7618)
- 930fc9d feat: custom metric groups (formally known as types) [WEB-1469] (#7570)
- 75e93d9 docs: bump rstfmt version (#7611)
- 34c5b5a fix: trigger jobs fetchAll on pagination changes [WEB-1546] (#7602)
- 29e63af Remove say workaround and update version (#7628)
- 0a24176 chore: fix pod-spec merge logic (#7574)
- ce3136a feat: Don't show charts where all series are Loaded(no data) [WEB-1524] (#7609)
- d3c027b feat: OptionsMenu moved to left group (#7623)
- 4d002c4 docs: Add article on how to view epoch metrics (#7504)
- 4327a25 fix: rp binding resolving resource pools (#7629)
- f79e95e ci: fix release branch selection when cherry-picking EE PRs (#7630)
- d6a5c79 fix: Handle metric names finish loading, but still empty (#7634)
- 2844566 chore: support mobile view in UIKit [WEB-1314] (#7626)
- 3ec9c49 fix: button filter text (#7632)
- 0d293b5 fix: ChartGroup vertical spacing (#7631)
- 6fd4b21 feat: replace custom `isEqual` to lodash `isEqual` (#7625)
- fa91629 feat: add searcher metric sorting (#7614)
- 5ff0b7d fix: avoid converting workspace name to sentence casing [WEB-1548] (#7622)
- 37259d4 fix: treat searcher metrics value as a number in the ui (#7612)
- bbe70e0 feat: Resource pool tab for workspace (#7582)
- 369ddf3 feat: Copy cell value from experiment list table (#7604)
- 3cb434d fix(actors): trial lifetime must contain allocation lifetime, still (#7615)
- 033a9f6 fix: Single-point tooltip closes when mouse exits chart [WEB-1541] (#7595)
- b8f95ad docs: Add css rule to turn off scrolling when clicking on section links (#7610)
- 9994aa3 fix: code editor height issues (#7573)
- acdd6c4 docs: improve a release note (#7601)
- 37cc9f0 chore: add agent --image-root (#7597)
- 69ab985 refactor: trial's can have one or many tasks [DET-9647] (#7355)
- 7c076b2 ci: fix remote name in PR tracking script (#7607)
- b2ee9b3 fix(actors): create valid fake group actor for checkpoint GC, don't leak it (#7606)
- 776be10 fix: fix checkpoint gc which was incorrectly deleting some checkpoints (#7523)
0.24.0
Release Notes
Changelog
- 620162e chore: bump version: 0.24.0-rc5 -> 0.24.0
- 809eda9 docs: add release notes for 0.24.0 (#7680)
- 399ad9c chore: bump version: 0.24.0-rc4 -> 0.24.0-rc5
- b1b2bf2 fix: allow checkpoints to be GCed without validation metrics and add tests (#7653)
- 378fe8d chore: bump version: 0.24.0-rc3 -> 0.24.0-rc4
- c1ffb75 docs: fix epoch metrics article (#7643)
- abf253b make error message more accurate (#7659)
- d57c5ec fix: default pools editable and submittable (#7647)
- b7aa1f4 chore: bump version: 0.24.0-rc2 -> 0.24.0-rc3
- 9095ee8 fix: properly clear out the settings from the database [WEB-1559] (#7636)
- 328ff9a docs: Add article on how to view epoch metrics (#7504)
- 39fa1db fix lint from #7629
- 24b89f7 fix lint from #7634
- 8806cd5 fix: Handle metric names finish loading, but still empty (#7634)
- a104da8 Revert "fix: Handle metric names finish loading, but still empty (#7634)"
- ed5e38c fix: rp binding resolving resource pools (#7629)
- aafcff8 fix: Handle metric names finish loading, but still empty (#7634)
- 6846471 fix: treat searcher metrics value as a number in the ui (#7612)
- 7aea1b4 fix(actors): trial lifetime must contain allocation lifetime, still (#7615)
- 524cc58 chore: bump version: 0.24.0-rc1 -> 0.24.0-rc2
- 1f0b33e fix: fix checkpoint gc which was incorrectly deleting some checkpoints (#7523)
- 57d7bc4 fix(actors): create valid fake group actor for checkpoint GC, don't leak it (#7606)
- 3a20df1 chore: bump version: 0.24.0-rc0 -> 0.24.0-rc1
- d17dbfd chore: bump version: 0.24.0-dev0 -> 0.24.0-rc0
- d7b2f5c chore: lock published urls to preserve redirects
- 4931751 chore: bump version: 0.23.5-dev0 -> 0.24.0-dev0
- bfd964c fix: proto user always has false remote (#7126)
- 5bc1666 chore: fix incorrect make invocation (#7605)
- e960b57 Update README.md (#7587)
- c957354 chore: bumpenvs 24.0 (#7594)
- 3c44a2f fix: ensure selected file path matches loaded file in codeeditor (#7563)
- 84a5800 fix: fetch the latest projects (#7598)
- 50fc3e7 chore: update comment to clarify endpoint behavior (#7583)
- 38a49cf docs: update doc string related to adding enable_tensorboard_logging flag (#7600)
- 549273c feat: Enable disabling Tensorboard logging [MLG-22] (#7508)
- 6d6d5ad chore: Remove ptl adapter (#7591)
- 306e0df fix: remove flaky component if new xp list is active. (#7590)
- db7ae17 ci: Add pytorch2 tests (#7581)
- 62cc08c feat: Move new charts into always-on Metrics tab [WEB-1522] [WEB-1523] (#7542)
- 5cf569d docs: fix lint for docs/architecture/introduction.rst (#7586)
- 7c6aead chore: migrate to singularity --nvccli (#7576)
- 923a742 chore: Make explist_v2 generally available (#7561)
- efc4458 Added Profiling to the Benefits table in the intro (#7580)
- 4ad4a76 fix: quick disambiguation on exp. checkpoint size (#7579)
- d40ae77 chore: fix rph docs url publishing step (#7487)
- 5f21741 docs: edit release notes readme (#7578)
- cd19761 fix: Button icon spacing (#7568)
- 22ca050 feat: get unbound pools endpoint [DET-9696] (#7527)
- d05e530 test: unpinning responses (#7551)
- d63d903 chore: Parse and format APIExceptions (#7531)
- ada554f docs: edit some release notes (#7540)
- 9d44ffc docs: add sphinx-tabs extension (#7577)
- 3623a23 docs: Edit the release notes readme (#7572)
- 04fb7ef chore: summarize state of mock_client_test.go [DET-9731] (#7571)
- 708db2d fix: Accomodate partial experiment list settings (#7564)
- 345da9c chore: add linter for google style guide-compliant python imports. (#7550)
- 2969f53 chore(rm): refactor cluster management APIs into RM (#7569)
- f10b474 chore: update type to handle generic summary metric types [WEB-1538] (#7567)
- 577cebe chore: Add web as codeowner for /webui (#7565)
- 56a59c0 fix: hide `cluster logs` in mobile view (#7557)
- 4bcb04b fix: log level filter broken on det t/e logs [MLG-798] (#7558)
- 88338c1 fix: overwrite bindings not working for zero length list (#7549)
- 3258b92 fix: correct name for external jobs (#7566)
- fd7edc8 fix: scrolling when dragging column headers in Glide Table (#7548)
- fbcc4b5 fix: Dont seek min and max on projects with 0 experiments (#7560)
- 6f448b4 feat: Replace sum and count training metrics with mean in new experiment list (#7493)
- de9e4f5 chore(actors): refactor checkpoint GC tasks (#7435)
- da4696b chore: master linting less verbose (#7553)
- 99279ad feat: Provide an interface to enable resource managers to show External jobs on the resource pool queue. (#7070)
- 42feb36 docs: apply minor tweaks (#7554)
- 3534b49 docs: correct some `det deploy gcp` docs facts. (#7556)
- f42cdcc chore: add copy to manage bindings modal (#7555)
- 88a93b0 chore: update default exp list columns [WEB-1488] (#7534)
- cc8cb63 fix: Groups modal does not include inactive users [WEB-1256] (#7528)
- b693ddd build: convert svgs to react by default (#7541)
- 01bdf93 feat: Heatmap support for glide table (#7267)
- 277e124 chore: update fmt-sql config and version (#7544)
- db8e965 ci: setup CODEOWNERS for ml-sys team. (#7546)
- 318ffda chore: Add new `update` function to userSettings store (#7469)
- bde72d6 chore: use Dropdown for Experiment List menus (#7522)
- fcbc80e refactor: add Spinner to UI Kit [WEB-1451] (#7498)
- 03f503c ci: add a sql fromatter (#7538)
- ec32f2f ci: handle duplicate cherry-picks of a PR to release branch (#7502)
- 3db61f5 fix: compare charts showing no data while loading [WEB-1485] (#7516)
- 43ff696 fix: Improve HPC error shutdown to improve logging [FE-44] (#7488)
- a69ec27 chore: temporarily rollback agent usage of --nvccli (#7533)
- 90e131a fix: fix default pools and refactor (#7535)
- e86d47a fix: use proper experiment project resolution (#7532)
- c7ac817 fix: disable Manage Bindings option for default pools [WEB-1521] (#7519)
- 5f0ff57 fix: custom proxies do not work for trials in slurm [DET-9718] (#7529)
- 4722ee9 fix: filter scrollbar adjustment (#7530)
- 9867d4d fix: Selected experiments in glide table persist [WEB-1366] (#7289)
- 39013e9 chore: add option to syntax highlight cli json output (#7471)
- e75c7f4 chore: rename db metric references to custom_type (#7473)
- fa626b0 chore(actors): refactor allocation actor without actors (#7391)
- 303f0d2 chore: bump version: 0.23.4-dev0 -> 0.23.5-dev0
- 630f721 docs: add release notes for 0.23.4 (#7524)
- 2dd26d5 fix: Don't return a workload for deleted checkpoints [WEB-1505] (#7491)
- e13190a fix: Reset cluster jobs pagination when offset is out of bounds (#7521)
- ab21b6e chore: rollback torch 1.7 support removal. (#7525)
- 56d444e chore: update vite (#7505)
- ec0e750 style: update copy and add dividers to exp list table action dropdown [WEB-1490] (#7506)
- cddaf4f fix: rename checkpoints (#7513)
- 7b6f8e1 chore: remove double newline in cli error messages (#7472)
- 44f03fd fix: det tunnel should work with proxy port exposed [FE-121] (#7492)
- 28d18d9 fix: unbumpenvs. (#7496)
- d868241 feat: Pytorch2 necessary changes (#7515)
- 54c5a7d style: update experiment selection label [WEB-1510] (#7510)
- 0c4a19f fix: rp-workspace mapping RP not found [WEB-1508] (#7514)
- 9deeb48 fix: support `searcherMetric` (#7511)
- 525ebfe chore: consolidate k8s informers code & fix Makefile mocks (#7455)
- 5ef7d30 ci: put all GHA jobs for release tracking in a concurrency group (#7507)
- 47ac573 fix: Metrics with dot in name appear correctly in trial view [DET-9691] (#7450)
- 856a963 fix: k8s determined-container gets wrong RunAsUser (#7503)
- 18821cd ci: avoid latest responses==0.23.2 (#7501)
- c02fce2 docs: rp workspace mapping release notes (#7499)
- 33ab05a fix: typo that allows binding default aux pool (#7500)
- 4fde629 fix: avoid crashing the new exp page (#7489)
- 28953dc docs: FE-120: Add `job_history_enable = True` to PBS installation requirements (#7480)
- fcfcbe6 docs: Add RP to Workspaces user guide (#7326)
- 707221c docs: Describe WebUI settings (#7478)
- 8366613 chore: add migration number validation to migration util (#7347)
- 57c3e8c test: skip `test_efficientdet_coco_pytorch_const`. (#7494)
- 8471017 test: fix rbac test failures (#7476)
- aac4272 fix: stop showing invalid loading (#7470)
- 9327830 fix: ChartGrid styling (#7485)
- 7f35b7d fix: rp workspace mapping not working (#7490)
- d29d49d feat: backend support for inference metric tracking part 1 (#7375)
- db01c03 fix: chart tooltip overflow (#7484)
- 99ac604 fix: experiment list compare panel resize (#7477)
- b11d1cb fix: only let cluster admins manage resource pool bindings [WEB-1476] (#7483)
- 6f147fd docs: Add torch batch process example (#7482)
- b71e79a fix: rename checkoutCount to checkpoints (#7481)
- ebac1d3 ci: Conda bump (#7479)
- 1e021bc ci(aws): fix RDS connections (#7475)
- 19945b6 ci: correctly change item status in release tracking (#7460)
0.23.4
Release Notes
Changelog
- f5484dd chore: bump version: 0.23.4-rc4 -> 0.23.4
- 9c8ea0c docs: add release notes for 0.23.4 (#7524)
- 91e6b82 chore: bump version: 0.23.4-rc3 -> 0.23.4-rc4
- 8afa912 fix: rename checkpoints (#7513)
- d8d671b chore: bump version: 0.23.4-rc2 -> 0.23.4-rc3
- 6b26541 fix: unbumpenvs. (#7496)
- fdc9acf fix: rp-workspace mapping RP not found [WEB-1508] (#7514)
- 99f9cc3 fix: support searcherMetric (#7511)
- 65f76b2 chore: bump version: 0.23.4-rc1 -> 0.23.4-rc2
- dc4472e fix: k8s determined-container gets wrong RunAsUser (#7503)
- 54b6602 ci: avoid latest responses==0.23.2 (#7501)
- ecbfe61 docs: rp workspace mapping release notes (#7499)
- 3131f43 fix: typo that allows binding default aux pool (#7500)
- 593ace0 fix: avoid crashing the new exp page (#7489)
- 557a272 docs: Add RP to Workspaces user guide (#7326)
- 7fcff5a test: skip test_efficientdet_coco_pytorch_const. (#7494)
- 8648e05 test: fix rbac test failures (#7476)
- 17ac0d6 chore: bump version: 0.23.4-rc0 -> 0.23.4-rc1
- 8db5076 fix: rp workspace mapping not working (#7490)
- ecaad56 fix: rename checkoutCount to checkpoints (#7481)
- ed19b69 ci: Conda bump (#7479)
- 66f67f9 ci(aws): fix RDS connections (#7475)
- 2d64e16 chore: bump version: 0.23.4-dev0 -> 0.23.4-rc0
- 4dda861 feat: RP<>workspace mapping (#7461)
- 9bffd21 chore: add a toy experiment example for pushing generic metrics (#7442)
- 7e7ac8b fix: use friendly names for user settings (#7465)
- fda7235 chore: Turn new experiment listing off by default (#7466)
- e6cabd6 docs: rephrase the home page title (#7463)
- 4e9748f feat: Allow user setting of feature flags (#7438)
- 307e2ee chore: avoid postgres uuid extension (#7459)
- a66db23 chore: explist_v2 feature switch should default to on (#7249)
- 3ba9904 style: mobile version of exp list v2 [WEB-1422] (#7433)
- 6fe9218 test: mark RBAC-related tests 'e2e_cpu_rbac' (#7401)
- aafac46 fix: chart axis long label (#7451)
- 9e37ed4 docs: Reset home page tiles and sidebar (#7446)
- 8dc0cfd fix: wider width for top trials select (#7431)
- b67d981 ci: fix latest release branch selection (#7432)
- c915f84 chore: podspec-capability-bugfix (#7447)
- 2cc2656 feat: Pin column when compare panel open (#7414)
- 04485b2 chore: update docs hpe compliance web section id (#7449)
- 78d8baa feat: minor edit to torch batch process embedding example (#7440)
- 2ff3d2e chore: allow no pinned columns [WEB-1423] (#7419)
- 210c446 feat: sort resource pools by name when creating new jupyterlab sessions [WEB-1138] (#7444)
- 2e776a4 fix: chart tooltip has exp. name (#7410)
- bd470b6 feat: Merge data and row height menus in explist v2 (#7405)
- 9caaae8 docs: update aria labelling for sidebar toggles and toctree groups (#7381)
- 9de58bd fix: update get agents call for agent disable all (#7441)
- 2d3120d feat: master audit logging should log failed requests at Info level. (#7434)
- 34347a1 feat: Compare icon reflects status (#7443)
- c409c31 style: apply hover color for selected table rows (#7429)
- 639e34a fix(cli): compound keys in --config opts for commands (#7439)
- 70e65df feat: allow-pachyderm-notebook-extension [DET-9355] (#7395)
- bfdccdf docs: Add documentation about passing in optional tensorboard arguments [MLG-333] (#7352)
- 07e12f7 fix: table column copy change (#7373)
- 8300458 chore: events & preemption listener actor refactor [DET-9617] (#7256)
- 497df17 feat: [MLG-647] Add batch inference examples using Core API and torch_batch_process (#7274)
- 119f577 chore: fail with non-zero exit code when password change fails (#7403)
- 06c7d0b feat: edit and reset raw user settings [WEB-1361] (#7377)
- f98d15a fix: perf improvement in useTrialMetrics (#7426)
- 2ce5aaf fix: clear table selection when changing pages (#7418)
- 533f3a7 chore: migrate to singularity --nvccli [DET-9081] (#7337)
- 1002645 ci: fix e2e tests for tf keras cifar10 example (#7422)
- 71243e3 chore: agent provisioner refactor (#7287)
- d82f35f fix: Correctly sequence updates in useSettings bridge code (#7423)
- 0fc6238 feat: Add Bert embedding torch_batch_process example (#7402)
- 22f82a1 feat: obfuscates slot id in agent summary(#7421)
- 0855b78 feat: Settings drawer (#7356)
- 6da387b fix: fix: label default compute and default aux pools differently (FE-108) (#7420)
- e2112c9 chore: k8s pod & node informer refactor (#7182)
- c20ad5a fix: show loading state when applicable for comparison view tabs (#7370)
- 30a20ba fix: All metrics viewable in the glide table compare tab [WEB-1372] [WEB-1394] (#7412)
- 8ceb955 docs: remove RBAC limitations section (#7415)
- 403754b ci: warn about PRs that are from forks but by users with write access (#7380)
- ee68c3d perf: migrate checkpoint v1 into v2 table (#7325)
- 3e9eaed chore: skip process auth admin check if rbac is enable (#7400)
- af9d562 ci: make CASPER_TOKEN optional in track-pr script (#7379)
- c510035 chore: bump version: 0.23.3-dev0 -> 0.23.4-dev0
- 37e6d9c docs: add release notes for 0.23.3 (#7413)
- b795a23 fix: custom column resize [WEB-1341] (#7365)
- 227a684 fix: replace div with table in Trial Comparison View (#7341)
- 59f34bd chore: update data loading in Keras CIFAR10 example (#7378)
- 9464a1f chore: change compare tab refresh behavior [WEB-1434] (#7397)
- f7a6ba4 chore: add release notes for notebook tls (#7408)
- 6ef9862 docs: hpe docs compliance for section 5 [WEB-1292] (#7383)
- 3718a4f chore: update left over api references to metric type (#7399)
- 4fdf5ae fix: userSettings store should function correctly when useSettings is active (#7409)
- 5ae6311 chore: Workspace as an object in the SDK. (#7387)
- 638f56e chore: add generic metrics support to harness (#7407)
- 60987a5 fix: refactor checkpoint modal so it closes correctly [WEB-1441] (#7398)
- fccd378 ci: Fix GKE cluster version used for CircleCI (#7385)
- 28f2c03 fix: add fp16 flags to hf ds examples (#7265)
- 2ac22f0 fix: install sigusr1 on main thread only. (#7350)
- 03c3faf fix: memoize settings in model detail and experiment detail pages (#7390)
- 056d414 chore: Clean up redundant aria-labels, correct wrong aria-labels [WEB-1379] (#7384)
- 1f5b9a0 fix: rbac filter on columns API (#7386)
- a0a87cd feat: add update master config API end point with RBACK and CLI command (#7318)
- 839dcbd chore: bump up flake8 to 3.9.2 (#7389)
- 7eaea60 chore: rename metric type concept to metric group (#7353)
- dd15be9 fix: add condition by default in filter (#7362)
- fe85366 fix: new exp detail crash [WEB-1444] (#7382)
- 4e80c72 fix: Metrics reporting no data (#7368)
- b9d820e fix: make checkpoint_count explicit (#7374)
- 2d66b53 ci: fix some bugs in tracking of cherry-picked PRs (#7367)
- 13a6ad8 chore: clean up comments in trial and metric code (#7376)
- 5aed4c5 chore: update make_url utility with fallback (#7259)
- 03bb110 chore: use det custom json encoder in print_json (#7371)
- 4f92cab fix: use DEFAULT_COLUMNS in GroupManagement table (#7366)
- bea92ef feat: add training metrics to columns api (#7320)
- b2fadf1 fix: Trials use their own trial state enum [DET-9639] (#7354)
- 90257d6 feat: allow merging metrics with same batch number (#7304)
- 5f18070 ci(circle/test-unit): regroup gpu unit tests (#7345)
- c2a8113 fix: handle long values (#7349)
- 4c41432 docs: add HPE marketing analytics code (#7269)
- 223f28d fix: TypeError 'NoneType' object processing exp config (#7290)
- 43abf42 fix: Fix issue in user settings store (#7363)
- 1640d7a fix: reverting not-found-errs change in python user-groups (#7361)
- 6e0c2bf chore: bumpenvs (#7348)