Releases: determined-ai/determined

0.16.0

15 Jun 00:27

Changelog

f5a590b chore: bump version: 0.16.0rc4 -> 0.16.0
091e039 docs: add release notes for 0.16.0 (#2575)
580e60b chore: bump version: 0.16.0rc3 -> 0.16.0rc4
be47a79 docs: update the JupyterLab bump release note (#2567)
e532761 fix: don't return dupes from det model list-versions (#2564) [DET-5640, DET-4248]
5fa3f22 chore: bump version: 0.16.0rc2 -> 0.16.0rc3
438b112 perf: optimizations to query batching fetch profiler metrics [DET-5637] (#2559)
11fff5b chore: bump version: 0.16.0rc1 -> 0.16.0rc2
ffe65cd fix: Change wording on modals that edit configs. (#2562)
89649c7 fix: set elastic ip domain to vpc in det deploy aws (#2557)
e48cd1d fix: dedup BindMounts and Devices on merge (#2560)
9938be7 fix: use model instead of schema struct for de-duping (#2545)
e0c8dec docs: extend docs for the client module (#2556)
2687f4d docs: add python sdk docs (#2547)
5977ce0 chore: also set cli_cert in dtrain worker processes (#2555)
d6edb9e chore: bump version: 0.16.0rc0 -> 0.16.0rc1
62e99c0 chore: fix typos (#2554)
2736600 chore: rename profiler tab in webui (#2551)
4cee9fa fix: Incorrect help link when profiles aren't enabled for a trial. [DET-5621] (#2549)
d623f5f chore: rename start_on_batch to begin_on_batch everywhere (#2553)
61be955 chore: revamp experiment and trial pages header [DET-5406] (#2456)
ee72cdc fix: add bumpenvs for tf-2.5 images. (#2552)
6010642 chore: bump version: 0.16.0.dev0 -> 0.16.0rc0
b910703 chore: lock api state for backward compatibility check
95f7d88 chore: bump version: 0.15.6.dev0 -> 0.16.0.dev0
d5145fe docs: Release notes for 0.15.6. (#2493)
068bb33 fix: prevent zoom reset if chart is already zoomed [DET-5514] (#2525)
3f44c83 fix: stop parsing notebook config on every edit [DET-5605] (#2528)
03b28be chore: fix client for new password handling (#2546)
fe05b0b chore: avoid defaulting to filter by current user [DET-5602] (#2540)
1e945af feat: expose a default Determined in det.experimental.client (#2532)
76230f8 chore: remove swagger-generated python code (#2541)
c7ac21d fix: password handling in python sdk. (#2543)
56dd19d feat: pull tensorboard images from experiment configs (#2544)
48ceaf2 fix: fix hparam string representation failure [DET-5616] (#2539)
8dfa088 feat: pull tensorboard images from experiment configs (#2534)
0ebeba3 chore: fix dropped cert argument in Authentication (#2542)
d0adc51 feat: multimaster Authentication objects [DET-5308] (#2531)
f1c9b1f feat: bump JupyterLab to 3.0.16 [DET-4872] (#2526)
12a8cae chore: bump default environment CPU and GPU images to tf-2.4 (#2523)
caf61c9 docs: add release notes for profiling features [DET-5351] (#2535)
deb4cbf chore: initialize cli_cert in e2e tests (#2530)
81eefc7 chore: bump transformers version for model-hub (#2522)
e9f5947 fix: add init_invalid_hp to master [DET-5569] (#2478)
ccdcaa8 chore: allow non-singleton Authentication (#2513)
0a887e9 fix: trial profiling system metric chart ignoring zero [DET-5505] (#2515)
0d9a540 fix: allow bumpenvs to update nvcr images in helm charts (#2520)
ec89928 feat: provide tensorflow 2.5 image [DET-5522] (#2517)
55c3353 docs: recommend users upgrade to 0.16.0 to avoid k8s master crashes (#2518)
a2f6fc2 chore: improved pynvml usage by profiler [DET-5394] (#2487)
a06d3a2 chore: minor edits to cli behaviors (#2519)
2316057 fix: add back bindmounts entry to command's default config (#2521)
3d34e1c fix: notebook modal improvements [DET-5599] (#2511)
6db8263 feat: add experiment notes & name [DET-5352] (#2307)
17976404 chore: update urllib3 (#2504)
49aec0d feat: support back-filling in the priority scheduler [DET-5397] (#2436)
aec1074 chore: handle error when loading notebook config (#2512)
09fca00 feat: add bind mounts to task container defaults [DET-5362] (#2516)
4068fde chore: collect prometheus metrics (#2501)
ed896c7 fix: python api create experiment bug (#2510)
0c9ec27 fix: avoid rc dev release mismatch notifications (#2405)
4fd3326 chore: task list filters [DET-5390] (#2466)
1f49553 test: add e2e tests for profiling features [DET-5245] (#2481)
ec9932d chore: upgrade ws to patch security vulnerability (#2505)
3857f94 chore: add experiment name to breadcrumb on trial detail page [DET-5284] (#2318)
87b1e59 docs: add release note for printable config (#2507)
212aa93 chore: disable profiling after restart [DET-5424] (#2486)
24432fe docs: add profiling how-to [DET-5209] (#2384)
2b04bf0 chore: fix TrialsSnapshotResponse comment typo (#2492)
1ca42b4 chore: fix TF version detection and RNG usage in test (#2500)
106294a chore: migrate away from spot checks and move towards waiting for an expected case (#2495)
eecc446 fix: generating printable master config does not alter original (#2502)
bf9b3ac fix: observability webui fixes [DET-5567][DET-5246][DET-5506][DET-5531][DET-5530][DET-5571] (#2488)
5b73278 chore: improve profiler throughput collectors (#2490)
55b122e chore: remove native init() functions [DET-5574] (#2480)
6ac0268 chore: add testing for eventually schema [DET-5560] (#2467)
6f86594 chore: remove trial old messages and consolidate others (#2464)
bae9c2d chore: fix some semi-broken unit tests (#2483)
3f9f2da fix: ship gpu_free_memory correctly [DET-5508] (#2497)
0dae801 chore: add non-streaming APIs for trial profiler endpoints (#2484)
0b0e9ca chore: update eslint-no-unused-vars to handle special cases (#2496)
d81f8ad fix: notebook modal bugs [DET-5573] (#2476)
8ee598d chore: improve performance of tfevent file filtering (#2469)
341fb4f chore: trim unused parts of rendezvous info (#2381)
ba07a04 chore: promote profiler APIs out of unimplemented (#2485) [DET-5587]
3f53289 fix: send all batches from harness profiler [DET-5566] (#2473)
c520187 chore: deprecate det.experimental.create_trial_instance() (#2479)
b0f57d6 fix: ProfilingAgent serializing timestamps incorrectly (#2482)
6a67383 fix: propagate slots when it is 0 (#2477)
4b97010 chore: measure profiler timings with time.time() (#2475)
2e38dfa chore: reword README for schemas (#2474)
3d6e73d fix: show x axis label on all plots [DET-5500] (#2471)
2e83f22 fix: make tf estimator dtrain work with tf 2.5 [DET-5563, DET-3762] (#2468)
f893eee fix: timing metric chart x-axis tick off [DET-5501] (#2472)
aa8d442 chore: log running of migrations (#2463)
36139a1 docs: add instructions to use dtrain workflow for inference with PyTorch (#2386)
66c6452 feat: hook ProfilerAgent into harness and add profiler timings [DET-5062, DET-5204] (#2348)
c52c616 chore: move run increment to allocation not termination [DET-5559, DET-5450] (#2462)
feac8cf feat: add launch notebook modal [DET-5376] [DET-5377] [DET-5380] [DET-5378] [DET-5379] [DET-5375] (#2398)
7c17856 chore: catch ruamel.yaml Duplicate Key Errors and format for users [DET-5542] (#2450)
2584c5b chore: rem to px [DET-5327] (#2433)
ddf8693 fix: allow custom registries with determined env images [DET-5556] (#2465)
8c1d0a9 fix: cleanup iter(DataLoader) before exiting [DET-5558] [DET-5554] (#2459)
2c3bfa3 fix: use user preferences when no search params are present (#2460)
80f4375 chore: disable dashboard recent tasks tests temporarily (#2461)
7f1c61d feat: det deploy --image-repo-prefix for pulling images from a custom docker repo (#2454)
a517040 fix: synchronize pods actor startup in k8s resource manager [DET-5536] (#2453)
ea4566f fix: update Buf image and CLI usage (#2455)
8092072 chore: bump buf and protoc version [DET-5534] (#2446)
92bf2c6 fix: prevent concurrent updates to a single expconf object [DET-5543] (#2451)
ea66301 revert added example model (tf classification) (#2452)
71a3502 fix: prevent spot resource pool contention [DET-5349] (#2423)
8def156 cli: small rewording in shell help (#2448)
bac3924 ci: regen buf image with buf 0.12.1 (#2447) [DET-5534]
193ac65 docs: fix broken links (#2439)
da7fe34 fix: introduce LegacyConfig for tensorboard and checkpoint gc [DET-5533] (#2444)
a9f0fe8 fix: omit internal fields in previewed notebook [DET-5523] (#2434)
a690381 fix: allow EOL searchers in configs only [DET-5526] (#2445)
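Most entries in these changelogs follow one shape: an abbreviated commit hash, an optional `type:` prefix (fix, feat, chore, docs, and so on), a subject, and optional `[DET-…]` ticket references and `(#…)` pull request numbers. The sketch below is a minimal Python parser for that shape. The layout is inferred from the entries listed on this page, and the names `Entry` and `parse_entry` are hypothetical helpers, not part of any Determined tooling.

```python
import re
from typing import List, NamedTuple, Optional

# Pattern inferred from the entries on this page (an assumption, not a spec):
#   <abbreviated sha> [<type>: ]<subject with optional [DET-n] and (#n)>
ENTRY_RE = re.compile(
    r"^(?P<sha>[0-9a-f]{7,40})\s+"      # abbreviated or full commit hash
    r"(?:(?P<type>[A-Za-z]+):\s+)?"     # optional prefix such as "fix:"
    r"(?P<subject>.*)$"
)
TICKET_RE = re.compile(r"DET-\d+")      # ticket ids such as DET-5640
PR_RE = re.compile(r"\(#(\d+)\)")       # pull request numbers such as (#2564)


class Entry(NamedTuple):
    sha: str
    type: Optional[str]   # None when the entry has no "type:" prefix
    subject: str
    tickets: List[str]
    prs: List[int]


def parse_entry(line: str) -> Entry:
    """Split one changelog line into its parts."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a changelog entry: {line!r}")
    subject = match.group("subject")
    return Entry(
        sha=match.group("sha"),
        type=match.group("type"),
        subject=subject,
        tickets=TICKET_RE.findall(subject),
        prs=[int(n) for n in PR_RE.findall(subject)],
    )


if __name__ == "__main__":
    print(parse_entry(
        "e532761 fix: don't return dupes from det model list-versions "
        "(#2564) [DET-5640, DET-4248]"
    ))
```

Entries without a prefix, such as "Disable tests that will never pass on mac os x (#2417)", parse with `type` set to `None`, so grouping by type needs a fallback bucket.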

Docker images

  • docker pull determinedai/determined-master:0.16.0
  • docker pull determinedai/determined-master:f5a590b8
  • docker pull determinedai/determined-master:f5a590b8e8b0f589f8086111c93a42f92760041c
  • docker pull determinedai/determined-dev:determined-master-f5a590b8
  • docker pull determinedai/determined-dev:determined-master-f5a590b8e8b0f589f8086111c93a42f92760041c
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.16.0
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:f5a590b8
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:f5a590b8e8b0f589f8086111c93a42f92760041c
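The image references listed for each release appear to follow one pattern: the `determined-master` repositories on Docker Hub and NGC are tagged with the version, the first eight characters of the release commit hash, and the full hash, while `determined-dev` carries only the two hash-based tags with a `determined-master-` prefix. The Python sketch below reproduces the list from a version and a full hash. The pattern is inferred from the listings on this page, not from a documented naming scheme, and `image_refs` is a hypothetical helper.

```python
from typing import List


def image_refs(version: str, full_sha: str) -> List[str]:
    """Build the image references shown in a release's "Docker images" list.

    Assumption: the tag layout is inferred from this page only.
    """
    short_sha = full_sha[:8]  # the listings use an 8-character abbreviation
    hub = "determinedai/determined-master"
    dev = "determinedai/determined-dev"
    ngc = "nvcr.io/isv-ngc-partner/determined/determined-master"
    return [
        f"{hub}:{version}",
        f"{hub}:{short_sha}",
        f"{hub}:{full_sha}",
        f"{dev}:determined-master-{short_sha}",
        f"{dev}:determined-master-{full_sha}",
        f"{ngc}:{version}",
        f"{ngc}:{short_sha}",
        f"{ngc}:{full_sha}",
    ]


if __name__ == "__main__":
    for ref in image_refs("0.16.0", "f5a590b8e8b0f589f8086111c93a42f92760041c"):
        print(f"docker pull {ref}")
```

Pinning to the full-hash tag rather than the version tag identifies the exact release commit.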

0.15.6

02 Jun 22:36

Changelog

0c9ee55 chore: bump version: 0.15.6rc3 -> 0.15.6
2545084 docs: Release notes for 0.15.6. (#2493)
d7f41bc chore: bump version: 0.15.6rc2 -> 0.15.6rc3
bfad801 chore: move run increment to allocation not termination [DET-5559, DET-5450] (#2462)
fa03ce8 ci: regen buf image with buf 0.12.1 (#2447) [DET-5534]
0c348b8 chore: bump version: 0.15.6rc1 -> 0.15.6rc2
6546b65 chore: catch ruamel.yaml Duplicate Key Errors and format for users [DET-5542] (#2450)
4e3ade0 fix: allow custom registries with determined env images [DET-5556] (#2465)
730daeb fix: cleanup iter(DataLoader) before exiting [DET-5558] [DET-5554] (#2459)
d891b90 fix: synchronize pods actor startup in k8s resource manager [DET-5536] (#2453)
c2f08d2 fix: use user preferences when no search params are present (#2460)
ae8d54e revert added example model (tf classification) (#2452)
b898f11 fix: prevent spot resource pool contention [DET-5349] (#2423)
a7ce160 docs: fix broken links (#2439)
14300c4 cli: small rewording in shell help (#2448)
80973c1 chore: bump version: 0.15.6rc0 -> 0.15.6rc1
026a929 fix: allow EOL searchers in configs only [DET-5526] (#2445)
71166d1 fix: introduce LegacyConfig for tensorboard and checkpoint gc [DET-5533] (#2444)
ecf0b66 fix: omit internal fields in previewed notebook [DET-5523] (#2434)
396f492 chore: bump version: 0.15.6.dev0 -> 0.15.6rc0
016c33d chore: lock api state for backward compatibility check
cd5c939 fix: webui observability show chart only if metrics are available [DET-5418] (#2424)
6f67799 docs: notify users of coscheduler behavior [DET-5150] (#2442)
861c19a fix: resource pool not saved in the DB [DET-5485] (#2435)
343d810 chore: whitelist eventually from schema linter
25f6f3a feat: add eventually extension to schema [DET-5520] (#2432)
2f9dcff chore: Prevent _swagger from being formatted by make -C harness fmt (#2440)
4643748 docs: update procedure for latest NVIDIA drivers on GKE (#2429)
4f2a16f chore: minor copy fix for alert box spaces (#2438)
6e255d0 fix: make profiling schema more lenient [DET-5497] (#2409)
8c96541 chore: update OS and other language in Terraform modules [DET-4276] (#2415)
5c4b502 chore: reduce minimum char to fuzzy match for omnibar (#2430)
fe71847 fix: correct lint issues (#2437)
c7dcd41 chore: more eslint rules [DET-5513] (#2426)
d272e89 fix: observability webui widen dropdowns so the entire string is readable [DET-5503] (#2425)
55ff22e fix: fix convergence and distributed tests for tensorflow example (#2431)
c055571 docs: using det shell as a remote shell in IDEs. (#2428)
c71ff13 feat: omnibar initial support [DET-5374] (#2308)
57504fe chore: update default images (#2427)
4f7acab fix: merge logic for union-type configs [DET-5486] (#2410)
366b82b feat: det shell option to show ssh command for use in IDE [DET-5462] (#2407)
98b5a1e chore: table head style update (#2419)
e938ad1 chore: terminate /api/v1/trials/:id/avialable_series on trial termination (#2418) [DET-5499]
e3f4fb9 fix: improve tqdm rendering in the web ui (#2320)
2bf8534 Disable tests that will never pass on mac os x (#2417)
077bec8 docs: resource pool fixes (#2408)
8da96e9 fixed typo in custom custom docker configuration (#2413)
d7aa85f docs: update create_experiment (#2416)
0719c1d fix: fix an issue with parsing old exp config labels [DET-5487] (#2411)
6b260e7 chore: only return the port binding appropriate for the proxy [DET-5495] (#2401)
f9e099e docs: update parameter string (#2412)
d075392 feat: python-sdk [DET-5371] (#2317)
74b4e25 feat: add new multiclass text classification example for tensorflow [DET-5277] (#2396)
8e41491 feat: support more types of CPU instances on AWS [DET-4939] (#1907)
58b7e02 feat: experiment list search [DET-5460] (#2392)
66158f0 chore: update trial page overview layout [DET-5411] (#2389)
1d7f049 chore: upgrade timeago-react for react v17 (#2404)
ca4ab7a fix: correct resource pool pagination and make sort sticky [DET-5482] (#2403)
c0e23f4 chore: add zmq-based IPC to the DistributedContext (#2373)
3a625b4 chore: make pip happy again (#2399)
3b08403 fix: add max size limit metrics [DET-4878] [DET-4783] (#2387)
212dd09 chore: remove upstreamed gradient aggregation test (#2406)
8287dbe fix: correct the url search param setting for archived (#2402)
41aea03 docs: Release notes for 0.15.5. (#2397)
a6e3d80 chore: bump version: 0.15.5.dev0 -> 0.15.6.dev0
950a591 docs: deprecate old master configuration fields (#2395)
c0821a9 fix: wire up support for plain-string image config (#2393)
09e1dd8 ci: temporarily remove flaky tests (#2394)
ea3f932 chore: Edit docs for typos (#2391)
9918826 feat: support push metric APIs internally [DET-5215] (#2315)
7b62b33 chore: widen the trial link on experiment detail page [DET-5459] (#2390)
c8dd9fd feat: add SlotsPerAgent in resource pool API (#2383)
993a526 fix: Fix nightly gpu tests for pytorch word language model [DET-5226] (#2388)
fb60a58 chore: move trial logs in a trial detail page tab [DET-5410] (#2365)
0f3ba70 refactor: experiment list native filters [DET-5389] (#2378)
7b58a2d chore: move Trial Information table in a dedicated Trial page tab [DET-5434] (#2372)
e1e4b9e feat: Add PyTorch Word language Modeling example to Determined's Example [DET-5226] (#2352)
9e20c3c chore: unrevert and fix "actually use expconf in the master" (#2382)
8560a96 chore: remove unused protobuf imports (#2336)
f41e788 chore: update gke version (#2385)
1b134e3 chore: simplify tensorboard request msg (#2377)
30ab146 chore: close agents on websocket closures (#2380)
7d1509b docs: spelling fixes in model hub (#2379)
d0c6651 fix: det deploy gcp support for terraform 0.15 [DET-5449] (#2376)
cde5700 chore: revert "actually use expconf in the master" (#2375)
108462f chore: move trial hyperparameters in a dedicated trial page tab [DET-5412] (#2364)
3197cc3 build: remove webui and docs as direct master dependencies (#2363)
3a545cd feat: add a preview parameter to the notebook launch API (#2359)
fd145c9 chore: remove redundant model_hub line from bumpversion. (#2374)
1f573da docs: Release notes for 0.15.4. (#2370)
b7d3f2d chore: bump version: 0.15.4.dev0 -> 0.15.5.dev0
571f321 chore: actually use expconf in the master [DET-4885] [DET-4009] (#2310)
cd86fa2 fix: fix task pagination filters not taking effect [DET-5442] (#2367)

Docker images

  • docker pull determinedai/determined-master:0.15.6
  • docker pull determinedai/determined-master:0c9ee55c
  • docker pull determinedai/determined-master:0c9ee55c459e6407e0df60cf5db2805dc38865c5
  • docker pull determinedai/determined-dev:determined-master-0c9ee55c
  • docker pull determinedai/determined-dev:determined-master-0c9ee55c459e6407e0df60cf5db2805dc38865c5
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.15.6
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0c9ee55c
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0c9ee55c459e6407e0df60cf5db2805dc38865c5

0.15.5

19 May 01:17

Changelog

5fe959f chore: bump version: 0.15.5rc1 -> 0.15.5
eb1d821 docs: Release notes for 0.15.5. (#2397)
40640bc chore: bump version: 0.15.5rc0 -> 0.15.5rc1
a0f8ae6 docs: deprecate old master configuration fields (#2395)
7019b17 chore: bump version: 0.15.5.dev0 -> 0.15.5rc0
6c51726 chore: close agents on websocket closures (#2380)
124f04c chore: bump version: 0.15.4 -> 0.15.5.dev0

Docker images

  • docker pull determinedai/determined-master:0.15.5
  • docker pull determinedai/determined-master:5fe959f6
  • docker pull determinedai/determined-master:5fe959f61237b90b6af68999440fe6f52f734492
  • docker pull determinedai/determined-dev:determined-master-5fe959f6
  • docker pull determinedai/determined-dev:determined-master-5fe959f61237b90b6af68999440fe6f52f734492
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.15.5
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:5fe959f6
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:5fe959f61237b90b6af68999440fe6f52f734492

0.15.4

12 May 23:26

Changelog

8149595 chore: bump version: 0.15.4rc1 -> 0.15.4
664452a docs: Release notes for 0.15.4. (#2370)
a2f531c chore: bump version: 0.15.4rc0 -> 0.15.4rc1
6a70ea3 fix: fix task pagination filters not taking effect [DET-5442] (#2367)
cd6683e chore: bump version: 0.15.4.dev0 -> 0.15.4rc0
2926056 chore: bump version: 0.15.3.dev0 -> 0.15.4.dev0
0c7cbb2 chore: fix bumpversion config not properly bumping setup.py files (#2366)
c409fb9 Revert "chore: bump version: 0.15.3.dev0 -> 0.15.4.dev0"
6a2472e Revert "chore: fix missing version bump to 0.15.4.dev0"
cbc62bd chore: fix missing version bump to 0.15.4.dev0
61a4663 fix: profiles timings graph data conversion filling empty data [DET-5433] (#2360)
b520be8 chore: lock api state for backward compatibility check
08f857e chore: bump version: 0.15.3.dev0 -> 0.15.4.dev0
1d85f24 chore: add unit tests for webui util functions [DET-5323] (#2347)
984ad9d feat: add workload status to trial infobox [DET-4289] (#2349)
2afc2e1 fix: ci/cd model-hub tests config (#2358)
f3f828b chore: fixes eslint error (#2361)
2f363ac chore: reorder migrations (#2362)
4a66da3 refactor: delete some old commands APIs (#2321)
9b5bc16 fix: ci/cd e2e tests timeout (#2353)
e37029f test: always calling read() before calling wait() (#2356)
d8c8921 feat: store the original user submitted experiment config in db (#2332)
f677fa7 feat: support transformers library in model-hub [DET-4823, 4719, 4721, 4720] (#2068)
fc27d77 fix: improvements to automatic pod spec configurator (#2306)
f6f13dc fix: hide expected network errors when nodes are terminated [DET-4822] (#2351)
63c77f2 chore: Add output printing to debug flaky test (#2350)
0f3be84 chore: drop prior_batches_processed and num_batches (#2345) [DET-5403, DET-5405]
1161b66 fix: fix test test cluster setup cmd. (#2341)
9de56d6 chore: migrate to use total_batches more in HP search viz. (#2344)
cceb764 fix: react build should depend on its public dir (#2339)
0ddac86 chore: edits to expconf before enabling it (#2342)
cd2980b0 refactor: provide support for specifying selector for element id for element list (#2255)
dc229e5 feat: tolerate missing GPU stats when running under MIG [DET-5387] (#2327)
4160745 chore: disable webui experiment archive test (#2340)
8a0ced9 feat: add internal searcher APIs [DET-5214] (#2301)
cbafb09 feat: replace "show archived" toggle with dropdown [DET-3925] (#2333)
5952d37 fix: improve uPlot chart zooming experience [DET-5395] (#2338)
97ed53c chore: add searcher type to output of experiment APIs (#2328)
434578b docs: Release notes for 0.15.3 (#2334)
e77a16a chore: fix docstring (#2337)
5d60865 chore: add viewport meta to improve WebUI mobile experience [DET-5396] (#2335)
a0242e7 fix: system metric chart fix to support milliseconds [DET-5348] (#2311)
dca96c2 chore: sort nulls last in experiment trial API (#2329) [DET-5300]
cfb46ab chore: go mod fixes (#2325)
916b75c chore: only select a single host port per container rendezvous port (#2331)
aafdcf0 chore: update package json [DET-5335] (#2314)
20263dc chore: use filelocks to guard data download (#2244)
4f12a6f chore: loosen ruamel.yaml version (#2313)
7e6f51a revert: "revert: "fix: gracefully handle Docker binding published ports to ipv4 and ipv6 for host (#2259) [DET-5295]" (#2326)" (#2330)
1df87b2 docs: add a missing word in react readme (#2324)
953f528 Revert "fix: gracefully handle Docker binding published ports to ipv4 and ipv6 for host (#2259) [DET-5295]" (#2326)
2fa12d8 chore: update Docker images, AMIs and harness for yogadl update (#2319)
0d4cb14 ci: move CUDA 11 testing to more available GPUs (#2316)
98904a1 chore: bump version: 0.15.2.dev0 -> 0.15.3.dev0
a6bfd9a chore: log incorrect rendezvous addresses (#2312)
58f3b1b fix: expconf required fields (#2309)
ce99043 docs: remove stray text from task config reference (#2305)
6bf2e0c chore: move LearningCurveChart to use uPlot shared component [DET-5331] (#2302)
bbb5007 chore: update environment images to 0.12.0. (#2304)
cf8de24 feat: harness collects profiler metrics [DET-5061] (#2198)
b8f5ef6 chore: add internal preemption API [DET-5216] (#2260)
e75f386 docs: Release notes for 0.15.2 (#2303)
0b13fc5 refactor: code split libraries [DET-5342] (#2291)
6209a57 remove endtime as required from json-schema (#2298)
57fd334 chore: apply react strict mode and upgrade to React 17 [DET-5325] (#2279)
b07a086 chore: improve master logging (#2295)
d136dff chore: minor expconf issues (#2297)
1834bbb fix: scary warning with det shell open (#2299)
0126e21 chore: add support for building with Golang race detector (#2296)
c4592c4 docs: add release notes for preemption in k8s (#2294)
f5a4f8f chore: Helm notes should recognize preemption scheduler (#2293)
64826d2 fix: correct logic for hasData in uPlotChart (#2292)
8514007 chore: fix common fields on union types (#2270)

Docker images

  • docker pull determinedai/determined-master:0.15.4
  • docker pull determinedai/determined-master:81495950
  • docker pull determinedai/determined-master:8149595071d4efdd765b8965f9f7dee24900158f
  • docker pull determinedai/determined-dev:determined-master-81495950
  • docker pull determinedai/determined-dev:determined-master-8149595071d4efdd765b8965f9f7dee24900158f
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.15.4
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:81495950
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:8149595071d4efdd765b8965f9f7dee24900158f

0.15.3

06 May 00:10

Changelog

b42d42b chore: bump version: 0.15.3rc3 -> 0.15.3
380982a chore: bump version: 0.15.3rc2 -> 0.15.3rc3
b479d89 docs: Release notes for 0.15.3 (#2334)
07740ba chore: bump version: 0.15.3rc1 -> 0.15.3rc2
8ced4c3 chore: go mod fixes (#2325)
63e8e11 chore: only select a single host port per container rendezvous port (#2331)
6b33c76 chore: loosen ruamel.yaml version (#2313)
9bed90d chore: bump version: 0.15.3rc0 -> 0.15.3rc1
f188b8d chore: update Docker images, AMIs and harness for yogadl update (#2319)
1428672 chore: bump version: 0.15.3.dev0 -> 0.15.3rc0
b6d1bef chore: update environment images to 0.12.0. (#2304)
3807e7d chore: bump version: 0.15.2 -> 0.15.3.dev0

Docker images

  • docker pull determinedai/determined-master:0.15.3
  • docker pull determinedai/determined-master:b42d42bd
  • docker pull determinedai/determined-master:b42d42bdb1e66daadb0dc1a2dc8454b072bab774
  • docker pull determinedai/determined-dev:determined-master-b42d42bd
  • docker pull determinedai/determined-dev:determined-master-b42d42bdb1e66daadb0dc1a2dc8454b072bab774
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.15.3
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:b42d42bd
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:b42d42bdb1e66daadb0dc1a2dc8454b072bab774

0.15.2

30 Apr 05:12

Changelog

89f3ee0 chore: bump version: 0.15.2rc2 -> 0.15.2
565e187 docs: Release notes for 0.15.2 (#2303)
b0e43c1 chore: bump version: 0.15.2rc1 -> 0.15.2rc2
4527077 fix: scary warning with det shell open (#2299)
6022114 docs: add release notes for preemption in k8s (#2294)
56e87c2 fix: correct logic for hasData in uPlotChart (#2292)
c675141 chore: bump version: 0.15.2rc0 -> 0.15.2rc1
fdf5abc chore: bump version: 0.15.2.dev0 -> 0.15.2rc0
76fe5c8 fix: wait for uPlot to be ready to setData or setSize [DET-5343] (#2283)
c367fc7 feat: promote custom reducers from experimental [DET-5322] [DET-5321] (#2284)
f3636f3 docs: add docs for preemption in kubernetes (#2289)
88ca463 feat: allow activation of priority scheduler in k8s (#2288)
ae13bb6 fix: lr_scheduler step when using gradient_aggregation [DET-5289] (#2271)
c1fa923 feat: add support for preemption in Kubernetes [DET-5135] (#2282)
54ff4ae fix: only warn for non-numeric np.dtypes [DET-5288] (#2287)
1f3b5e0 chore: remove svg and ttf fonts (#2286)
1454a15 chore: drop unused compose component (#2278)
1dc19d7 fix: squelch "response already committed" master log message (#2281)
b3c33a5 docs: add missing line to docker run (#2285)
7200fdf feat: expose user id as part of the user object [DET-4856] (#2265)
bea7875 fix: add support for dynamic section content via css [DET-5299] (#2277)
e1f14ba fix: improve rendering for uPlot chart with empty data [DET-5330] (#2274)
3b79dc5 chore: upgrade to labstack/echo v4.2.2 (#2266)
7239b10 chore: add support of y-axis zooming for uPlot [DET-5266] (#2268)
4b62700 expconf: fix some minor bugs in reflect code (#2267)
4bf88ba chore: fix typo in help for "experiment download". (#2269)
6c68692 fix: gracefully handle Docker binding published ports to ipv4 and ipv6 for host (#2259) [DET-5295]
6f5b86d ci: enable taiko get elements logging to help debug the disconnect from this and actual elements (#2264)
24ca0b5 fix: ignore hp-importance as a requirement for displaying hp-viz (#2258)
438a07a fix: allow agents to be set to empty [DET-5296] (#2261)
56e6f0f chore: migrate webui to use /api/v1/auth/login [DET-5287] (#2254)
9712291 chore: replace metric chart with uplot [DET-4303] (#2234)
468a70f chore: clarify API for expconf objects (#2256)
524d3a3 chore: add option to login through the new api w/ pre-hashed pwd [DET-5270] (#2253)
d08a449 chore: remove validation operations [DET-5213] (#2189)
cd64901 chore: add compression middleware to echo (#2249)
f18cb7d docs: Release notes for 0.15.1 (#2245)
ea0b372 chore: bump version: 0.15.1.dev0 -> 0.15.2.dev0
3c2186e Fix: remove quotes for Terraform 0.13 (#2231)
d23ebd0 docs: missing service-linked role [DET-5253] (#2221)
3cf12f4 feat: support configurable port and container name for Fluent Bit [DET-5272, DET-5273] (#2251)
6827390 chore: set the image used in ptl amp test through set_tf2_image (#2194)
259cff6 chore: trigger hp importance work on exp completion (#2248)
fe1540d chore: no pointers to maps or slices in expconf (#2238)
bd509a4 fix: TFKerasTrial check for tf2 behavior on 2.2.0 [DET-5277] (#2246)
afdb749 feat: per-resource-pool configs [DET-5173] (#2214)
30a6af3 chore: reset error field on hp importance success (#2242)
6c47b6c style: transpose hp heatmap to better align the plot axes (#2232)
55cfe8f ci: fix windows test with lmdb 1.2 (#2241)
eb61693 fix: update label picker when labels change [DET-5254] (#2239)
6ab2268 chore: submit partial hp importance work to pool (#2240)
c2f96cb chore: allow dev lint errors (#2218)
8de19c0 chore: fix panic from dependency creation race (#2233)
782d095 fix: select snapshot version with snapshot (#2235) [DET-5264]
cac640b chore: up circle ci timeout for e2e tests (#2237)
0b2c2c7 fix: actually support add/drop capabilities in structs (#2236)
1c926e0 chore: fix panics in hp importance actor (#2230) [DET-5263]
e21dc31 feat: support healthchecks for det deploy aws with TLS enabled. (#2207)
88b8689 docs: Release notes for 0.15.0. (#2225)
cd39429 chore: bump version: 0.15.0.dev0 -> 0.15.1.dev0
2fbe7d5 chore: bump version: 0.14.7.dev0 -> 0.15.0.dev0
9ef41f4 fix: gcp quota checks, A100 and docs. (#2220)
875ceb2 fix: fetch agents only when authenticated [DET-5259] (#2228)
7393b4f fix: force checkpoint GC to use the master's default environment (#2229)
524c895 fix: allow scatter plot to re-render when data changes initially (#2224)
a68ef2e ci: move test_hp_importance_api to distributed tests (#2212)
77b2c47 ci: stop using coscheduler for CI testing (#2216)
19e6b3e fix: correct crash when changing filter on multi-selected rows [DET-5258] (#2226)
67de93a chore: final touches to ExperimentConfig V0 (#2142)
923c8ac docs: Update custom environment example [DET-5196] (#2171)
7be726d chore: validate grid list enum value from local storage (#2223)
11e117a fix: TFKerasTrial on tf2 with tf.compat.v1.disable_v2_behavior. (#2211)

Docker images

  • docker pull determinedai/determined-master:0.15.2
  • docker pull determinedai/determined-master:89f3ee04
  • docker pull determinedai/determined-master:89f3ee044b25619afd32e5faf62490c81c956837
  • docker pull determinedai/determined-dev:determined-master-89f3ee04
  • docker pull determinedai/determined-dev:determined-master-89f3ee044b25619afd32e5faf62490c81c956837
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.15.2
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:89f3ee04
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:89f3ee044b25619afd32e5faf62490c81c956837

0.15.1

19 Apr 21:56

Changelog

0e00289 chore: bump version: 0.15.1rc0 -> 0.15.1
a18a1eb Fix: remove quotes for Terraform 0.13 (#2231)
be490b5 chore: bump version: 0.15.1.dev0 -> 0.15.1rc0
5bc5826 chore: fix panic from dependency creation race (#2233)
2f73fbc chore: fix panics in hp importance actor (#2230) [DET-5263]
5dbee57 fix: select snapshot version with snapshot (#2235) [DET-5264]
922ff05 fix: TFKerasTrial on tf2 with tf.compat.v1.disable_v2_behavior. (#2211)
f8fd98d chore: bump version: 0.15.0 -> 0.15.1.dev0

Docker images

  • docker pull determinedai/determined-master:0.15.1
  • docker pull determinedai/determined-master:0e002898
  • docker pull determinedai/determined-master:0e002898037e6a58ec764e42d5f4a611c35a718b
  • docker pull determinedai/determined-dev:determined-master-0e002898
  • docker pull determinedai/determined-dev:determined-master-0e002898037e6a58ec764e42d5f4a611c35a718b
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.15.1
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0e002898
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0e002898037e6a58ec764e42d5f4a611c35a718b

0.15.0

14 Apr 23:18

Changelog

3a04e69 chore: bump version: 0.15.0rc1 -> 0.15.0
3fc0fa6 docs: Release notes for 0.15.0. (#2225)
b9d60b0 chore: bump version: 0.15.0rc0 -> 0.15.0rc1
e7576a7 fix: force checkpoint GC to use the master's default environment (#2229)
1fa876b fix: allow scatter plot to re-render when data changes initially (#2224)
6c1d013 chore: validate grid list enum value from local storage (#2223)
cdaff4f chore: bump version: 0.15.0.dev0 -> 0.15.0rc0
9328c90 chore: bump version: 0.14.7.dev0 -> 0.15.0.dev0
99147a3 chore: lock api state for backward compatibility check
418d49a chore: bump to 2 agents on latest master [DET-5241] (#2219)
35ae080 test: add more lr scheduler tests for lightning [DET-5223] (#2184)
93ae755 chore: stop using cloudpickle to write PyTorch checkpoints [DET-5175] (#2204)
631dce2 docs: Various fixes (#2209)
0397c1c feat: add git and ide content to detignore by default [DET-2832] (#2210)
52e5755 chore: pull EE CLI features and docs into OSS [DET-3912] (#2195)
40b414c feat: move executables to the main package and update docs. (#2187)
26404b8 chore: Update to Ubuntu 20.04 for agent, master and bastion images [DET-5238] (#2208)
154e1c6 docs: clarify k8s default pod spec behavior (#2197)
f091d5f chore: avoid rerendering experiment list if api response remains the same (#2203)
364c0bc chore: show trial metrics on webui [DET-5060] (#2167)
c648f15 chore: update estimators test fixture to not reference adaptive searcher (#2205)
225ccde chore: remove sha [DET-5225] (#2181)
469fb25 refactor: consolidate global contexts (#2186)
66000fc fix: disable det deploy wait for aws cluster on circleci. (#2192)
7de7441 chore: extend timeout on HP importance test (#2193)
3ccd0ca chore: make the first glasbey color our brand color (#2190)
5e68b75 feat: add key tracker (#2188)
376baa3 feat: health check master after cluster creation [DET-5183] (#2164)
5a35fc1 chore: enable hyperparameter importance computation by default (#2159)
6aa7a04 refactor: improve experiment terminal state [DET-5202] (#2179)
bf43f63 chore: remove stoksc from codeowners (#2185)
92056b5 fix: add efs, fsx, and govcloud templates to bumpversion [DET-5200] (#2172)
4cd0612 fix: mmdetection docker image to work with torch 1.7 (#2183)
d979092 feat: local clusters to store checkpoint data in home [DET-5154] (#2170)
19ea607 fix: bug in random search leading to incorrect total trials (#2182)
8127681 chore: idempotent searcher progress API [DET-5211] (#2180)
b60ac4c feat: zoomed modal charts [DET-5111] (#2174)
fa9c773 chore: make default goal should be build. (#2177)
d241fa9 docs: various fixes (#2163)
66f75d4 fix: allow telemetry to be disabled under Helm (#2178)
07ce81a fix: default tooltip prefix to be an empty string (#2165)
bb7f1ea chore: tweak step_lr param in e2e tests (#2160)
d96aced chore: add an example for checkpoint callbacks [DET-5186] (#2173)
ad3e0a4 refactor: update context api to reduce unnecessary re-renders [DET-5185] (#2168)
255fa74 chore: store the daily/monthly filter setting in local storage for cluster historical usage page [DET-5194] (#2161)
9e3c69d chore: wire up profiling configurations [DET-5064] (#2122)
1b164fa chore: bump version: 0.14.6.dev0 -> 0.14.7.dev0
8a458c1 chore: add tab navigation to trial details page [DET-5070] (#2162)
14e9911 feat: det deploy check for sufficient gpu quotas on aws, gcp. (#2136)
f0faf47 chore: webui for resource allocation data [DET-5046] (#2062)
c636b45 chore: update codeowners (#2145)
4b7ef37 fix: add terraform files into default detignore [DET-5155] (#2146)
3537c21 fix: get cli wheel back into trail runner. (#2156)
e9892dd fix: fix an issue in wrapping lr_scheduler for lightningadapter (#2154)
8adc237 fix: roll back det and det-deploy executable move. (#2153)
5c8a17f fix: avoid loop of effect in hp-viz when experiment is not supported [DET-5189] (#2151)
95a8638 fix: e2e test for pytorch lightning examples (#2152)
d6bccf6 fix: pytorch lightning example (#2150)
323f272 fix: avoid showing no-data message for a split second in hp-viz [DET-5099] (#2144)
957ffd6 chore: add license information to Pytorch Lightning examples (#2147)

Docker images

  • docker pull determinedai/determined-master:0.15.0
  • docker pull determinedai/determined-master:3a04e697
  • docker pull determinedai/determined-master:3a04e697706f25e6068b2bfe0f4ff3d9c8332ec9
  • docker pull determinedai/determined-dev:determined-master-3a04e697
  • docker pull determinedai/determined-dev:determined-master-3a04e697706f25e6068b2bfe0f4ff3d9c8332ec9
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.15.0
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:3a04e697
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:3a04e697706f25e6068b2bfe0f4ff3d9c8332ec9

0.14.6

02 Apr 01:36

Changelog

e472436 chore: bump version: 0.14.6rc4 -> 0.14.6
29d9988 docs: Release notes for 0.14.6. (#2158)
f7a4043 chore: bump version: 0.14.6rc3 -> 0.14.6rc4
2e8c364 fix: get cli wheel back into trail runner. (#2156)
f4b8e08 chore: bump version: 0.14.6rc2 -> 0.14.6rc3
0eaa03a fix: fix an issue in wrapping lr_scheduler for lightningadapter (#2154)
a451c2c fix: roll back det and det-deploy executable move. (#2153)
74b1b79 chore: bump version: 0.14.6rc1 -> 0.14.6rc2
8024fc5 fix: avoid loop of effect in hp-viz when experiment is not supported [DET-5189] (#2151)
05645af fix: e2e test for pytorch lightning examples (#2152)
78c7fe9 fix: pytorch lightning example (#2150)
9d5dff7 chore: add license information to Pytorch Lightning examples (#2147)
fb11f6b chore: bump version: 0.14.6rc0 -> 0.14.6rc1
6e22699 chore: bump version: 0.14.6.dev0 -> 0.14.6rc0
1722f28 feat: precision prop and amp support for lightning adapter [DET-5116] (#2127)
567e237 feat: add allocation aggregation by agent label and resource pool (#2141)
cd39ce7 chore: upgrade taiko [DET-5157] (#2134)
17f77ca perf: support max_concurrent_trials for random and grid search (#2137)
ae505e4 ci: adding coscheduler to static k8s test clusters, and test (#2139)
f5723da chore: move unets_tf_keras back to previous images (#2138)
ec8f02b chore: change output format of JSON aggregated resource data (#2129)
db16db4 style: trial log section filters [DET-5176] (#2133)
23b0945 fix: tweak aggregated resource allocation history endpoint (#2123)
b9d6b0c chore: downgrade dev pytorch package versions to 1.7.1 (#2135)
74be8a0 chore: Moving back to Python 3.7 and PyTorch 1.7.1 (#2132)
0b027d2 fix: package install order in requirements.txt (#2131)
169ad89 chore: ingest multiple batches to trial profiler metrics endpoint [DET-5178] (#2117)
eb45d53 fix: update scatter plot to support non-numeric values [DET-5110] (#2126)
8e13071 feat: rank hparams with hp importance [DET-5105] (#2086)
f172ed5 docs: improvements to spot instance and resource pool docs (#2113)
dc623d0 refactor: include det-deploy into det cli. [DET-5153] (#2124)
88044b0 chore: add webui tests lint step to CI (#2115)
59f48f3 feat: add pytorch checkpoint on load/save hooks [DET-5109] (#2118)
6f58a93 ci: put GPUs in a separate GKE node pool from master (#2120)
7e5c9c1 chore: remove cluster v1 page [DET-5163] (#2114)
6cb9445 chore: add server address to cli trial log download cmd [DET-5161] (#2116)
02598ce refactor: cleanup react hook dependencies [DET-5158, DET-5159, DET-5160] (#2112)
b0b03b2 style: update hp viz nav (#2093)
4b84772 refactor: combine common, cli, deploy into one python package. [DET-4756] (#2108)
b531bda build: local build improvements [DET-5118] (#2060)
202c485 feat: add trial profiler metrics APIs [DET-5065, DET-5059] (#2051)
fe87adc chore: add new ExperimentConfig objects (#2066)
3d2f54f chore: add frequency parameter to wrap_lr_scheduler [DET-5148] (#2087)
18c3994 chore: add pytorch-lightning to docs requirements (#2111)
60ab3d3 chore: Revert "Testing gang-scheduling [DET-5134]" (#2110)
52a7fb3 chore: update to new images including TensorFlow, PyTorch, Python and CUDA upgrades (#2074)
2c5beaa Testing gang-scheduling [DET-5134] (#2100)
08d9562 feat: expose resource allocation endpoints in CLI [DET-5045] (#2107)
c346301 feat: colorize output info of det-deploy [DET-4749] (#2102)
c42069d feat: add aggregated resource allocation endpoint and job [DET-5044] (#2085)
9116b36 docs: Release notes for 0.14.5. (#2098)
6470983 docs: Release notes for 0.14.4. (#2089)
726ae46 chore: bump version: 0.14.5.dev0 -> 0.14.6.dev0
83997e8 chore: bump version: 0.14.4.dev0 -> 0.14.5.dev0
4c59e05 fix: go mod tidy for releases (#2106)
4db2311 fix: master gen target (#2104)
8431039 fix: release helm chart (#2105)
78b9d94 docs: further improve k8s coscheduling docs (#2099)
b443bec fix: broken doc links [DET-5100] (#2101)
e9754de docs: add pytorch lightning adapter docs [DET-4800] (#2076)
620b5ac chore: add readmes for pl examples [DET-5149] (#2096)
2b28eedb docs: improve docs on k8s coscheduling plugin. (#2097)
cfab2e5 feat: add batch margins [DET-5073] (#2057)
1550417 feat: provide user hint on aws, gcp auth in det-deploy [DET-4846] (#2092)
4e4a874 fix: helm files naming consistency (#2095)
aefd9e3 feat: delete Tensorboards with delete API [DET-5143] (#2083)
ff192db fix: helm chart errors in absence of defaultScheduler value (#2094)
743f4c2 feat: priority-based gang-scheduling for k8s (#2091)
4daae95 chore: improve cors proxy [DET-5115] (#2047)
f03c581 docs: stopping-based variant of adaptive search (#2090)
f379582 feat: add DELETE /api/v1/experiments/:id API [DET-4022] (#2056)

Docker images

  • docker pull determinedai/determined-master:0.14.6
  • docker pull determinedai/determined-master:e4724367
  • docker pull determinedai/determined-master:e47243675c660cbd571c78333e7ceece5f1db447
  • docker pull determinedai/determined-dev:determined-master-e4724367
  • docker pull determinedai/determined-dev:determined-master-e47243675c660cbd571c78333e7ceece5f1db447
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.14.6
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:e4724367
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:e47243675c660cbd571c78333e7ceece5f1db447

0.14.5

19 Mar 02:58

Changelog

f16dc9f chore: bump version: 0.14.5rc1 -> 0.14.5
60c225e fix: go mod tidy for releases (#2106)
5335fe9 Revert "chore: bump version: 0.14.5rc1 -> 0.14.5"
b84046c chore: bump version: 0.14.5rc1 -> 0.14.5
448f602 Revert "chore: bump version: 0.14.5rc1 -> 0.14.5"
46813ed fix: master gen target (#2104)
9f86bb8 fix: release helm chart (#2105)
ff5303f chore: bump version: 0.14.5rc1 -> 0.14.5
a6f0bfd chore: bump version: 0.14.5rc0 -> 0.14.5rc1
2e2cceb Revert "chore: bump version: 0.14.5rc0 -> 0.14.5"
2d16658 docs: further improve k8s coscheduling docs (#2099)
c4508da fix: broken doc links [DET-5100] (#2101)
541e5ac chore: bump version: 0.14.5rc0 -> 0.14.5
493f037 docs: Release notes for 0.14.5. (#2098)
05012f2 chore: bump version: 0.14.5.dev0 -> 0.14.5rc0
0d861f0 feat: add batch margins [DET-5073] (#2057)
b57ca59 chore: bump version: 0.14.4 -> 0.14.5.dev0
c26615c chore: bump version: 0.14.4rc3 -> 0.14.4
515162c docs: Release notes for 0.14.4. (#2089)

Docker images

  • docker pull determinedai/determined-master:0.14.5
  • docker pull determinedai/determined-master:f16dc9f1
  • docker pull determinedai/determined-master:f16dc9f1191a6e9b1b5c992ac39c6761ed176e20
  • docker pull determinedai/determined-dev:determined-master-f16dc9f1
  • docker pull determinedai/determined-dev:determined-master-f16dc9f1191a6e9b1b5c992ac39c6761ed176e20
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.14.5
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:f16dc9f1
  • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:f16dc9f1191a6e9b1b5c992ac39c6761ed176e20