diff --git a/CHANGELOG.md b/CHANGELOG.md
index dd578723..ca6de97d 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,16 +3,19 @@
 This changelog follows the specifications detailed in: [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
 This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html), although we have not yet reached a `1.0.0` release.
 
-## Unreleased
+## 0.5.1
 
 ### Changed
 
 * Updated Hybrid Kaleido ADM to optionally (on by default) use alignment_utils to support distribution based alignment
 * Refactored outlines_adm to break out action parameter completion into separate functions for reuse
+* Update README ADM invocation examples for the dry run evaluation (DRE)
 
 ### Added
 
 * Added support for 'precision' in model_kwargs for outlines based adms (expecting either 'full' or 'half')
+* Add option to save per scenario x alignment target unstructured outputs (useful for "eval" TA3 session types)
+* Added DRE experiment configurations
 
 ### Fixed
 
@@ -21,6 +24,7 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm
 * In outlines_adm ensure that already visited characters can't be selected again for assessment actions
 * In outlines_adm ensure MOVE_TO specifies character ID
 * In run_align_sytem CLI, don't allow unseen characters except for MOVE_TO and MOVE_TO_EVAC actions
+* Typo fix for Quality of Life KDMA description
 
 ## 0.5.0
 
diff --git a/README.md b/README.md
index 6fa8f655..b7b75f1e 100644
--- a/README.md
+++ b/README.md
@@ -112,28 +112,38 @@ captured in a new configuration file. We manage these experiments in
 delivered ADMs for the Metrics Evaluation (both to run on training
 data, and eval data).
 
-## Metrics Evaluation ADM Invocations
+## Dry Run Evaluation ADM Invocations
 
-Note that to override the API endpoint for the metrics evaluation
-ADMs, you can append `interface.api_endpoint='http://127.0.0.1:8080'`
-to the command line arguments, setting the value to the correct URL.
+We've specified Hydra experiments for the Dry Run Evaluation ADMs.
+Note that by default these configurations attempt to connect to
+`https://darpaitm.caci.com` as the TA3 API endpoint, but this can be
+overridden with `interface.api_endpoint='http://127.0.0.1:8080'` on
+the command line.
+
+### Random ADM
+
+(Good candidate for a smoketest)
+
+```
+run_align_system +experiment=dry_run_evaluation/random_eval_live
+```
 
 ### Baseline ADM
 
 ```
-run_align_system +experiment=metrics_refinement_evaluation/single_kdma_baseline_eval
+run_align_system +experiment=dry_run_evaluation/outlines_baseline_eval_live
 ```
 
-### Aligned ADM 1 (Single KDMA ADM)
+### Aligned ADM 1 (Comparative Regression + ICL + Template ADM)
 
 ```
-run_align_system +experiment=metrics_refinement_evaluation/single_kdma_aligned_eval
+run_align_system +experiment=dry_run_evaluation/comparative_regression_icl_template_eval_live
 ```
 
-### Aligned ADM 2 (Hybrid Kaleido ADM)
+### Aligned ADM 2 (Hybrid Regression ADM)
 
 ```
-run_align_system +experiment=metrics_refinement_evaluation/hybrid_kaleido_eval
+run_align_system +experiment=dry_run_evaluation/hybrid_regression_eval_live
 ```
 
 ## Implementing a new ADM
diff --git a/pyproject.toml b/pyproject.toml
index 7e2d9d3e..9bf9672b 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "align-system"
-version = "0.5.0"
+version = "0.5.1"
 description = ""
 authors = ["David Joy <10147749+dmjoy@users.noreply.github.com>"]
 readme = "README.md"