Releases: ITM-Kitware/align-system
0.5.6
0.5.5
Added
- Added Phase 1 Evaluation experiment configuration files
- Added ICL example selection method that gives larger weight to examples with the same character ids as the current probe. To use, set incontext.method to matching_characters.
- Added ICL example selection method that gives larger weight to examples with the same action types as the current probe. To use, set incontext.method to matching_actions.
- Added retrieved ICL examples to input-output.json
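As a rough illustration of what "larger weight for matching character ids" could mean, the sketch below adds a bonus to a base similarity score for every character id an example shares with the probe. The function name, the `bonus` parameter, and the dictionary layout are hypothetical and not taken from align-system; the actual weighting scheme may differ.

```python
def rank_icl_examples(probe_character_ids, examples, bonus=0.5):
    """Rank ICL examples, boosting those that share character ids with the probe.

    Hypothetical sketch: each example is a dict with a base `similarity`
    score and a list of `character_ids`.
    """
    probe_ids = set(probe_character_ids)

    def weighted_score(example):
        overlap = probe_ids & set(example["character_ids"])
        # Assumption: a flat bonus per shared character id.
        return example["similarity"] + bonus * len(overlap)

    return sorted(examples, key=weighted_score, reverse=True)


examples = [
    {"id": "a", "similarity": 0.80, "character_ids": ["P3"]},
    {"id": "b", "similarity": 0.60, "character_ids": ["P1", "P2"]},
]
ranked = rank_icl_examples(["P1", "P2"], examples)
print([e["id"] for e in ranked])
```

Here example `b` outranks `a` despite its lower base similarity, because it shares both character ids with the probe.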
0.5.4
Changed
- Changed incontext normalization setting to be off (null/rawscores)
- incontext.leave_one_out=false should now be configured as incontext.leave_one_out_strategy=null. The default is no leave-one-out behavior. The previous incontext.leave_one_out=true should be specified as incontext.leave_one_out_strategy=scenario_description. Additionally, duplicate ICL examples, based on the chosen similarity strategy, are now removed.
- Changed training_session flag for TA3 interface from boolean to string (expecting "full" or "solo" or None)
- Changed the comparative regression prompt to only include the structured character information listed in relevant_structured_character_info in kdma_descriptions.yaml. To include all structured information that is unique across characters in the prompt (as was previously done automatically), specify relevant_structured_character_info = ['all_unique'].
- Improved the QoL description and score_examples in kdma_descriptions.yaml
- Changed default treatment parameter selection to use heuristic treatment options
- Updated to transformers>=4.46.2 (and added necessary dependencies) to support newer models
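For anyone migrating older configs, the leave-one-out rename above can be expressed as a small mapping. This is a minimal sketch over a plain dictionary; the helper name `migrate_leave_one_out` is hypothetical, and real configs in this project are Hydra YAML files rather than Python dicts.

```python
def migrate_leave_one_out(incontext_cfg):
    """Return a copy of an incontext config dict using the new setting name.

    Mapping taken from the release note:
      leave_one_out=false -> leave_one_out_strategy=null
      leave_one_out=true  -> leave_one_out_strategy=scenario_description
    """
    cfg = dict(incontext_cfg)
    if "leave_one_out" in cfg:
        old_value = cfg.pop("leave_one_out")
        cfg["leave_one_out_strategy"] = (
            "scenario_description" if old_value else None
        )
    return cfg


print(migrate_leave_one_out({"number": 5, "leave_one_out": True}))
```

Configs that never set the old flag pass through unchanged.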
Added
- Added an option for sorting incontext examples responses: incontext.sort_actions
- Added character-based leave one out option: incontext.leave_one_out_strategy=characters
- Phase 1 experiments directory
- Added the option to filter out TAG CHARACTER responses by setting filter_tag_character to true
- Added a history-based alignment function for scalar targets that uses distance to a running mean. To use, specify inference_kwargs.distribution_matching as cumulative_average
- Added the option to enumerate the valid regression scores in the json schema by specifying inference_kwargs.enum_scores as true. Valid score options for each KDMA are added to align_system/prompt_engineering/kdma_descriptions.yml. Valid score options may be specified as a list via values, or a range specified as a dictionary of min (inclusive), max (inclusive), step
- Added option to configure ICL example ordering: incontext.most_similar_first=true for the most similar ICL example first, false for most similar ICL example last.
- Added the option to normalize KDE targets based on prior data. To use, set adm.inference_kwargs.kde_norm=priornorm and adm.inference_kwargs.priornorm_factor to the normalization weight you want (1 is fully normalized, 0 is no normalization or rawscores; the default is 0.5).
- Added KDMA scaling factor option. Scale factors for each KDMA are added to align_system/prompt_engineering/kdma_descriptions.yml
- Added heuristic treatment options component
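To make the two ways of listing valid scores concrete, the sketch below expands either form into an explicit list. The key names `values`, `range`, `min`, `max`, and `step` come from the note above; the function name and the exact nesting of the spec are assumptions, and this is not the project's own loader.

```python
import math


def expand_valid_scores(spec):
    """Expand a valid-score spec into an explicit list of scores.

    `spec` holds either `values` (a list) or `range` (a dict with
    `min`, `max`, and `step`, where both bounds are inclusive).
    """
    if "values" in spec:
        return list(spec["values"])
    rng = spec["range"]
    lo, hi, step = rng["min"], rng["max"], rng["step"]
    if step <= 0:
        raise ValueError("step must be positive")
    # Small epsilon so float error does not drop the inclusive upper bound.
    count = int(math.floor((hi - lo) / step + 1e-9))
    # Round to hide accumulated float error such as 0.30000000000000004.
    return [round(lo + i * step, 10) for i in range(count + 1)]


print(expand_valid_scores({"values": [0, 0.5, 1]}))
print(expand_valid_scores({"range": {"min": 0, "max": 1, "step": 0.1}}))
```

A step that does not divide the interval evenly simply stops at the last value not exceeding `max`.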
Fixed
- Fixed issue where choice history was persisting across scenarios. ADMs may now implement an optional reset_history method, which is called at the start of each new scenario
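The optional hook described in this fix might be used along the following lines. This is a simplified sketch: the class, the `choose_action` signature, and the driver loop are hypothetical stand-ins, and only the method name `reset_history` comes from the release note.

```python
class HistoryTrackingADM:
    """Hypothetical ADM that remembers the choices it has made."""

    def __init__(self):
        self.choice_history = []

    def choose_action(self, actions):
        # Placeholder policy: always pick the first available action.
        choice = actions[0]
        self.choice_history.append(choice)
        return choice

    def reset_history(self):
        # Clear state so one scenario cannot influence the next.
        self.choice_history = []


def run_scenarios(adm, scenarios):
    """Run each scenario, resetting ADM history first when supported."""
    history_lengths = []
    for probes in scenarios:
        # reset_history is optional, so only call it when the ADM defines it.
        if hasattr(adm, "reset_history"):
            adm.reset_history()
        for actions in probes:
            adm.choose_action(actions)
        history_lengths.append(len(adm.choice_history))
    return history_lengths


scenarios = [[["a", "b"], ["c"]], [["d"]]]
print(run_scenarios(HistoryTrackingADM(), scenarios))
```

Without the reset, the second scenario would start with the two choices carried over from the first.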
0.5.3
Changed
- Moved incontext learning functionality into incontext_utils.py and updated the base outlines and comparative regression ADMs to use this module.
- Moved the format_choices() function from the OutlinesTransformersADM class in outlines_adm.py to a new utils file, adm_utils.py, so it can be used across ADMs.
- Updated example_data/input_output_files to use DRE training scenarios
- Changed default config to use outlines_transformers_structured_baseline (rather than the older single_kdma_baseline)
- Adjusted choose_action() to enable returning an ADM-specific choice_info dictionary that is written to the resulting input_output.json file
- When the alignment target is optionally saved out in run_align_system, it is now saved as JSON instead of YAML
Added
- Added option to normalize KDMA values in incontext examples
- Added a probabilistic option to alignment utilities. Exposed this option in oracle, comparative regression, and hybrid regression ADMs.
- Example config for deterministic outlines-based ADM runs (align_system/configs/experiment/examples/outlines_force_determinism.yaml). Requires setting force_determinism to true and using greedy sampler.
- Added a history-based/cumulative KDE option to alignment utilities. Exposed this option in oracle and comparative regression.
- Added true and predicted KDMA values to the log and input_output.json file for comparative regression ADM.
- Added Phase 1 eval alignment targets for SoarTech
Fixed
- Fixed KDE target samples to be between 0 and 1
- Fixed issue in alignment_utils logging (where kdma values can be a float/int rather than a list)
- Now properly hydrating the meta_info field of input_output files
- Fixed possible divide by zero during misaligned alignment
- Properly hydrate Aid list
Deprecated
- Removed old and unused command-line interface scripts
- Removed old template files for integrating custom ADMs
- Removed CLI builder functionality
- Removed old configuration files from before Hydra
0.5.2
Added
- Split out our experiment configuration for our aligned DRE ADM to specific configs for SoarTech and Adept
- Added logging for sampled KDMA target value, and estimated KDMA values in alignment_utils
Fixed
- Fixed issue in Oracle ADM which caused a key error exception when logging probabilities
0.5.1
Changed
- Updated Hybrid Kaleido ADM to optionally (on by default) use alignment_utils to support distribution based alignment
- Refactored outlines_adm to break out action parameter completion into separate functions for reuse
- Update README ADM invocation examples for the dry run evaluation (DRE)
Added
- Added support for 'precision' in model_kwargs for outlines based adms (expecting either 'full' or 'half')
- Add option to save per scenario x alignment target unstructured outputs (useful for "eval" TA3 session types)
- Added DRE experiment configurations
Fixed
- Fixed case in Kaleido ADM where choices weren't necessarily unique
- In outlines_adm ensure that an already tagged character can't be selected again for the TAG_CHARACTER action
- In outlines_adm ensure that already visited characters can't be selected again for assessment actions
- In outlines_adm ensure MOVE_TO specifies character ID
- In run_align_system CLI, don't allow unseen characters except for MOVE_TO and MOVE_TO_EVAC actions
- Typo fix for Quality of Life KDMA description
0.5.0
Changed
- Updated KDMA descriptions and made the KDMA description yml file configurable
- No longer overwriting data when followup prompts are used in the Outlines ADM
- Small updates to Outlines ADM to be compatible with API updates
- Updated the oracle and comparative regression ADMs to use AlignmentFunction class
- Updated comparative regression ADMs justification to use the best samples reasoning
Added
- Added incontext learning option for Outlines-based structured ADM
- Added incontext learning option for Outlines-based regression ADM
- Added alignment targets for ADEPT training scenarios for the dry run evaluation
- Added comparative regression ADM which predicts KDMA scores for all responses simultaneously, enabling comparative reasoning
- Added template option or kdma_score_examples for regression and comparative regression ADMs
- Added incontext learning with chain of thought reasoning for regression and comparative regression ADMs
- Added some Kaleido hybrid experiments for the ADEPT dry run scenarios
- Added Persona based ADM from UCB (based off single kdma adm)
- Added alignment targets for SoarTech scenarios for the dry run evaluation
- Added some random ADM experiments for the SoarTech dry run scenarios
- Added intend_action to the ActionBasedScenarioInterface to comply with TA3 server updates
- Added functionality in the oracle and comparative regression ADMs for aligning to KDE targets
- Added a misaligned option for the Oracle ADM using any alignment function
- Added configuration option to record timing information about choose_action
- Added a scenario description prompt which includes all unique structured character info
- Added a hybrid regression approach for the Outlines ADM.
Fixed
- Fixed issue for running in batches with batch size in outlines ADMs
- Fixed character selection to use the character_id associated with the selected action when available, otherwise send a follow up prompt
- Restrict actions with pre-specified treatments when those supplies are not available
0.4.1
Changed
- Now adding a random UUID suffix to the ADM name parameter when talking to the TA3 server to prevent session clobbering
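The idea behind this change can be sketched with the standard library alone. The helper name and the underscore separator below are assumptions; the actual suffix format used when talking to the TA3 server may differ.

```python
import uuid


def unique_adm_name(base_name):
    """Append a random UUID suffix so concurrent runs get distinct names."""
    # Assumption: an underscore separator; the real format may differ.
    return f"{base_name}_{uuid.uuid4()}"


print(unique_adm_name("ALIGN-ADM"))
```

Because `uuid.uuid4()` is random, two runs started with the same base name are very unlikely to collide on the server.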
Fixed
- Set a limit on the length of output strings in json schemas to avoid running into out of memory errors
- Fixed issue with outlines ADM by catching when target KDMAs are not formatted as dictionaries as expected during eval sessions
- Fixed issue with outlines ADM where responses weren't a list when only a single sample was requested
- Fixed issue with outlines ADM during target KDMA conversion (should only run to_dict on KDMAValue objects)
- Fixed a typo issue with outlines ADM where the positive system prompt was being used instead of the negative system prompt
- Fixed issue with llama3 outlines ADM experiment files where the model wasn't being correctly set
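The output-length limit from the first fix in this list can be pictured using the standard JSON Schema `maxLength` keyword. The sketch below is illustrative only: the field names and the value 512 are hypothetical, and the schemas actually used by the outlines ADMs are not shown in these notes.

```python
import json


def bounded_string_schema(field_names, max_length=512):
    """Build a JSON schema whose string fields carry a `maxLength` cap."""
    return {
        "type": "object",
        "properties": {
            # Capping each string bounds how much text can be generated.
            name: {"type": "string", "maxLength": max_length}
            for name in field_names
        },
        "required": list(field_names),
    }


schema = bounded_string_schema(["reasoning", "action_choice"], max_length=512)
print(json.dumps(schema, indent=2))
```

With structured generation, an unbounded string field lets the model keep producing tokens, which is one way memory use can grow without limit.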
Added
- Added new implementation of multi-KDMA ADM that regresses KDMA scores based on the outlines structure called outlines_regression_adm
- Added regression prompts to align_system/prompt_engineering/outlines_prompts.py
- Added KDMA descriptions to align_system/prompt_engineering/kdma_descriptions.yml
- Added new Outlines based structured ADM
- Added outlines based prompts (in align_system/prompt_engineering/outlines_prompts.py)
- Added dedicated function to utils for calculating votes (same voting scheme as the single KDMA ADM)
- Added top level config options to force determinism and fix seeds, along with an example experiment to demonstrate
- Added sampler parameter to outlines ADMs (example usage in align_system/configs/experiment/examples/outlines_sampler.yaml)
- Added option (on by default) to outlines ADM to filter votes to positive options only, can disable on the command line with +adm.inference_kwargs.filter_votes_to_positives=False
Deprecated
- The algorithm align_system/algorithms/chat_kdma_predicting_adm.py has been replaced by align_system/algorithms/outlines_regression_adm.py
- The functionality in align_system/algorithms/lib/chat/ is no longer being used
- Files in align_system/algorithms/lib/templates/ have been replaced by align_system/prompt_engineering/
0.4.0
Changed
- (Major) Changed CLI configuration over to Hydra; recommend reading the updated README
Fixed
- Prevent ADMs from modifying original action objects
Added
- Added new Oracle ADM (action based; attempts to "choose" best action based on KDMA values)
- Added new action based "Interface" for walking through Input Output JSON files
- Added simple accuracy metrics to the input-output file interface
- Added dedicated docs page for installing external (TA3, TA1s) services
0.3.3
Changed
- Modified the prompt for PulseTaggingADM. Also removed duplicated inference call within identify_tag_color method. Additionally, removed duplicated RED tag in-context example and replaced with missing BLACK tag example.
- Changed default maximization prompt for Kaleido
Fixed
- Applied attention fixes for Kaleido provided by UWash
- Fixed an "other choice" ordering issue in Kaleido ADM
Added
- Added an additional parsing guard in Llama2SingleKDMAADM
- Added do_sample as an init kwarg for Llama2SingleKDMAADM (set to False for temperature 0)