Releases: ITM-Kitware/align-system


21 Nov 13:41



  • Updated Phase 1 experiment configs for final Phase 1 Eval delivery


18 Nov 12:57



  • Added Phase 1 Evaluation experiment configuration files
  • Added ICL example selection method that gives larger weight to examples with the same character ids as the current probe. To use, set incontext.method to matching_characters.
  • Added ICL example selection method that gives larger weight to examples with the same action types as the current probe. To use, set incontext.method to matching_actions.
  • Added retrieved ICL examples to input-output.json
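
As a rough illustration of the matching_characters idea, the sketch below boosts the similarity score of ICL examples whose character ids match those of the current probe. The function name, the boost factor, and the data layout are all assumptions made for illustration; this is not the align-system implementation.

```python
def weight_by_matching_characters(probe_character_ids, examples, boost=2.0):
    """Return (example, weighted_score) pairs sorted from highest to lowest score.

    Hypothetical layout: each example is a dict with a "similarity" float and a
    "character_ids" list. `boost` is an assumed multiplier, not a real setting.
    """
    probe_ids = set(probe_character_ids)
    weighted = []
    for example in examples:
        score = example["similarity"]
        # Give larger weight to examples that share the probe's character ids.
        if set(example["character_ids"]) == probe_ids:
            score *= boost
        weighted.append((example, score))
    return sorted(weighted, key=lambda pair: pair[1], reverse=True)


examples = [
    {"id": "a", "similarity": 0.8, "character_ids": ["P1", "P2"]},
    {"id": "b", "similarity": 0.5, "character_ids": ["P3", "P4"]},
]
ranked = weight_by_matching_characters(["P3", "P4"], examples)
ranked_ids = [example["id"] for example, _ in ranked]
print(ranked_ids)
```

The same pattern would apply to matching_actions by comparing action types instead of character ids.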


14 Nov 00:36



  • Changed incontext normalization setting to be off (null/rawscores)
  • incontext.leave_one_out=false should now be configured as incontext.leave_one_out_strategy=null. The default is no leave-one-out behavior.
    The previous incontext.leave_one_out=true should now be specified as incontext.leave_one_out_strategy=scenario_description. Additionally, duplicate ICL examples,
    based on the chosen similarity strategy, are now removed.
  • Changed training_session flag for TA3 interface from boolean to string (expecting "full" or "solo" or None)
  • Changed the comparative regression prompt to only include the structured character information listed in relevant_structured_character_info in kdma_descriptions.yaml. To include all structured information that is unique across characters in the prompt (as was previously done automatically), specify relevant_structured_character_info = ['all_unique'].
  • Improved the QoL description and score_examples in kdma_descriptions.yaml
  • Changed default treatment parameter selection to use heuristic treatment options
  • Updated to transformers>=4.46.2 (and added necessary dependencies) to support newer models
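
The leave-one-out migration described above can be sketched as a small translation step. The helper below is hypothetical (it is not part of align-system) and uses a plain dict to stand in for the incontext config section.

```python
def migrate_leave_one_out(incontext_cfg):
    """Translate the old boolean setting into the new strategy setting.

    `incontext_cfg` is a plain dict standing in for the `incontext` config
    section; this helper is an illustrative assumption.
    """
    migrated = dict(incontext_cfg)
    if "leave_one_out" in migrated:
        old_value = migrated.pop("leave_one_out")
        # true -> scenario_description, false -> None (null in YAML).
        migrated["leave_one_out_strategy"] = (
            "scenario_description" if old_value else None
        )
    return migrated


print(migrate_leave_one_out({"leave_one_out": True}))
print(migrate_leave_one_out({"leave_one_out": False}))
```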


  • Added an option for sorting incontext examples responses: incontext.sort_actions
  • Added character-based leave one out option: incontext.leave_one_out_strategy=characters
  • Phase 1 experiments directory
  • Added the option to filter out TAG CHARACTER responses by setting filter_tag_character to true
  • Added a history-based alignment function for scalar targets that uses distance to a running mean. To use, specify inference_kwargs.distribution_matching as cumulative_average.
  • Added the option to enumerate the valid regression scores in the json schema by specifying inference_kwargs.enum_scores as true. Valid score options for each KDMA are added to align_system/prompt_engineering/kdma_descriptions.yml.
    Valid score options may be specified as a list via values, or as a range specified as a dictionary of min (inclusive), max (inclusive), and step.
  • Added option to configure ICL example ordering: incontext.most_similar_first=true for the most similar ICL example first, false for most similar ICL example last.
  • Added the option to normalize KDE targets based on prior data. To use, set adm.inference_kwargs.kde_norm=priornorm and adm.inference_kwargs.priornorm_factor to the normalization weight you want (1 is fully normalized, 0 is no normalization or rawscores, default is 0.5).
  • Added KDMA scaling factor option. Scale factors for each KDMA are added to align_system/prompt_engineering/kdma_descriptions.yml
  • Added heuristic treatment options component
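
To make the valid-score specification concrete, here is a minimal sketch of expanding either form into an explicit list. The dict layout (a "values" key or a "range" key holding min, max, and step) and the function name are assumptions; the actual layout in the KDMA descriptions file may differ.

```python
from decimal import Decimal


def expand_valid_scores(spec):
    """Expand a valid-score specification into an explicit list of floats.

    Assumed layout: {"values": [...]} or
    {"range": {"min": ..., "max": ..., "step": ...}} with both ends inclusive.
    """
    if "values" in spec:
        return [float(value) for value in spec["values"]]

    bounds = spec["range"]
    # Decimal avoids floating point drift when stepping (e.g. 0.1 + 0.2).
    low, high, step = (Decimal(str(bounds[key])) for key in ("min", "max", "step"))
    if step <= 0:
        raise ValueError("step must be positive")

    scores = []
    current = low
    while current <= high:
        scores.append(float(current))
        current += step
    return scores


print(expand_valid_scores({"values": [0, 0.5, 1]}))
print(expand_valid_scores({"range": {"min": 0, "max": 1, "step": 0.25}}))
```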


  • Fixed issue where choice history was persisting across scenarios by supporting a new optional ADM method, reset_history, which is called at the start of each new scenario
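
Because reset_history is optional, a caller has to check for it before invoking it. The sketch below shows one way that could look; the class names, the choice_history attribute, and start_scenario are hypothetical and only illustrate the pattern.

```python
class HistoryADM:
    """Hypothetical ADM that keeps a per-scenario choice history."""

    def __init__(self):
        self.choice_history = []

    def reset_history(self):
        self.choice_history = []


class StatelessADM:
    """Hypothetical ADM that does not define reset_history."""


def start_scenario(adm):
    # reset_history is optional, so only call it when the ADM defines it.
    reset = getattr(adm, "reset_history", None)
    if callable(reset):
        reset()


adm = HistoryADM()
adm.choice_history.append("previous scenario choice")
start_scenario(adm)
print(adm.choice_history)
start_scenario(StatelessADM())  # no error for ADMs without the method
```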


04 Nov 18:11



  • Moved incontext learning functionality into and updated the base outlines and comparative regression ADMs to use this module.
  • Moved the format_choices() function from the OutlinesTransformersADM class into a new utils file: so it can be used across ADMs.
  • Update example_data/input_output_files to use DRE training scenarios
  • Changed default config to use outlines_transformers_structured_baseline (rather than the older single_kdma_baseline)
  • Adjusted choose_action() to enable returning an ADM-specific choice_info dictionary that is written to the resulting input_output.json file
  • When the alignment target is optionally saved out in run_align_system, it is now saved as JSON instead of YAML


  • Added option to normalize KDMA values in incontext examples
  • Added a probabilistic option to alignment utilities. Exposed this option in oracle, comparative regression, and
    hybrid regression ADMs.
  • Example config for deterministic outlines-based ADM runs (align_system/configs/experiment/examples/outlines_force_determinism.yaml). Requires setting force_determinism to true and using the greedy sampler.
  • Added a history-based/cumulative KDE option to alignment utilities. Exposed this option in oracle and comparative regression.
  • Added true and predicted KDMA values to the log and input_output.json file for comparative regression ADM.
  • Added Phase 1 eval alignment targets for SoarTech


  • Fixed KDE target samples to be between 0 and 1
  • Fixed issue in alignment_utils logging (where kdma values can be a float/int rather than a list)
  • Now properly hydrating the meta_info field of input_output files
  • Fixed possible divide by zero during misaligned alignment
  • Properly hydrate Aid list


  • Removed old and unused command-line interface scripts
  • Removed old template files for integrating custom ADMs
  • Removed CLI builder functionality
  • Removed old configuration files from before Hydra


27 Aug 20:00



  • Split out our experiment configuration for our aligned DRE ADM to specific configs for SoarTech and Adept
  • Added logging for sampled KDMA target value, and estimated KDMA values in alignment_utils


  • Fixed issue in Oracle ADM which caused a key error exception when logging probabilities


26 Aug 17:34



  • Updated Hybrid Kaleido ADM to optionally (on by default) use alignment_utils to support distribution based alignment
  • Refactored outlines_adm to break out action parameter completion into separate functions for reuse
  • Update README ADM invocation examples for the dry run evaluation (DRE)


  • Added support for 'precision' in model_kwargs for outlines based adms (expecting either 'full' or 'half')
  • Add option to save per scenario x alignment target unstructured outputs (useful for "eval" TA3 session types)
  • Added DRE experiment configurations


  • Fixed case in Kaleido ADM where choices weren't necessarily unique
  • In outlines_adm ensure that an already tagged character can't be selected again for the TAG_CHARACTER action
  • In outlines_adm ensure that already visited characters can't be selected again for assessment actions
  • In outlines_adm ensure MOVE_TO specifies character ID
  • In run_align_system CLI, don't allow unseen characters except for MOVE_TO and MOVE_TO_EVAC actions
  • Typo fix for Quality of Life KDMA description
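
The character-selection fixes above amount to filtering the candidate list before a character is chosen. A minimal sketch follows, assuming characters are dicts with "id", "tag", and "visited" fields and that the caller supplies which action types count as assessments; these names are illustrative assumptions, not the outlines_adm code.

```python
def selectable_characters(characters, action_type, assessment_action_types=()):
    """Return the characters that may still be selected for `action_type`.

    Assumed layout: each character is a dict with "id", an optional "tag",
    and an optional "visited" flag. `assessment_action_types` is supplied by
    the caller because the set of assessment actions is not specified here.
    """
    if action_type == "TAG_CHARACTER":
        # An already tagged character can't be tagged again.
        return [c for c in characters if c.get("tag") is None]
    if action_type in assessment_action_types:
        # An already visited character can't be assessed again.
        return [c for c in characters if not c.get("visited", False)]
    return list(characters)


characters = [
    {"id": "P1", "tag": "RED", "visited": True},
    {"id": "P2", "tag": None, "visited": False},
]
taggable = selectable_characters(characters, "TAG_CHARACTER")
print([c["id"] for c in taggable])
```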


21 Aug 15:08



  • Updated KDMA descriptions and made the KDMA description yml file configurable
  • No longer overwriting data when followup prompts are used in the Outlines ADM
  • Small updates to Outlines ADM to be compatible with API updates
  • Updated the oracle and comparative regression ADMs to use AlignmentFunction class
  • Updated the comparative regression ADM's justification to use the best sample's reasoning


  • Added incontext learning option for Outlines-based structured ADM
  • Added incontext learning option for Outlines-based regression ADM
  • Added alignment targets for ADEPT training scenarios for the dry run evaluation
  • Added comparative regression ADM which predicts KDMA scores for all responses simultaneously, enabling comparative reasoning
  • Added template option for kdma_score_examples for regression and comparative regression ADMs
  • Added incontext learning with chain of thought reasoning for regression and comparative regression ADMs
  • Added some Kaleido hybrid experiments for the ADEPT dry run scenarios
  • Added Persona based ADM from UCB (based off single kdma adm)
  • Added alignment targets for SoarTech scenarios for the dry run evaluation
  • Added some random ADM experiments for the SoarTech dry run scenarios
  • Added intend_action to the ActionBasedScenarioInterface to comply with TA3 server updates
  • Added functionality in the oracle and comparative regression ADMs for aligning to KDE targets
  • Added a misaligned option for the Oracle ADM using any alignment function
  • Added configuration option to record timing information about choose_action
  • Added a scenario description prompt which includes all unique structured character info
  • Added a hybrid regression approach for the Outlines ADM.


  • Fixed issue for running in batches with batch size in outlines ADMs
  • Fixed character selection to use the character_id associated with the selected action when available, otherwise send a follow up prompt
  • Restrict actions with pre-specified treatments when those supplies are not available


19 Jul 18:02



  • Now adding a random UUID suffix to the ADM name parameter when talking to the TA3 server to prevent session clobbering
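
The idea behind the UUID suffix can be sketched in a few lines. The helper name and the underscore separator are assumptions for illustration; the actual naming format used when talking to the TA3 server may differ.

```python
import uuid


def unique_adm_name(base_name):
    """Append a random UUID so concurrent runs don't share a session name.

    The underscore separator is an assumed format.
    """
    return f"{base_name}_{uuid.uuid4()}"


first = unique_adm_name("my_adm")
second = unique_adm_name("my_adm")
print(first)
print(first != second)
```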


  • Set a limit on the length of output strings in json schemas to avoid running into out of memory errors
  • Fixed issue with outlines ADM by catching when target KDMAs are not formatted as dictionaries as expected during eval sessions
  • Fixed issue with outlines ADM where responses weren't a list when only a single sample was requested
  • Fixed issue with outlines ADM during target KDMA conversion (should only run to_dict on KDMAValue objects)
  • Fixed a typo issue with outlines ADM where the positive system prompt was being used instead of the negative system prompt
  • Fixed issue with llama3 outlines ADM experiment files where the model wasn't being correctly set


  • Added new implementation of multi-KDMA ADM that regresses KDMA scores based on the outlines structure called outlines_regression_adm
  • Added regression prompts to align_system/prompt_engineering/
  • Added KDMA descriptions to align_system/prompt_engineering/kdma_descriptions.yml
  • Added new Outlines based structured ADM
  • Added outlines based prompts (in align_system/prompt_engineering/)
  • Added dedicated function to utils for calculating votes (same voting scheme as the single KDMA ADM)
  • Added top level config options to force determinism and fix seeds; along with an example experiment to demonstrate
  • Added sampler parameter to outlines ADMs (example usage in align_system/configs/experiment/examples/outlines_sampler.yaml)
  • Added option (on by default) to outlines ADM to filter votes to positive options only, can disable on the command line with +adm.inference_kwargs.filter_votes_to_positives=False


  • The algorithm align_system/algorithms/ has been replaced by align_system/algorithms/
  • The functionality in align_system/algorithms/lib/chat/ is no longer being used
  • Files align_system/algorithms/lib/templates/ have been replaced by align_system/prompt_engineering/


25 Jun 16:15



  • (Major) Changed CLI configuration over to Hydra; recommend reading the updated README


  • Prevent ADMs from modifying original action objects
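
A common way to keep an ADM from modifying the caller's action objects is to work on deep copies. The sketch below illustrates that pattern with plain dicts; the function name and the "justification" field are hypothetical, and the actual fix may be implemented differently.

```python
import copy


def choose_first_action(available_actions):
    """Pick the first action without mutating the caller's objects."""
    # Work on copies so the original action objects are never modified.
    actions = copy.deepcopy(available_actions)
    chosen = actions[0]
    chosen["justification"] = "picked the first option"  # mutates the copy only
    return chosen


original_actions = [{"action_id": "a1"}, {"action_id": "a2"}]
chosen = choose_first_action(original_actions)
print(chosen)
print(original_actions[0])
```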


  • Added new Oracle ADM (action based; attempts to "choose" best action based on KDMA values)
  • Added new action based "Interface" for walking through Input Output JSON files
  • Added simple accuracy metrics to the input-output file interface
  • Added dedicated docs page for installing external (TA3, TA1s) services


24 Apr 18:14



  • Modified the prompt for PulseTaggingADM. Also removed duplicated inference call within identify_tag_color
    method. Additionally, removed duplicated RED tag in-context example and replaced with missing BLACK tag
  • Changed default maximization prompt for Kaleido


  • Applied attention fixes for Kaleido provided by UWash
  • Fixed an "other choice" ordering issue in Kaleido ADM


  • Added an additional parsing guard in Llama2SingleKDMAADM
  • Added do_sample as an init kwarg for Llama2SingleKDMAADM (set to False for temperature 0)