Update docs and changelog for 0.2.0 release
dmjoy committed Sep 18, 2023
1 parent 8d456f0 commit 4b2ef53
Showing 3 changed files with 128 additions and 34 deletions.
12 changes: 11 additions & 1 deletion CHANGELOG.md
@@ -4,16 +4,26 @@ This changelog follows the specifications detailed in: [Keep a Changelog](https:
This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html), although we have not yet reached a `1.0.0` release.


## Unreleased
## Version 0.2.0

#### Added

* Added llama 2 chat action-based ADM (via new CLI script `run_action_based_chat_baseline`)

* Added llama-index falcon action-based ADM (via new CLI script `run_action_based_align_system`)

* Added support for CACI's new action-based TA3 interface, along with a new action-based template CLI script

* Added support for new probe types "PatientOrdering", "SelectTag", and "SelectTreatment"

#### Changed

* Environment now expects Python version >=3.9 (rather than exactly 3.8)

* Deprecated support for old TA3 interface (code not fully removed yet)

* Updated several dependency versions

* Changed BERT implementation to `bert_score` package

#### Fixed
145 changes: 114 additions & 31 deletions README.md
@@ -13,13 +13,14 @@ The ALIGN System can interface with a few different services provided
by other teams. These interfaces may require additional setup
assuming you need to run the services locally for testing / debugging.

#### TA3 API
#### TA3 Action-based API

The code for the TA3 service can be found at: [TA3 API
Repository](https://github.com/NextCenturyCorporation/itm-mvp).
The code for the TA3 Action-based service can be found at: [TA3 Evaluation Server API
Repository](https://github.com/NextCenturyCorporation/itm-evaluation-server).

You'll also need to install the client module that's included with
this repository, so ensure that you have this code cloned locally.
There's a corresponding client module: [TA3 Evaluation Client](https://github.com/NextCenturyCorporation/itm-evaluation-client)

Note that this client module isn't a required dependency for the ALIGN system code.

#### Soartech's TA1 API

@@ -39,42 +40,38 @@ This API provides alignment scores for answered probes and scenarios.
It's generally recommended to set up a virtual Python environment to neatly manage dependencies (e.g. using `venv` or `conda`). The `align-system` code can be installed as a Python module with `pip
install git+https://github.com/ITM-Kitware/align-system.git`.

## Running the system

In the Python environment you have set up, a CLI application called `run_align_system` should now be available. This single entrypoint supports interfacing with both local files on disk, and the TA3 web-based API. Running the script with `--help` shows which interfaces are available:
## Running the system against the TA3 action-based API

```
$ run_align_system --help
usage: run_align_system [-h] {TA3,LocalFiles,TA1Soartech,TA1Adept} ...
$ run_action_based_align_system --help
usage: run_action_based_align_system [-h] {TA3ActionBased} ...
ALIGN System CLI
ALIGN Action Based System CLI
positional arguments:
{TA3,LocalFiles,TA1Soartech,TA1Adept}
Select interface. Adding --help after interface selection will print interface and system specified arguments
TA3 Interface with CACI's TA3 web-based service
LocalFiles Interface with local scenario / probe JSON data on disk
TA1Soartech Interface with Soartech's TA1 web-based service
TA1Adept Interface with Adept's TA1 web-based service
{TA3ActionBased} Select interface. Adding --help after interface selection will print interface and system specified arguments
TA3ActionBased Interface with CACI's TA3 web-based service
optional arguments:
-h, --help show this help message and exit
options:
-h, --help show this help message and exit
```

Running `--help` after the selected interface prints the full set of options for the interface and system. E.g.:

```
$ run_align_system TA3 --help
usage: run_align_system TA3 [-h] [-u USERNAME] [-s SESSION_TYPE] [-e API_ENDPOINT] [-m MODEL] [-t] [-a ALGORITHM] [-A ALGORITHM_KWARGS] [--similarity-measure SIMILARITY_MEASURE]
$ run_action_based_align_system TA3ActionBased --help
usage: run_action_based_align_system TA3ActionBased [-h] [-u USERNAME] [-s SESSION_TYPE] [-e API_ENDPOINT] [--training-session] [-m MODEL] [-t] [-a ALGORITHM] [-A ALGORITHM_KWARGS]
[--similarity-measure SIMILARITY_MEASURE]
optional arguments:
options:
-h, --help show this help message and exit
-u USERNAME, --username USERNAME
ADM Username (provided to TA3 API server, default: "ALIGN-ADM")
-s SESSION_TYPE, --session-type SESSION_TYPE
TA3 API Session Type (default: "eval")
-e API_ENDPOINT, --api_endpoint API_ENDPOINT
Restful API endpoint for scenarios / probes (default: "http://127.0.0.1:8080")
--training-session Return training related information from API requests
-m MODEL, --model MODEL
LLM Baseline model to use
-t, --align-to-target
@@ -87,16 +84,70 @@ optional arguments:
Similarity measure to use (default: 'bert')
```

Here's an example invocation of the system using the TA3 interface:
Here's an example invocation of the system using the TA3 Action-based interface (assuming it's running locally on port `8080`):
```
$ run_align_system TA3 -s soartech --algorithm "llama_index" --model falcon --algorithm-kwargs '{"domain_docs_dir": "/data/shared/MVPData/DomainDocumentsPDF"}'
$ run_action_based_align_system TA3ActionBased \
-e "http://127.0.0.1:8080" \
--algorithm "llama_index" \
--model falcon \
-s soartech \
--algorithm-kwargs '{"domain_docs_dir": "/data/shared/MVPData/DomainDocumentsPDF"}'
```
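
The `--algorithm-kwargs` value is a JSON-encoded dictionary, which is easy to mis-quote by hand. As a minimal sketch, assuming you launch the CLI from a Python script, you can build the dictionary in Python and let `json.dumps` and `shlex.join` handle the encoding and shell quoting. The helper `build_command` is hypothetical and not part of `align-system`; the flags are the ones listed in the help output above.

```python
import json
import shlex


def build_command(endpoint, algorithm, model, session_type, algorithm_kwargs):
    """Assemble an argv list for run_action_based_align_system.

    Hypothetical helper for illustration; not part of align-system.
    """
    return [
        "run_action_based_align_system", "TA3ActionBased",
        "-e", endpoint,
        "--algorithm", algorithm,
        "--model", model,
        "-s", session_type,
        # json.dumps guarantees valid JSON (double quotes, lowercase booleans).
        "--algorithm-kwargs", json.dumps(algorithm_kwargs),
    ]


argv = build_command(
    endpoint="http://127.0.0.1:8080",
    algorithm="llama_index",
    model="falcon",
    session_type="soartech",
    algorithm_kwargs={"domain_docs_dir": "/data/shared/MVPData/DomainDocumentsPDF"},
)

# shlex.join yields a correctly quoted, copy-pasteable shell command line.
print(shlex.join(argv))
```

The same `argv` list can be passed directly to `subprocess.run(argv)`, which avoids shell quoting altogether.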

*NOTE* - The first time you run the system it can take upwards of a
half-hour to download the LLM model (which is roughly 25GB).
Subsequent runs of the system should only take a few minutes as the
model is cached.


## Running the system against TA1 services or local files

In the Python environment you have set up, a CLI application called `run_align_system` should now be available. This single entrypoint supports interfacing with both local files on disk and the TA1 web-based services (Soartech and Adept). Running the script with `--help` shows which interfaces are available:

```
$ run_align_system --help
usage: run_align_system [-h] {TA1Soartech,LocalFiles,TA1Adept} ...
ALIGN System CLI
positional arguments:
{TA1Soartech,LocalFiles,TA1Adept}
Select interface. Adding --help after interface selection will print interface and system specified arguments
TA1Soartech Interface with Soartech's TA1 web-based service
LocalFiles Interface with local scenario / probe JSON data on disk
TA1Adept Interface with Adept's TA1 web-based service
options:
-h, --help show this help message and exit
```

Running `--help` after the selected interface prints the full set of options for the interface and system. E.g.:

```
$ run_align_system TA1Soartech --help
usage: run_align_system TA1Soartech [-h] [-s [SCENARIOS ...]] [--alignment-targets [ALIGNMENT_TARGETS ...]] [-e API_ENDPOINT] [-m MODEL] [-t] [-a ALGORITHM] [-A ALGORITHM_KWARGS] [--similarity-measure SIMILARITY_MEASURE]
options:
-h, --help show this help message and exit
-s [SCENARIOS ...], --scenarios [SCENARIOS ...]
Scenario IDs (default: 'kickoff-demo-scenario-1')
--alignment-targets [ALIGNMENT_TARGETS ...]
Alignment target IDs (default: 'kdma-alignment-target-1')
-e API_ENDPOINT, --api_endpoint API_ENDPOINT
Restful API endpoint for scenarios / probes (default: "http://127.0.0.1:8084")
-m MODEL, --model MODEL
LLM Baseline model to use
-t, --align-to-target
Align algorithm to target KDMAs
-a ALGORITHM, --algorithm ALGORITHM
Algorithm to use
-A ALGORITHM_KWARGS, --algorithm-kwargs ALGORITHM_KWARGS
JSON encoded dictionary of kwargs for algorithm initialization
--similarity-measure SIMILARITY_MEASURE
Similarity measure to use (default: 'bert')
```
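
The help output above has the shape of an `argparse` parser with one sub-command per interface. The sketch below reproduces that shape for a small subset of the options. It is an illustration under that assumption, not the actual `align-system` implementation, and the scenario IDs passed at the end are made up.

```python
import argparse


def build_parser():
    """Build a parser with one sub-command per interface (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="run_align_system", description="ALIGN System CLI")
    subparsers = parser.add_subparsers(
        dest="interface", required=True, help="Select interface")

    soartech = subparsers.add_parser("TA1Soartech")
    # nargs="*" accepts zero or more scenario IDs after -s.
    soartech.add_argument("-s", "--scenarios", nargs="*",
                          default=["kickoff-demo-scenario-1"])
    soartech.add_argument("-e", "--api_endpoint",
                          default="http://127.0.0.1:8084")

    subparsers.add_parser("LocalFiles")
    return parser


# Hypothetical scenario IDs, used only to show how the list is collected.
args = build_parser().parse_args(["TA1Soartech", "-s", "scenario-a", "scenario-b"])
print(args.interface, args.scenarios, args.api_endpoint)
```

Because each interface owns its own sub-parser, `--help` placed after the interface name prints only that interface's options.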


### Example Data

We've included some example scenario, probe, and alignment target data for testing. These files can be found in the `example_data` directory. Here's an example system invocation with the provided example files:
@@ -114,35 +165,67 @@ run_align_system LocalFiles \

## ADM Invocations

### Simple Baseline ADM
### Simple Action-based Baseline ADM

Simple baseline (unaligned) system using the `falcon` model:
```
run_align_system TA3 \
run_action_based_align_system TA3ActionBased \
--algorithm "llama_index" \
--model falcon \
-s soartech \
--algorithm-kwargs '{"retrieval_enabled": false}' \
--algorithm "llama_index" \
--model falcon
```

### Simple Aligned ADM
### Simple Action-based Aligned ADM

Simple aligned system using the `falcon` model (requires domain document PDFs):
```
run_align_system TA3 \
run_action_based_align_system TA3ActionBased \
--algorithm "llama_index" \
--model falcon \
-s soartech \
--algorithm-kwargs '{"domain_docs_dir": "/path/to/DomainDocumentsPDF"}' \
--algorithm-kwargs '{"retrieval_enabled": false}' \
--algorithm "llama_index" \
--model falcon \
--align-to-target
```

### Action-based Chat Baseline ADM

Unaligned system using a Llama 2 chat model:

```
run_action_based_chat_baseline TA3ActionBased \
-s adept \
--model meta-llama/Llama-2-13b-chat-hf
```

### Action-based Chat Aligned ADM

Aligned system using a Llama 2 chat model:

```
run_action_based_chat_baseline TA3ActionBased \
-s adept \
--model meta-llama/Llama-2-13b-chat-hf \
--precision half \
--align-to-target
```


## System Requirements by Algorithm / Model

*Note: This table is a work-in-progress and will evolve as we add new
algorithms / models*

|Algorithm|Model|RAM|GPU Memory|Disk Space|
|---------|-----|---|----------|----------|
|llama_index|falcon|>32GB|~18GB|~32GB|
|Algorithm|Model|RAM|GPU Memory|Disk Space|Hugging Face Link|Notes|
|---------|-----|---|----------|----------|-----------------|-----|
|llama_index|tiiuae/falcon-7b-instruct|>32GB|~18GB|~13GB|https://huggingface.co/tiiuae/falcon-7b-instruct||
|llm_chat|Llama-2-7b-chat-hf|>32GB|~18GB|~13GB|https://huggingface.co/meta-llama/Llama-2-7b-chat-hf|Requires license agreement: https://ai.meta.com/llama/license/|
|llm_chat|Llama-2-13b-chat-hf|>48GB|~28GB|~25GB|https://huggingface.co/meta-llama/Llama-2-13b-chat-hf|Requires license agreement: https://ai.meta.com/llama/license/|
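
Since the first run downloads the model weights, it can help to check free disk space beforehand. A minimal sketch using only the standard library: the figures are copied from the "Disk Space" column above, and `has_enough_disk` is a hypothetical helper. Pass the directory where your models are cached as `path`; the default of the current directory is only a placeholder.

```python
import shutil

# Approximate disk requirements in GB, copied from the "Disk Space" column.
DISK_GB = {
    "tiiuae/falcon-7b-instruct": 13,
    "meta-llama/Llama-2-7b-chat-hf": 13,
    "meta-llama/Llama-2-13b-chat-hf": 25,
}


def has_enough_disk(model, path="."):
    """Return True if the filesystem containing `path` has room for `model`."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= DISK_GB[model]


for name in DISK_GB:
    status = "ok" if has_enough_disk(name) else "insufficient disk space"
    print(f"{name}: {status}")
```

RAM and GPU memory are not checked here, because doing so needs third-party packages.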


## Quicklinks
5 changes: 3 additions & 2 deletions pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "align-system"
version = "0.1.1"
version = "0.2.0"
description = ""
authors = ["David Joy <10147749+dmjoy@users.noreply.github.com>"]
readme = "README.md"
@@ -16,7 +16,7 @@ python = ">=3.9,<4.0"
torch = { version = "^2.0.1", source = "pytorch" }
transformers = "^4.33.1"
llama-index = "^0.8.21"
langchain = ">0.0.283"
langchain = "^0.0.283"
sentence-transformers = "^2.2.2"
accelerate = "^0.22.0"
pypdf = "^3.15.5"
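
The `langchain` entry appears in this hunk with two different operators, `>0.0.283` and `^0.0.283`, and the difference is larger than it looks. Under Poetry's caret rule the upper bound comes from the left-most non-zero component, so a caret on a `0.0.x` version allows only that single patch release, whereas `>` has no upper bound. The sketch below computes the caret range. `caret_bounds` is a hypothetical helper, and it handles only three-part versions with at least one non-zero component.

```python
def caret_bounds(version):
    """Return (lower, upper) for a Poetry-style caret constraint.

    The allowed range is lower <= v < upper. This sketch assumes a
    three-part MAJOR.MINOR.PATCH version with a non-zero component.
    """
    parts = [int(p) for p in version.split(".")]
    if len(parts) != 3 or not any(parts):
        raise ValueError(f"unsupported version for this sketch: {version!r}")
    # The caret permits changes that keep the left-most non-zero part fixed.
    idx = next(i for i, p in enumerate(parts) if p != 0)
    upper = parts[:idx] + [parts[idx] + 1] + [0] * (len(parts) - idx - 1)
    return version, ".".join(str(p) for p in upper)


for spec in ("0.0.283", "0.22.0", "2.0.1"):
    lower, upper = caret_bounds(spec)
    print(f"^{spec}  ->  >={lower},<{upper}")
```
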
@@ -29,6 +29,7 @@ bert-score = "^0.3.13"
run_align_system = 'align_system.cli.run_align_system:main'
run_action_based_align_system = 'align_system.cli.run_action_based_align_system:main'
run_chat_baseline = 'align_system.cli.run_chat_baseline:main'
run_action_based_chat_baseline = 'align_system.cli.run_action_based_chat_baseline:main'

[build-system]
requires = ["poetry-core"]
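
Each script entry above maps a command name to a `module:function` string, which is how the new `run_action_based_chat_baseline` command becomes available after installation. As a rough sketch of what that string means, the hypothetical helper below resolves such a reference with `importlib`. It uses a standard-library target so that it runs without `align-system` installed.

```python
import importlib


def resolve_entry_point(spec):
    """Resolve a 'package.module:attr' string to the object it names."""
    module_name, _, attr = spec.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Demonstrated with a standard-library target so the sketch runs anywhere.
dumps = resolve_entry_point("json:dumps")
print(dumps({"ok": True}))

# With align-system installed, the new script would resolve the same way:
# main = resolve_entry_point(
#     "align_system.cli.run_action_based_chat_baseline:main")
```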