All notable changes to this project will be documented in this file.
- Support for Watson Assistant v2 API log extraction and analysis
- Support for Watson Assistant v2 API accuracy (blind test and standard test only)
- Optionally compare two separate blind test runs and compute comparison metrics with `compare_results.py`
- Pass Watson Assistant system_settings to test workspaces
- `validateWS.py` has moved to the `log_analytics` folder. `waObjects.py` is extracted from that module to provide workspace parsing logic, usable in other modules including analytics
- Log extraction and basic analysis capabilities in the `log_analytics` folder.
- Configurable k-fold union output file name
- Support for testing Natural Language Classifier models
- Support for username/password authentication. (Watson APIs require API key authentication)
- Generates confusion matrix in k-fold and blind modes
- Default values for most configuration options, which greatly reduces configuration effort.
- Utility for extracting conversation logs and utterances based on a given dialog node.
- Static analysis utilities for conversation dialog evaluation.
- Intent metrics include F1 score.
- Support for IAM API key
- Static analysis utilities for conversation dialog evaluation.
- Support optional partial credit intent tables
- Support local workspace JSON as training input
- Added intent metrics generation script
- Made service base URL configurable
- Refactored training input to only use workspace ID
- Test cases for three modes and all sub-scripts are provided along with testing resources
- Handle exceptions by optionally cleaning out workspaces after the training phase.
- Exposed maximum test rate and weight mode parameters.
- Allowed intents with confidence less than 0.2 to be returned.
- Calculate precision using different weighting configurations.
- Highlight confidence threshold on the precision curve figure.
- Union all folds test output as `kfold-test-out-union.csv`.
- Intent description generation script is provided in a subfolder.
- Reserved entities can be imported to workspaces.
- Coroutine-based concurrent testing is provided in a sub-script.