This is the source code for AcTracer: Active Testing of Large Language Model via Multi-Stage Sampling
├── balanced-kmeans
├── baseline_config
├── evaluation
├── kneed
├── our_method_config
├── state_collection # Core functionalities for collecting and analyzing internal states of LLMs
│ ├── base_store.py
│ ├── disk_store.py
│ ├── __init__.py
│ ├── reshape_activations.py
│ ├── state_collector.py # The StateCollector collects the intermediate states of LLMs (see the hook sketch below the tree).
│ ├── store_activation_hook.py
│ ├── tensor_store.py
│ └── tensor_types.py
├── strategy
│ ├── cluster_algo.py
│ ├── __init__.py
│ ├── k_center
│ ├── merge_result.py
│ ├── mmd_critic
│ ├── partition.py # Core functionalities for the proposed method AcTracer with class EmbedPartition.
│ └── test_strategy.py # Baseline methods (references listed at the end of this README).
├── eval_partition_config.yaml # Example configurations for AcTracer
├── eval_partition.py # Evaluation scripts for AcTracer
├── evaluate_test_selection.py # Evaluation scripts for baseline methods
├── sub_sample_ablation.py # Evaluation scripts for ablation study
└── utils.py # Utility functions
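
For orientation, the following is a minimal sketch (not the repository's actual code) of how intermediate LLM states can be captured with PyTorch forward hooks, in the spirit of state_collector.py and store_activation_hook.py. The model name, module path, and variable names here are illustrative assumptions:

# Sketch: collect intermediate hidden states of an LLM via forward hooks.
# "gpt2" and model.transformer.h are illustrative; any HF causal LM works
# similarly with its own block module path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

collected = {}  # layer name -> list of activation tensors

def make_hook(name):
    def hook(module, inputs, output):
        # Transformer blocks often return a tuple; keep only the hidden states.
        hidden = output[0] if isinstance(output, tuple) else output
        collected.setdefault(name, []).append(hidden.detach().cpu())
    return hook

handles = [
    block.register_forward_hook(make_hook(f"block_{i}"))
    for i, block in enumerate(model.transformer.h)  # gpt2-specific module path
]

with torch.no_grad():
    batch = tokenizer("What is active testing?", return_tensors="pt")
    model(**batch)

for h in handles:  # always remove hooks once states are stored
    h.remove()

print({k: v[0].shape for k, v in collected.items()})

Detaching and moving tensors to CPU keeps GPU memory bounded while states accumulate, which matters when tracing many prompts.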
Reference evaluation package: https://github.com/EleutherAI/lm-evaluation-harness
We use lm-evaluation-harness to evaluate the performance of LLMs on the test set. Example evaluation configurations are provided in the evaluation_configs folder.
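
As a pointer for reproduction, here is a minimal sketch of invoking the harness's documented Python entry point (v0.4-style simple_evaluate). The model and task names are placeholders, and the exact signature may differ across harness versions:

# Sketch: score an LLM with lm-evaluation-harness from Python.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                    # HuggingFace backend
    model_args="pretrained=gpt2",  # placeholder model
    tasks=["hellaswag"],           # placeholder task
    num_fewshot=0,
)
print(results["results"])  # per-task metrics, mirroring results.json below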
The output format of the evaluation is as follows:
*dataset#1*.jsonl # Inference results for each prompt
[
  {
    "doc_id": 0,                  # number: document index
    "doc": {},                    # input question
    "target": string,             # ground truth
    "arguments": [
      [string],                   # the input prompt
      {}                          # generation configuration
    ],
    "resps": [[string]],          # responses
    "filtered_resps": [string],   # filtered responses
    "metric": float               # score
  }
]
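
A small sketch of aggregating the per-prompt scores from such a file; the file name is a placeholder, and the field names follow the schema above:

# Sketch: average the per-prompt "metric" field from a per-dataset .jsonl file.
import json

scores = []
with open("dataset1.jsonl") as f:
    for line in f:
        record = json.loads(line)
        scores.append(float(record["metric"]))

print(f"mean score over {len(scores)} prompts: {sum(scores) / len(scores):.4f}")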
results.json # Aggregated scores and task configurations
{
  "results": {
    "dataset#1": {},
    "dataset#2": {},
    ...
  },
  "configs": {
    "dataset#1": {"target_delimiter": string, ...},
    "dataset#2": {"target_delimiter": string, ...},
    ...
  }
}
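
Similarly, a sketch of reading the aggregated report:

# Sketch: print the per-dataset metrics from results.json.
import json

with open("results.json") as f:
    report = json.load(f)

for dataset, metrics in report["results"].items():
    print(dataset, metrics)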
We share our experiment results under different sampling rates for each method at: https://drive.google.com/drive/folders/1xcmGgqeQjdNUuKux6iu4JK6YKNJ4uaJa?usp=sharing
Third-party packages used in this repository:
SparseAutoencoder: https://github.com/ai-safety-foundation/sparse_autoencoder
TransformerLens: https://github.com/neelnanda-io/TransformerLens
Kneed: https://github.com/arvkevi/kneed
Balanced-Kmeans: https://github.com/kernelmachine/balanced-kmeans/tree/main
lm-evaluation-harness: https://github.com/EleutherAI/lm-evaluation-harness
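
For context, kneed provides knee-point detection (commonly used to pick a cluster count from a decreasing inertia curve). A minimal, self-contained example of its documented KneeLocator API on synthetic data:

# Example: locate the elbow of a decreasing convex curve with kneed.
from kneed import KneeLocator

x = list(range(1, 11))
y = [100, 55, 30, 18, 12, 9, 7.5, 6.5, 6, 5.8]  # synthetic "elbow" curve

kl = KneeLocator(x, y, curve="convex", direction="decreasing")
print("knee at x =", kl.knee)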
Baseline references:
[FSE'19] Boosting Operational DNN Testing Efficiency through Conditioning
[TOSEM'20] Practical Accuracy Estimation for Efficient Deep Neural Network Testing