Code to fully reproduce benchmark results (and to extend for your own purposes) from the paper: LIMESegment: Meaningful, Realistic Time Series Explanations
The goal of this package is to provide a modular way of adapting LIME to Time Series data. It provides methods for:
- Segmenting a time series
- Perturbing a time series
- Measuring similarity between time series
- Putting these together to produce explanations of the form 'This segment was most important for the overall classification'
Setup:
- Clone the repository
- Install dependencies: `pip3 install -r requirements.txt`
- Utils:
- data.py: Contains code for generating all synthetic datasets used in the experiments and for loading the UCR datasets used in the experiments
- models.py: Contains code for building and training all classification models used to test explanations:
- KNN: K-Nearest Neighbour algorithm implemented with scikit-learn
- CNN: 1D Convolutional Neural Network implemented with Keras
- LSTMFCN: State-of-the-art hybrid LSTM and Fully Convolutional Network (LSTM-FCN) implemented by Karim et al.: https://github.com/titu1994/LSTM-FCN
- explanations.py: Contains code for generating explanations:
- LIMESegment: Our proposed adaptation of LIME to time series data
- Neves: The proposed adaptation of LIME to time series data of Neves et al.: https://boa.unimib.it/retrieve/handle/10281/324847/492202/Manuscript.pdf
- Leftist: The proposed adaptation of LIME to time series data of Guilleme et al.: https://ieeexplore.ieee.org/abstract/document/8995349
- Also contains code for NNSegment, our proposed segmentation algorithm and a building block of LIMESegment
- perturbations.py: Contains code for the perturbation strategies evaluated:
- RBP: Realistic Background Perturbation, a frequency-based perturbation strategy proposed in the paper and a building block of LIMESegment
- Zero, random noise, and Gaussian blur perturbations (illustrated in the sketch after this list)
- metrics.py: Contains code for the Robustness and Faithfulness measures used to evaluate each explanation module
- constants.py: Contains details for loading and processing the UCR Time Series datasets used in experiments
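The zero, random noise, and Gaussian blur strategies are simple enough to sketch generically. The snippet below illustrates those baselines only (RBP itself lives in Utils/perturbations.py); `perturb_segment` is a hypothetical helper for illustration, not the repo's API.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def perturb_segment(ts, start, end, strategy="zero"):
    """Hypothetical helper: replace ts[start:end] according to one of the
    baseline perturbation strategies described above."""
    perturbed = ts.copy()
    segment = perturbed[start:end]
    if strategy == "zero":
        perturbed[start:end] = 0.0                             # zero perturbation
    elif strategy == "noise":
        perturbed[start:end] = np.random.normal(segment.mean(),
                                                segment.std(),
                                                segment.shape)  # random noise
    elif strategy == "blur":
        perturbed[start:end] = gaussian_filter1d(segment, sigma=2,
                                                 axis=0)        # Gaussian blur
    return perturbed

ts = np.sin(np.linspace(0, 10, 100)).reshape(-1, 1)  # toy series of shape (T, 1)
perturbed = perturb_segment(ts, 20, 40, strategy="blur")
```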
- Experiments: Jupyter notebooks containing reproducible implementations of our paper experiments:
- Segmentation: Evaluate NNSegment against state-of-the-art segmentation approaches
- Background Perturb: Evaluate RBP against other perturbation strategies
- Locality: Evaluate the use of DTW in LIMESegment against the Euclidean distance measure
- RobustnessFaithfulness.py: Evaluate LIMESegment against Neves and Leftist on overall explanations (a generic sketch of a faithfulness-style check appears after this list)
- Data: Contains the UCR datasets used in experiments. Note that the apnea dataset used in the Segmentation tests is not included in this repo but can be downloaded from: https://www.physionet.org/content/apnea-ecg/1.0.0/
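As a rough illustration of the kind of faithfulness-style check these experiments run (a generic sketch, not the repo's exact metric): perturb the segment an explainer ranks most important and measure how much the model's output changes; a faithful explanation should produce a large change.

```python
import numpy as np

def faithfulness_check(ts, model, importance, segments, perturb_fn):
    """Generic sketch. Assumes a Keras-style model whose predict accepts
    arrays of shape (n, T, 1).
    - importance: per-segment importance scores
    - segments: list of (start, end) index pairs
    - perturb_fn: e.g. the perturb_segment helper sketched above"""
    top = int(np.argmax(np.abs(np.asarray(importance))))
    start, end = segments[top]
    original = model.predict(ts[np.newaxis, ...])
    perturbed = model.predict(perturb_fn(ts, start, end)[np.newaxis, ...])
    return float(np.abs(original - perturbed).max())
```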
LIMESegment takes as input:
- ts: TS array of shape T x 1 where T is the length of the time series
- model: Model trained on a dataset array of shape n x T x 1
- model_type: String indicating whether the classification model produces binary output "class" or probability output "proba"; default "class"
- distance: Distance metric to be used by LIMESegment; default 'dtw'
- window_size: Window size to be used by NNSegment; default T/5 (a standalone NNSegment sketch follows this list)
- cp: Number of change points to be determined by NNSegment; default 3
- f: Frequency parameter for RBP; default T/10
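The window_size and cp arguments are forwarded to NNSegment, so the segmentation can also be run on its own. The call below is a sketch: the NNSegment signature is an assumption based on the parameters documented above, so check Utils/explanations.py before relying on it.

```python
import numpy as np
from Utils.explanations import NNSegment  # assumed to be exported from this module

ts = np.random.randn(200, 1)  # toy series of length T = 200

# Assumed argument order: series, window size (default T/5), number of change points (default 3)
change_points = NNSegment(ts, 200 // 5, 3)
print(change_points)  # candidate change-point indices splitting ts into segments
```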
Run:
```python
from Utils.explanations import LIMESegment
explanations = LIMESegment(ts, model, model_type, distance, window_size, cp, f)
```
Returns the segment importance vector.
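For a fuller end-to-end run, here is a self-contained sketch. The toy data and the small CNN stand in for the loaders in Utils.data and the models in Utils.models, and passing the parameters by keyword assumes the function definition uses the names documented above.

```python
import numpy as np
from tensorflow import keras
from Utils.explanations import LIMESegment

# Toy binary classification data: n series of length T, shape (n, T, 1)
n, T = 200, 100
X = np.random.randn(n, T, 1)
y = (X.mean(axis=(1, 2)) > 0).astype(int)

# Small 1D CNN standing in for the models built by Utils.models
model = keras.Sequential([
    keras.layers.Input(shape=(T, 1)),
    keras.layers.Conv1D(8, 5, activation="relu"),
    keras.layers.GlobalMaxPooling1D(),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X, y, epochs=2, verbose=0)

# Explain one series, passing the documented defaults explicitly
explanation = LIMESegment(X[0], model, model_type="proba", distance="dtw",
                          window_size=T // 5, cp=3, f=T // 10)
print(explanation)  # segment importance vector
```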