Explaining Machine Learning Classifiers using LIME

This repository contains two notebooks, each covering one machine learning project:

YouTube spam filter: trains an ML model on the TubeSpam dataset (YouTube Spam Collection Data Set). The classifier is AdaBoost.
Sentiment analysis: trains an ML model on the Multi-Domain Sentiment Dataset (version 2.0). The classifier is a Random Forest classifier.

Both classifiers are explained using LIME. LIME is based on the work presented in the paper arXiv:1602.04938 (see References). LIME can explain any black-box classifier with two or more classes. The only requirement is that the classifier implements a function that takes raw text or a numpy array and outputs a probability for each class. Support for scikit-learn classifiers is built in.
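The interface requirement above can be illustrated with a minimal sketch. It assumes a scikit-learn pipeline (TF-IDF features feeding AdaBoost) and a handful of made-up comments; neither the toy data nor the exact pipeline is taken from the notebooks. The point is only that `predict_proba` accepts raw text and returns one probability per class.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Made-up toy comments standing in for a real dataset (1 = spam, 0 = ham).
texts = [
    "check out my channel and subscribe",
    "free gift cards click this link",
    "subscribe to my channel for free stuff",
    "click here to win a free phone",
    "this song is amazing",
    "love the guitar solo in this video",
    "great performance, brings back memories",
    "the lyrics are so beautiful",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# The pipeline bundles vectorizer and classifier, so its predict_proba
# takes a list of raw strings directly.
pipeline = make_pipeline(
    TfidfVectorizer(),
    AdaBoostClassifier(n_estimators=50, random_state=0),
)
pipeline.fit(texts, labels)

# One row per input text, one column per class; each row sums to 1.
probs = pipeline.predict_proba(["subscribe to my channel", "what a beautiful song"])
print(probs.shape)
print(np.round(probs, 3))
```

A function with this shape is what the `lime` package expects, for example as the second argument to `LimeTextExplainer.explain_instance(text, pipeline.predict_proba, num_features=...)`.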

What are explanations?

Intuitively, an explanation is a local linear approximation of the model's behaviour. While the model may be very complex globally, it is easier to approximate it in the vicinity of a particular instance. Treating the model as a black box, we perturb the instance we want to explain and fit a sparse linear model to the perturbed samples; that linear model is the explanation. This repository also contains a summary of the LIME explainer (lime_summary.pdf).
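The perturb-and-fit idea can be sketched with numpy alone. This is a simplified illustration, not the `lime` library's implementation: `black_box` is a hypothetical stand-in classifier, the proximity kernel and its width are assumptions, and sparsity is approximated by keeping only the `top_k` largest weights of a weighted ridge fit.

```python
import numpy as np

def black_box(texts):
    """Hypothetical stand-in classifier: returns [P(ham), P(spam)] per text."""
    spam_words = {"free", "subscribe", "click"}
    rows = []
    for text in texts:
        hits = sum(word in spam_words for word in text.split())
        p_spam = 1.0 - 0.5 ** hits  # more spam words -> higher spam probability
        rows.append([1.0 - p_spam, p_spam])
    return np.array(rows)

def explain(text, predict_proba, label=1, top_k=3,
            num_samples=500, kernel_width=0.75, alpha=1.0, seed=0):
    """Fit a local linear surrogate around `text`; return the top_k word weights."""
    rng = np.random.default_rng(seed)
    words = text.split()
    d = len(words)

    # Perturb: binary masks, 1 keeps a word and 0 removes it.
    masks = rng.integers(0, 2, size=(num_samples, d))
    masks[0] = 1  # the first sample is the unperturbed instance
    perturbed = [" ".join(w for w, keep in zip(words, m) if keep) for m in masks]

    # Query the black box on every perturbed text.
    y = predict_proba(perturbed)[:, label]

    # Weight samples by proximity: fewer removed words -> larger weight.
    distance = 1.0 - masks.mean(axis=1)
    weights = np.exp(-(distance ** 2) / kernel_width ** 2)

    # Weighted ridge regression with an unpenalized intercept, in closed form.
    X = np.hstack([np.ones((num_samples, 1)), masks])
    Xw = X * weights[:, None]
    penalty = alpha * np.eye(d + 1)
    penalty[0, 0] = 0.0
    coef = np.linalg.solve(X.T @ Xw + penalty, Xw.T @ y)

    # Keep only the largest-magnitude word weights as the "sparse" explanation.
    ranked = sorted(zip(words, coef[1:]), key=lambda pair: -abs(pair[1]))
    return ranked[:top_k]

for word, weight in explain("click here for free music", black_box):
    print(f"{word}: {weight:+.3f}")
```

Each returned weight estimates how much keeping that word raises the black box's predicted probability of the chosen class near this one instance. The weights say nothing about the model's behaviour on unrelated inputs.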

References

Alberto, T.C., Lochter, J.V., Almeida, T.A. TubeSpam: Comment Spam Filtering on YouTube. Proceedings of the 14th IEEE International Conference on Machine Learning and Applications (ICMLA'15), 1-6, Miami, FL, USA, December 2015.

Almeida, T.A., Silva, T.P., Santos, I., Gomez Hidalgo, J.M. Text Normalization and Semantic Indexing to Enhance Instant Messaging and SMS Spam Filtering. Knowledge-Based Systems, Elsevier, 108, 25-32, 2016.

Blitzer, J., Dredze, M., Pereira, F. Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification. Association for Computational Linguistics (ACL), 2007.

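Ribeiro, M.T., Singh, S., Guestrin, C. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. arXiv:1602.04938 [cs.LG], 2016.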
