andreaspung/interpreting-db

interpreting-db

This is the Python code used in my Bachelor's Thesis "Interpreting a Convolutional Text Classification Neural Network on a Clinical Dataset". The main Jupyter notebook is based on Ben Trevett's PyTorch sentiment analysis tutorial. The base code for the interpretation methods was provided by my supervisor. The analysis uses the DementiaBank dataset, which cannot be shared publicly.

Abstract

In this Bachelor’s Thesis, a convolutional text classification neural network is interpreted to find out why it makes the predictions it does. The analysis uses the clinical DementiaBank dataset, in which people with Alzheimer’s disease describe the Boston cookie theft image. The binary classification task was to identify, based on the given text, whether a person has Alzheimer’s or not. The interpretation methods described in Jacovi et al. (2018) were implemented. In addition, concrete example texts are interpreted in this thesis. Of all the analyses performed, informative and uninformative ngrams and slot activation vectors with their clustering yielded good results. The results of the negative ngrams analysis were substandard because of the specificity of the dataset.
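To give a rough idea of what the ngram and slot activation analyses mentioned in the abstract look at, here is a minimal pure-Python sketch. It is not the notebook's code: the function names, the toy 2-dimensional embeddings and the filter weights are all made up for illustration, and details such as the non-linearity after the convolution are left out. For one convolutional filter, it scores every ngram in a sentence, splits each score into per-word ("slot") contributions, and returns the ngram that would survive max-pooling.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def slot_activations(ngram_vectors: List[Vector], filter_slots: List[Vector]) -> List[float]:
    """Contribution of each word position (slot) to the filter's ngram score."""
    return [dot(w, e) for w, e in zip(filter_slots, ngram_vectors)]


def max_pooled_ngram(
    tokens: List[str],
    embeddings: Dict[str, Vector],
    filter_slots: List[Vector],
    bias: float = 0.0,
) -> Tuple[List[str], float, List[float]]:
    """Return (ngram, score, slot activations) for the highest-scoring ngram."""
    n = len(filter_slots)  # filter width = ngram length
    if len(tokens) < n:
        raise ValueError("sentence is shorter than the filter width")
    best = None
    for start in range(len(tokens) - n + 1):
        ngram = tokens[start:start + n]
        slots = slot_activations([embeddings[t] for t in ngram], filter_slots)
        score = sum(slots) + bias
        if best is None or score > best[1]:
            best = (ngram, score, slots)
    return best


# Hypothetical toy data: 2-dimensional embeddings and one bigram filter.
toy_embeddings = {
    "the": [0.0, 0.1],
    "boy": [1.0, 0.0],
    "takes": [0.2, 0.9],
    "a": [0.0, 0.1],
    "cookie": [0.9, 0.8],
}
toy_filter = [[0.0, 1.0], [1.0, 0.5]]  # one weight vector per slot
toy_sentence = ["the", "boy", "takes", "a", "cookie"]

if __name__ == "__main__":
    ngram, score, slots = max_pooled_ngram(toy_sentence, toy_embeddings, toy_filter)
    print("max-pooled ngram:", " ".join(ngram))
    print("score:", round(score, 3))
    print("slot activations:", [round(s, 3) for s in slots])
```

In this toy setup, an ngram whose score passes a per-filter threshold would count as informative, and the slot activation vectors of many such ngrams are what would be clustered.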

Running the notebook

Open an Anaconda Prompt as administrator in this project's root directory and run the following commands (env.txt is a 64-bit Windows environment file):

conda create --name interpreting-db --file env.txt

activate interpreting-db

python -m spacy download en

jupyter notebook

From the Jupyter Notebook user interface, click the interpreting-db.ipynb file to open it.