This repository contains the code and write-ups for a project that measures opioid-related stigma using natural language processing. It was submitted as my capstone project for the May 2018 term of Udacity's Machine Learning Engineer Nanodegree.
In 2016, 42,249 Americans died as a result of an opioid overdose. To put that in perspective, that is more than the number of deaths in 2016 caused by firearms (38,658) or motor vehicle crashes (38,748). The epidemic has struck small towns and big cities; families struggling to make ends meet and families that seem like they're living the American dream. Certain parts of the country have been hit especially hard, including my native Kentucky.
The most effective known method for reducing overdose deaths is providing medication for addiction treatment (MAT), which clinical studies have shown to cut overdose deaths by 70 percent. However, among policymakers and the general public, addiction is still widely stigmatized as a moral failing rather than a medical condition. This stigma has fueled resistance to efforts to expand access to MAT and discouraged individuals struggling with addiction from seeking treatment.
Despite its importance, very little data is currently available on opioid-related stigma. To fill this gap, this project uses a novel dataset of 149,000 opioid-related conversations scraped from Twitter between March and September 2018. To create labeled examples, 1,000 randomly selected tweets are manually coded for stigmatizing language. These examples are then used to train a Naïve Bayes classifier that automatically codes the remaining tweets to produce the first measure of how opioid-related stigma varies across the United States.
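As a rough illustration of that pipeline, the sketch below trains a bag-of-words Naïve Bayes classifier on the hand-coded tweets and applies it to the unlabeled remainder. The file and column names (`labeled_tweets.csv`, `unlabeled_tweets.csv`, `text`, `stigma`) are hypothetical placeholders; the project's actual implementation is in `code/opioid-stigma-nlp.ipynb`.

```python
# Minimal sketch of the labeling pipeline described above. File and column
# names are hypothetical; see code/opioid-stigma-nlp.ipynb for the real code.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report

# Load the 1,000 hand-coded tweets (stigma: 1 = stigmatizing, 0 = not).
labeled = pd.read_csv("labeled_tweets.csv")
X_train, X_test, y_train, y_test = train_test_split(
    labeled["text"], labeled["stigma"], test_size=0.2, random_state=42
)

# Bag-of-words features feed a multinomial Naïve Bayes classifier.
vectorizer = CountVectorizer(stop_words="english")
clf = MultinomialNB()
clf.fit(vectorizer.fit_transform(X_train), y_train)

# Check held-out performance before trusting the automated codes.
print(classification_report(y_test, clf.predict(vectorizer.transform(X_test))))

# Apply the trained model to the remaining ~148,000 unlabeled tweets.
unlabeled = pd.read_csv("unlabeled_tweets.csv")
unlabeled["stigma_pred"] = clf.predict(vectorizer.transform(unlabeled["text"]))
```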
The full project report is available here.
- All libraries needed for replication are listed in `environment/environment.yml`.
- All data needed for replication (145 MB) can be downloaded from this Dropbox link.
- All code needed for replication is contained in `code/opioid-stigma-nlp.ipynb`.
- Figures included in the report are in `figures/`.