This tutorial for Constellate uses Latent Dirichlet Allocation (LDA) and Non-Negative Matrix Factorization (NMF) from the Gensim library, which can be imported directly into Constellate. It focuses on comparing and contrasting the two topic modeling algorithms and analyzing them qualitatively.
This tutorial is suitable both for independent learners interested in text analytics and digital humanities and for use in a classroom or workshop with an instructor. Learners should have a basic understanding of Python and Jupyter notebooks, some knowledge of basic text analytics, and an interest in learning more. If you are not familiar with Python or with text analytics techniques such as tokenizing text and using regular expressions, Constellate has a wide variety of tutorials available here: Constellate Tutorials
Note that some code has been adapted from Nathan Kelber and Ted Lawless's topic modeling tutorial: Latent Dirichlet Allocation (LDA) Topic Modeling
If you are logged in to Constellate, you can run this tutorial by clicking the "Launch in Constellate" button: