By Cassandra, Nimmi, Yiming, and Jory.
This research was conducted as part of SENG 480A @ UVic (EMSE).
The included PDF presents the motivation, methodology, results, and conclusions of our work.
Install the following packages, which the included Python modules and Jupyter notebooks require:
pip install stackapi scikit-learn numpy nltk pandas seaborn wordcloud pyLDAvis
Alternatively, install everything from the requirements file:
pip install -r requirements.txt
-
Use StackAPI to collect Stack Overflow (SO) data.
a. Fetch the maximum number of questions and answers allowed each day. Repeat over a couple of days.
b. Collate the JSON files into a single data file.
c. Remove duplicates.
d. Format the result into an input file for LDA.
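The collection steps above can be sketched roughly as follows. This is a minimal illustration, not the code shipped in this repository: the helper names (fetch_questions, collate, remove_duplicates, write_lda_input) are hypothetical, each daily dump is assumed to be a JSON list of question items, and the question title is assumed to be the text fed to LDA.

```python
import json


def fetch_questions(max_pages=5):
    """Step a: fetch one batch of Stack Overflow questions (needs network access)."""
    # Imported lazily so the offline helpers below work without StackAPI installed.
    from stackapi import StackAPI

    site = StackAPI("stackoverflow")
    site.page_size = 100        # largest page size the Stack Exchange API allows
    site.max_pages = max_pages  # assumption: tune this to stay within the daily quota
    return site.fetch("questions")["items"]


def collate(json_paths):
    """Step b: merge several daily JSON dumps into one list of items."""
    items = []
    for path in json_paths:
        with open(path, encoding="utf-8") as handle:
            items.extend(json.load(handle))
    return items


def remove_duplicates(items, key="question_id"):
    """Step c: keep the first occurrence of each item, identified by `key`."""
    seen = set()
    unique = []
    for item in items:
        if item[key] not in seen:
            seen.add(item[key])
            unique.append(item)
    return unique


def write_lda_input(items, out_path):
    """Step d: write one document per line (assumed here to be the title)."""
    with open(out_path, "w", encoding="utf-8") as handle:
        for item in items:
            handle.write(item["title"].replace("\n", " ") + "\n")
```

Deduplicating on question_id matters because batches fetched on consecutive days can overlap. For answers, the same helper should work with key="answer_id".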
-
Use LDA (Latent Dirichlet Allocation) to extract topics from the data.
- LDA does not label topics; labels must be assigned manually.
-
Compute additional statistics on questions, answers, and users.