A data-driven study into developer interactions on Stack Overflow. Who uses Stack Overflow, and what are users currently talking about? Included is our paper which discusses our findings.

JoryAnderson/EMSE_DevInt

EMSE_DevInt

By Cassandra, Nimmi, Yiming, and Jory.

This research was conducted as part of SENG 480A @ UVic (EMSE).

The included PDF presents the motivation, methodology, results, and conclusions of our work.

Dependencies

Install the following packages, which are required by the included Python modules and Jupyter notebooks:

pip install stackapi scikit-learn numpy nltk pandas seaborn wordcloud pyLDAvis

Alternatively, install everything from the requirements file:

pip install -r requirements.txt

(Rough) Procedural Overview

  1. Use StackAPI to grab Stack Overflow (SO) data.

    a. Grab the maximum number of questions and answers each day. Repeat over a couple of days.

    b. Collate the JSON files into a single data file.

    c. Remove duplicates.

    d. Format the result into an input file for LDA.

  2. Use LDA to process the data.

    • LDA does not label topics, so the topics must be labeled manually.

  3. Compute additional statistics on questions, answers, and users.
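
Steps 1b to 1d can be sketched roughly as follows. This is a minimal illustration and not the repository's own scripts: the function names, the choice to build documents from question titles only, and the one-document-per-line output layout are all assumptions. The only grounded detail is that Stack Exchange API responses wrap their results in an `items` list, with `question_id` and `answer_id` identifying each entry.

```python
import html
import json
import re
from pathlib import Path

# Assumed tokenizer: lowercase runs of letters, digits, '+' and '#' (keeps "c++", "c#").
TOKEN = re.compile(r"[a-z0-9+#]+")


def collate(json_dir):
    """Step 1b: merge the "items" of every *.json file in json_dir into one list."""
    items = []
    for path in sorted(Path(json_dir).glob("*.json")):
        with open(path, encoding="utf-8") as fh:
            payload = json.load(fh)
        items.extend(payload.get("items", []))
    return items


def item_key(item):
    """Identify an item by type and ID; answers also carry a question_id."""
    if "answer_id" in item:
        return ("answer", item["answer_id"])
    return ("question", item["question_id"])


def dedupe(items):
    """Step 1c: keep the first occurrence of each question or answer."""
    seen = set()
    unique = []
    for item in items:
        key = item_key(item)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


def to_document(item):
    """Turn an item's title into a space-separated, lowercase token string."""
    text = html.unescape(item.get("title", "")).lower()
    return " ".join(TOKEN.findall(text))


def write_lda_input(items, out_path):
    """Step 1d: write one document per line (assumed layout); return the line count."""
    written = 0
    with open(out_path, "w", encoding="utf-8") as fh:
        for item in items:
            doc = to_document(item)
            if doc:  # skip items without a usable title
                fh.write(doc + "\n")
                written += 1
    return written
```

The exact input layout depends on the LDA implementation used, so `write_lda_input` would need adjusting to match it.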

Usage

Ad-Hoc Python Scripts

Grabbing Data
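
As a rough sketch of a daily grab, the helper below fetches one batch from an endpoint and saves the raw response as a dated JSON file. The `grab` function, the file naming scheme, and the `max_pages` value in the usage comment are assumptions for illustration. `StackAPI("stackoverflow")`, `fetch`, `page_size`, and `max_pages` come from the StackAPI library.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def grab(site, endpoint, out_dir, now=None):
    """Fetch one batch from `endpoint` and save the raw response as JSON.

    `site` is expected to behave like a StackAPI instance: `site.fetch(endpoint)`
    returns a dict whose "items" key holds the results.
    """
    now = now or datetime.now(timezone.utc)
    response = site.fetch(endpoint)
    # Hypothetical naming scheme: <endpoint>_<YYYYMMDD>.json
    out_path = Path(out_dir) / f"{endpoint}_{now:%Y%m%d}.json"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(response, fh)
    return out_path, len(response.get("items", []))


# Example usage (requires network access and the stackapi package):
#
#   from stackapi import StackAPI
#   site = StackAPI("stackoverflow")
#   site.page_size = 100   # largest page size the API accepts
#   site.max_pages = 50    # assumed cap; tune it to your daily request quota
#   grab(site, "questions", "data")
#   grab(site, "answers", "data")
```

Running this once a day for a few days produces one file per endpoint per day, which can then be collated and deduplicated.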

Resources

StackAPI

JGibbLabeledLDA

Refactored JGibbLabeledLDA

Preprocess

LDA
