reddit-analytics analyzes popular posts from selected subreddits every 12 hours to produce interesting data visualizations. Currently, visualizations are limited to the past 7 days.
Technologies used for this project are:
- AngularJS
- D3.js
- Node.js/Express for API & query caching
- Python cron script for reddit scraping
- Rosette API for post-to-metadata conversion
- SQLite database for storage
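As a rough illustration of how the SQLite storage and the 7-day limit could fit together, here is a minimal sketch. The `posts` table, its columns, and the `recent_posts` helper are assumptions made for this example and are not taken from the project's actual schema.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def recent_posts(conn, now, days=7):
    """Return titles of posts created within the last `days` days, newest first."""
    cutoff = (now - timedelta(days=days)).timestamp()
    rows = conn.execute(
        "SELECT title FROM posts WHERE created_utc >= ? ORDER BY created_utc DESC",
        (cutoff,),
    )
    return [title for (title,) in rows]


# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (title TEXT, subreddit TEXT, created_utc REAL)")
now = datetime(2017, 1, 15, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [
        ("fresh post", "science", (now - timedelta(days=1)).timestamp()),
        ("stale post", "science", (now - timedelta(days=10)).timestamp()),
    ],
)

print(recent_posts(conn, now))
```

Passing `now` explicitly keeps the window easy to test; a real script would likely pass the current UTC time.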
Features:
- Creates a word cloud from the most commonly used "entities". The color of each word represents its sentiment. Clicking a word shows its associated metadata and the main reddit post.
- Categorizes the associated reddit posts and represents them as a pie chart. Clicking a slice displays the list of associated reddit posts.
- Shows the number of positive/neutral/negative posts for any subreddit over a period of one week.
- Shows the dominance of any particular topic in a subreddit.
- Categorizes the metadata of each subreddit into various buckets to see whether they show any distinguishable pattern.
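The aggregation behind the word cloud and the weekly sentiment counts can be sketched roughly as follows. This is an assumption about the general approach, not the project's actual code: the input shapes (`(entity, sentiment)` and `(date, sentiment)` pairs) and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def summarize_entities(mentions, top_n=50):
    """Return [(entity, count, dominant_sentiment)] for the most frequent entities.

    `mentions` is an iterable of (entity, sentiment) pairs (hypothetical shape).
    """
    counts = Counter()
    sentiments = defaultdict(Counter)
    for entity, sentiment in mentions:
        key = entity.strip().lower()  # treat "NASA" and "nasa" as the same word
        counts[key] += 1
        sentiments[key][sentiment] += 1
    return [
        (entity, count, sentiments[entity].most_common(1)[0][0])
        for entity, count in counts.most_common(top_n)
    ]


def sentiment_by_day(posts):
    """Count sentiments per day from an iterable of (iso_date, sentiment) pairs."""
    days = defaultdict(Counter)
    for day, sentiment in posts:
        days[day][sentiment] += 1
    return {day: dict(counter) for day, counter in days.items()}
```

In a sketch like this, the count would drive the word size in the cloud and the dominant sentiment would drive its color.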
Feel free to contact me if you would like the whole database with converted metadata. (mail@ankitgyawali.com)
Possible expansions for this project include viewing metadata by date and a custom subreddit analysis option, so that it could be used by mods of various subreddits.
Basic steps to run:
- Clone the repository. Ensure you have Python and Node.js installed on your system.
- Copy `sample-config.ini` to `config.ini` inside the server folder and configure it with your Rosette API key and reddit script secret.
- Install server requirements by running `pip install -r requirements.txt` from the server folder.
- Install node dependencies by running `npm install` from both the `server` and `client` folders.
- Initialize the database by running `init.py` in the server folder: `python init.py`.
- Run the server by running `node server.js` or `npm run start` from the server folder. Visualizations for the day should now be accessible on `localhost:3002`.
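Before initializing the database, it may help to confirm that the copied config file has no blank values left over from the sample. The sketch below is a generic check using Python's standard `configparser`. The section and key names in `sample` are hypothetical placeholders, not the names used by the real `sample-config.ini`.

```python
import configparser


def find_empty_settings(ini_text):
    """Return (section, key) pairs whose value is blank in the given INI text."""
    # interpolation=None so that secrets containing '%' are read literally
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return [
        (section, key)
        for section in parser.sections()
        for key, value in parser.items(section)
        if not value.strip()
    ]


# Hypothetical config contents, for illustration only.
sample = """
[rosette]
api_key = abc123

[reddit]
client_secret =
"""

print(find_empty_settings(sample))
```

To check a real file, read its contents (for example from `config.ini` in the server folder) and pass the text to `find_empty_settings`.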
Report all issues related to reddit-analytics on this separate issue page.