A simple sandbox project

What you need to know

The VADER algorithm is a rule-based algorithm that is both fast and lightweight (it needs no training data). Despite its simplicity it is pretty effective and has been tuned to work well on social media text: its lexicon was rated by human annotators through Amazon Mechanical Turk.

I am purposely dropping all tweets that are hard to geolocate, but I had to make some compromises to get enough data. I am working on the UK and keeping tweets that at least indicate the city they were sent from. When a tweet doesn't have a precise location, I use the bounding box of the city to create a 'fake' location, which simulates a more realistic distribution of the data. I am also getting rid of tweets that the algorithm considers "neutral", as they are of little interest to me in this exercise.
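To make the two filtering ideas above concrete, here is a minimal sketch, not the project's actual code. It assumes the scores come as a dict with a `compound` key (the shape VADER's `polarity_scores` returns), that the neutral band is the conventional ±0.05 around zero, and that a bounding box is a list of `[lon, lat]` corners. The names `is_neutral`, `fake_location` and `NEUTRAL_THRESHOLD` are made up for illustration.

```python
import random

# Conventional VADER cut-off for the compound score.
# Assumption: the project may well use a different value.
NEUTRAL_THRESHOLD = 0.05


def is_neutral(scores, threshold=NEUTRAL_THRESHOLD):
    """Return True when the compound score falls inside the neutral band."""
    return -threshold < scores["compound"] < threshold


def fake_location(bounding_box, rng=random):
    """Pick a uniformly random (lon, lat) point inside a city bounding box.

    bounding_box is a list of [lon, lat] corners, in any order.
    """
    lons = [corner[0] for corner in bounding_box]
    lats = [corner[1] for corner in bounding_box]
    lon = rng.uniform(min(lons), max(lons))
    lat = rng.uniform(min(lats), max(lats))
    return lon, lat
```

Drawing the point uniformly inside the box is the simplest choice. It spreads markers over the city instead of stacking them on one centre point, at the cost of ignoring where people actually live.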

That is it! Have a look, play around, try to find some weird trends or just peek into what people have to say and where (hint: it was snowing in London when I started capturing the data).

About the stack

One Python backend script streams (Tweepy) and analyses (VaderSentiment) tweets from the Twitter API. They are then stored in a MongoDB database (PyMongo and MongoDB Atlas). This provides enough data for further analysis, data viz and hopefully some machine learning practice.
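The storage step could look roughly like the sketch below. The record layout (`tweet_id`, `text`, `compound`, a GeoJSON-style `location`, `stored_at`) is an assumption for illustration and not the schema the project actually uses. `collection` stands for a PyMongo collection, or anything else that exposes `insert_one()`.

```python
from datetime import datetime, timezone


def build_document(tweet, scores, location):
    """Flatten a tweet, its VADER scores and a (lon, lat) pair into one record."""
    lon, lat = location
    return {
        "tweet_id": tweet["id"],
        "text": tweet["text"],
        "compound": scores["compound"],
        # GeoJSON order is longitude first, then latitude.
        "location": {"type": "Point", "coordinates": [lon, lat]},
        "stored_at": datetime.now(timezone.utc),
    }


def store_tweet(collection, tweet, scores, location):
    """Insert the record; `collection` only needs an insert_one() method."""
    document = build_document(tweet, scores, location)
    collection.insert_one(document)
    return document
```

Passing the collection in as an argument keeps the function easy to try out with a stand-in object before pointing it at a real database.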

A Flask app deployed on Heroku computes the data and forwards it to Leaflet, Plotly.js or any other nice lib in the future.
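One plausible way to hand the data to Leaflet is as GeoJSON, which Leaflet can draw directly. The sketch below assumes each stored record is a dict with `text`, `compound` and a GeoJSON `Point` under `location`; that shape and the function names are hypothetical and not taken from the project.

```python
import json


def to_feature_collection(documents):
    """Wrap stored tweet records into a GeoJSON FeatureCollection."""
    features = []
    for doc in documents:
        features.append({
            "type": "Feature",
            "geometry": doc["location"],
            "properties": {"text": doc["text"], "compound": doc["compound"]},
        })
    return {"type": "FeatureCollection", "features": features}


def to_geojson_string(documents):
    """Serialise the collection so a web route can return it as a response body."""
    return json.dumps(to_feature_collection(documents))
```

A Flask route could return this string with a JSON content type. Keeping the sentiment score in `properties` lets the front end colour each marker by score.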

About the project

The project is inspired by a coding challenge from master Siraj Raval (@sirajology) in his YouTube data science and machine learning series. It has turned into a fun project, so I am trying to push it a bit further.

DavidLacroix/sentimentmap
