GitHub - sjwhitmore/tweet-collector

Tweet Collector

Initial stages -- currently just collects tweets on a given topic and stores them in a MongoDB database.

Next steps -- do some cool analysis.

Run by entering "node testnodetwitter.js" on the command line.

Remove words beginning with "@" (mentions) and URLs; delete "#" from hashtags.
Use an emoticon dictionary to link emoticons with various levels of sentiment (http://en.wikipedia.org/wiki/List_of_emoticons).
Use an abbreviation dictionary to replace words like "lol" and "gr8" with their written-out versions (http://noslang.com).
Filter out "stop words" (those commonly ignored by search engines) (http://www.webconfs.com/stop-words.php).
Replace repeated character sequences with 3 characters, e.g. "coooooool" becomes "coool", to standardize while still retaining emphasis.
Link negation words with the words that follow them, e.g. in "isn't good", "good" should be replaced by "NOT_good".
run the
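
The preprocessing steps listed above could be sketched in Python roughly as follows. This is only an illustration, not the code in tweetparser.py: the function name `preprocess`, the tiny dictionaries, and the `EMO_POS` / `EMO_NEG` labels are all hypothetical stand-ins for the real emoticon, slang, and stop-word lists linked above. It also assumes that the negation word itself is dropped once the following word has been marked.

```python
import re

# Hypothetical, tiny stand-ins for the dictionaries the README links to.
EMOTICONS = {":)": "EMO_POS", ":-)": "EMO_POS", ":(": "EMO_NEG", ":-(": "EMO_NEG"}
ABBREVIATIONS = {"lol": "laughing out loud", "gr8": "great"}
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "to", "of", "and"}
NEGATIONS = {"not", "no", "never", "isn't", "don't", "can't", "won't"}


def preprocess(tweet):
    """Turn a raw tweet into a list of normalized tokens."""
    tokens = []
    for raw in tweet.split():
        # Map emoticons to a sentiment label before punctuation is stripped.
        if raw in EMOTICONS:
            tokens.append(EMOTICONS[raw])
            continue
        lowered = raw.lower()
        # Drop mentions and URLs entirely.
        if lowered.startswith("@") or re.match(r"(https?://|www\.)", lowered):
            continue
        # Delete "#" from hashtags, keeping the word itself.
        word = lowered.lstrip("#")
        # Strip punctuation but keep apostrophes so "isn't" survives.
        word = re.sub(r"[^\w']", "", word)
        if not word:
            continue
        # Squeeze runs of 4 or more identical characters down to 3.
        word = re.sub(r"(.)\1{3,}", r"\1\1\1", word)
        # Expand abbreviations; an expansion may be several words.
        tokens.extend(ABBREVIATIONS.get(word, word).split())

    result = []
    negate = False
    for tok in tokens:
        if tok in NEGATIONS:
            # Assumption: drop the negation word and mark the next kept word.
            negate = True
            continue
        if tok in STOP_WORDS:
            continue
        if negate and not tok.startswith("EMO_"):
            result.append("NOT_" + tok)
            negate = False
        else:
            result.append(tok)
    return result


if __name__ == "__main__":
    print(preprocess("@bob this movie isn't good, coooooool #fail lol http://t.co/x :)"))
```

Emoticons are matched first because the later punctuation stripping would otherwise destroy them, and negation is handled in a second pass so that a stop word sitting between the negation and its target (as in "not the best") does not absorb the marker.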

.gitignore
README.md
index.js
package.json
router.js
server.js
slangdict.py
testnodetwitter.js
tweetparser.py
