Analysis and detection of short URL spam on Twitter. The classifier achieved an accuracy of 89.23% on 100,000 tweets.
- Collecting 100,000 tweets containing bit.ly short URLs using the Twitter API.
- Gathering metadata about each short URL using the Bitly API.
- Storage of all information in MongoDB.
- Analysis of the information to discover significant patterns.
- Classification of short URLs using [Weka](http://www.cs.waikato.ac.nz/ml/weka/).
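
The steps above can be illustrated with a minimal Python sketch. It is not the project's actual code: it only shows how bit.ly links might be pulled out of tweet text and how per-URL features might be written in ARFF, the text format Weka reads. The regex, the function names, and the feature names (`clicks`, `label`) are assumptions made for illustration.

```python
import re

# Assumed pattern for bit.ly short links, e.g. http://bit.ly/abc123.
BITLY_RE = re.compile(r"https?://bit\.ly/[A-Za-z0-9_-]+")


def extract_bitly_urls(tweet_text):
    """Return every bit.ly short URL found in a tweet's text."""
    return BITLY_RE.findall(tweet_text)


def to_arff(relation, attributes, rows):
    """Serialise feature rows as ARFF text for Weka.

    attributes: list of (name, kind) pairs, where kind is the string
    'numeric' or a list of nominal values such as ['spam', 'ham'].
    rows: iterable of value tuples, in the same order as attributes.
    """
    lines = ["@relation " + relation, ""]
    for name, kind in attributes:
        if isinstance(kind, (list, tuple)):
            kind = "{" + ",".join(kind) + "}"
        lines.append("@attribute {} {}".format(name, kind))
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example data, not taken from the real dataset.
    print(extract_bitly_urls("Win a free phone http://bit.ly/abc123 now!"))
    attributes = [("clicks", "numeric"), ("label", ["spam", "ham"])]
    print(to_arff("short_urls", attributes, [(10, "spam"), (250, "ham")]))
```

In a full pipeline, the rows passed to `to_arff` would come from the records stored in MongoDB, and the resulting file would be loaded into Weka for training and evaluation.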