Sentiment analysis on IMDB movie reviews
The training set contains [pos/, neg/] directories for the reviews with binary labels positive and negative. Within these directories, reviews are stored in text files named following the convention [[id]_[rating].txt] where [id] is a unique id and [rating] is the star rating for that review on a 1-10 scale. For example, the file [train/pos/200_8.txt] is the text for a positive-labeled train set example with unique id 200 and star rating 8/10 from IMDb.