VibhaBelavadi/sentiment-analysis-using-nltk

Sentiment Analysis using NLTK:

Problem Statement:

  1. Perform sentiment analysis by applying Maximum Entropy Classification to movie review data.
  2. Observe the effect on accuracy of the discriminating features (stop words, punctuation, lemmatization) and of the amount of training data fed.
  3. Perform analysis on an unbalanced collection by changing the proportions of positive and negative samples in the training data.
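
As a rough illustration of the first goal, here is a minimal sketch of training a maximum entropy classifier with NLTK. It is not the repository's code: the toy documents, the `bag_of_words` helper, the choice of the GIS algorithm, and `max_iter=10` are all assumptions made for the example. A real run would presumably use a movie review corpus instead of the hand-written lists.

```python
from nltk.classify import MaxentClassifier
from nltk.classify.util import accuracy


def bag_of_words(tokens):
    """Hypothetical helper: mark each token as a present boolean feature."""
    return {token: True for token in tokens}


# Toy labelled documents (assumption), standing in for movie reviews.
train_set = [
    (bag_of_words(["great", "moving", "film"]), "pos"),
    (bag_of_words(["wonderful", "acting"]), "pos"),
    (bag_of_words(["boring", "plot"]), "neg"),
    (bag_of_words(["terrible", "dull", "film"]), "neg"),
]
test_set = [
    (bag_of_words(["great", "acting"]), "pos"),
    (bag_of_words(["dull", "plot"]), "neg"),
]

# trace=0 silences the per-iteration log; max_iter caps training iterations.
classifier = MaxentClassifier.train(
    train_set, algorithm="gis", trace=0, max_iter=10
)

acc = accuracy(classifier, test_set)
prediction = classifier.classify(bag_of_words(["great", "acting"]))
print("accuracy:", acc)
print("prediction:", prediction)
```

The same `train_set` structure (a list of `(feature_dict, label)` pairs) can be rebuilt for each preprocessing variant, so accuracies are comparable across case studies.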

Case Studies:

The following case studies were proposed:

Case Study I:
Maximum entropy classification on the data a) raw, b) with stop words, c) without punctuation, and d) with lemmatization, using all words and assuming equal proportions of positive and negative examples.
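
The four data variants could be produced by a single preprocessing function with switches, sketched below. This is an assumption about how the variants might be built, not the repository's implementation. In particular, the source does not say whether variant b keeps or filters stop words, so the `drop_stop_words` flag covers both readings. `STOP_WORDS` is a tiny illustrative list, and `toy_lemmatize` is a hypothetical stand-in for a real lemmatizer such as NLTK's `WordNetLemmatizer().lemmatize`.

```python
import string

# Tiny illustrative stop list (assumption); a full list would be used in practice.
STOP_WORDS = frozenset({"a", "an", "the", "is", "was", "of", "and", "to", "it"})


def preprocess(tokens, drop_stop_words=False, drop_punctuation=False, lemmatize=None):
    """Return a filtered copy of tokens; with all switches off, tokens are unchanged."""
    result = []
    for token in tokens:
        # Skip tokens made up entirely of punctuation characters.
        if drop_punctuation and token and all(ch in string.punctuation for ch in token):
            continue
        # Stop word matching is case-insensitive.
        if drop_stop_words and token.lower() in STOP_WORDS:
            continue
        if lemmatize is not None:
            token = lemmatize(token)
        result.append(token)
    return result


def toy_lemmatize(token):
    """Hypothetical stand-in for a real lemmatizer."""
    return {"actors": "actor", "bored": "bore"}.get(token, token)


tokens = ["The", "plot", "was", "dull", ",", "the", "actors", "bored", "."]

raw = preprocess(tokens)
no_punct = preprocess(tokens, drop_punctuation=True)
no_stop = preprocess(tokens, drop_stop_words=True)
lemmas = preprocess(tokens, lemmatize=toy_lemmatize)
```

Keeping the switches independent makes it possible to measure each feature's effect on accuracy in isolation.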

Case Study II:
Maximum entropy classification on the data a) raw, b) with stop words, c) without punctuation, and d) with lemmatization, using the top 500 words and assuming equal proportions of positive and negative examples.

Case Study III:
Maximum entropy classification on the data a) raw, b) with stop words, c) without punctuation, and d) with lemmatization, using the top 1000 words and assuming equal proportions of positive and negative examples.
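
Case Studies II and III restrict the features to the top 500 and top 1000 words. The sketch below assumes "top" means most frequent across the labelled documents, which the source does not state. The function names `top_n_words` and `restricted_features` are hypothetical, and the toy documents only illustrate the mechanics.

```python
from collections import Counter


def top_n_words(documents, n):
    """Return the set of the n most frequent lowercase words over all documents."""
    counts = Counter(
        word.lower() for tokens, _label in documents for word in tokens
    )
    return {word for word, _count in counts.most_common(n)}


def restricted_features(tokens, vocabulary):
    """Build a boolean feature dict using only words found in vocabulary."""
    return {
        word: True
        for word in (token.lower() for token in tokens)
        if word in vocabulary
    }


# Toy labelled documents (assumption): "film" occurs 3 times, "good" twice.
documents = [
    (["good", "film"], "pos"),
    (["good", "fun", "film"], "pos"),
    (["dull", "film"], "neg"),
]

vocabulary = top_n_words(documents, 2)
features = restricted_features(["A", "good", "film"], vocabulary)
```

Calling `top_n_words(documents, 500)` or `top_n_words(documents, 1000)` on the full training set would give the vocabularies for the two case studies under this assumption.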

Case Study IV:
Maximum entropy classification on the data a) raw, b) with stop words, c) without punctuation, and d) with lemmatization, using all words and assuming unequal proportions of negative and positive examples.
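
One way to build a training set with unequal class proportions is to sample each class separately, as sketched below. The function name `sample_with_ratio`, its parameters, and the fixed seed are assumptions for illustration. The source does not give the proportions that were actually used.

```python
import random


def sample_with_ratio(documents, n_total, pos_fraction, seed=0):
    """Draw n_total documents so that about pos_fraction of them are labelled 'pos'."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    positives = [doc for doc in documents if doc[1] == "pos"]
    negatives = [doc for doc in documents if doc[1] == "neg"]

    n_pos = round(n_total * pos_fraction)
    n_neg = n_total - n_pos
    if n_pos > len(positives) or n_neg > len(negatives):
        raise ValueError("not enough documents of one class for this ratio")

    sample = rng.sample(positives, n_pos) + rng.sample(negatives, n_neg)
    rng.shuffle(sample)  # avoid all positives preceding all negatives
    return sample


# Toy corpus (assumption): 10 positive and 10 negative placeholder documents.
documents = [([f"pos_word_{i}"], "pos") for i in range(10)]
documents += [([f"neg_word_{i}"], "neg") for i in range(10)]

unbalanced = sample_with_ratio(documents, n_total=10, pos_fraction=0.3)
only_negative = sample_with_ratio(documents, n_total=10, pos_fraction=0.0)
```

Setting `pos_fraction=0.0` yields a training set with only negative examples, which corresponds to the setup described for Case Study V.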

Case Study V:
Maximum entropy classification on the data a) raw, b) with stop words, c) without punctuation, and d) with lemmatization, using all words and assuming only negative examples.
