
Supervised classification of textual reviews, based on their sentiment, into one of five polarities ranging from strong negative to strong positive.


akchi/Sentiment_Analysis


Sentiment Analysis

Supervised classification of textual reviews, based on their sentiment, into one of five polarities:

  1. Strong negative
  2. Weak negative
  3. Neutral
  4. Weak positive
  5. Strong positive

Methodology

  1. Text pre-processing: The raw reviews were converted into a form suitable for feature extraction. The following steps were applied:
    • Case normalisation
    • Tokenisation
    • Lemmatisation
  2. Feature generation: Once the data was cleaned, features were extracted from it, such as:
    • N-grams
    • Term frequency and inverse document frequency (TF-IDF)
  3. Model: Logistic regression is the classifier used to determine the polarity of a review.
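The three steps above can be sketched end to end with the Python standard library alone. This is a minimal illustration under stated assumptions, not the repository's actual code: the function names, the toy `LEMMAS` lookup table (a stand-in for a real lemmatiser), the unsmoothed `log(N / df)` IDF formula, and the plain stochastic-gradient training loop are all hypothetical choices made for this example.

```python
import math
import re
from collections import Counter

# Toy lemma table: an assumed stand-in for a real lemmatiser.
LEMMAS = {"was": "be", "were": "be", "is": "be", "batteries": "battery"}

def preprocess(text):
    """Case-normalise, tokenise, and lemmatise one review."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [LEMMAS.get(tok, tok) for tok in tokens]

def ngrams(tokens, n_max=2):
    """Return all n-grams of length 1..n_max as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def tfidf(docs):
    """Map each document (a list of terms) to a {term: tf-idf weight} dict.
    Assumes idf = log(N / df), so a term found in every document weighs 0."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        counts = Counter(doc)
        weighted.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weighted

def predict_proba(weights, bias, x):
    """Softmax over the per-class linear scores of one sparse feature dict."""
    scores = [b + sum(w.get(t, 0.0) * v for t, v in x.items())
              for w, b in zip(weights, bias)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_softmax(features, labels, n_classes=5, epochs=200, lr=0.5):
    """Fit multinomial logistic regression by stochastic gradient descent.
    features: list of {term: weight}; labels: integers in 1..n_classes."""
    weights = [dict() for _ in range(n_classes)]  # sparse weights per class
    bias = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(features, labels):
            probs = predict_proba(weights, bias, x)
            for k in range(n_classes):
                # Gradient of the cross-entropy loss w.r.t. the class-k score.
                grad = probs[k] - (1.0 if k == y - 1 else 0.0)
                bias[k] -= lr * grad
                for term, value in x.items():
                    weights[k][term] = weights[k].get(term, 0.0) - lr * grad * value
    return weights, bias

def predict(weights, bias, x):
    """Return the most probable label in 1..n_classes."""
    probs = predict_proba(weights, bias, x)
    return probs.index(max(probs)) + 1
```

A typical call chain would be `tfidf([ngrams(preprocess(r)) for r in reviews])` followed by `train_softmax(features, labels)`. For a corpus of the size described here, library implementations such as scikit-learn's `TfidfVectorizer` and `LogisticRegression` would likely be a more practical choice than this hand-rolled loop.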

Datasets:

  1. train_data.csv:

    The training set consists of 650,000 product reviews.

  2. train_label.csv:

    This dataset contains the sentiment labels of the training dataset. The labels (1, 2, 3, 4, 5) refer to the five polarity levels (strong negative, weak negative, neutral, weak positive, strong positive) respectively.

  3. test_data.csv:

    The test set consists of 50,000 product reviews.

  4. predicted_label.csv:

    This dataset contains the predicted sentiment labels of the test data.
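As a rough sketch of how these files might be read and written, the snippet below uses Python's `csv` module. The file layout is an assumption, since the README does not specify it: one review or one label per row, in the first column, with no header row. The helper names `read_column`, `load_training`, and `write_predictions` are hypothetical.

```python
import csv

def read_column(path, column=0):
    """Read one column of a CSV file into a list of strings.
    Assumes no header row; blank rows are skipped."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [row[column] for row in csv.reader(handle) if row]

def load_training(data_path, label_path):
    """Pair each training review with its integer label (1..5)."""
    reviews = read_column(data_path)
    labels = [int(value) for value in read_column(label_path)]
    if len(reviews) != len(labels):
        raise ValueError("review and label counts differ")
    if any(label not in range(1, 6) for label in labels):
        raise ValueError("labels must be in 1..5")
    return list(zip(reviews, labels))

def write_predictions(path, labels):
    """Write one predicted label per row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows([label] for label in labels)
```

Under these assumptions, `load_training("train_data.csv", "train_label.csv")` would return a list of `(review, label)` pairs, and `write_predictions("predicted_label.csv", labels)` would produce the output file. If the real files carry a header row or extra columns, the reader would need adjusting.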
