
This project involves performing sentiment analysis on customer book reviews. The steps include data preprocessing, vectorization using TF-IDF, and modeling using logistic regression, random forest, and XGBoost classifiers.


paigecaskey/SentimentAnalysis

Latest commit: 9b32158 · Jul 31, 2024


Book Review Sentiment Analysis

This machine learning project performs sentiment analysis on customer book reviews. The pipeline covers data preprocessing, TF-IDF vectorization, and classification with logistic regression, random forest, and XGBoost models. Of the three, logistic regression performed best, with an AUC score of 0.878 and an accuracy of 78.4%.

Features

  • Preprocess customer reviews by removing duplicates, converting text to lowercase, removing punctuation, and applying lemmatization and stemming
  • Vectorize textual data using TF-IDF
  • Train and evaluate logistic regression, random forest, and XGBoost models
  • Perform cross-validation and hyperparameter tuning using GridSearchCV
  • Display model performance metrics including AUC score and accuracy
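The preprocessing steps listed above could look roughly like the following. This is a minimal sketch, not the project's actual code: the function names `clean_review` and `preprocess` are hypothetical, and it applies only NLTK's `PorterStemmer` for the normalization step. Lemmatization (for example with NLTK's `WordNetLemmatizer`) is left out here because it needs the WordNet corpus to be downloaded first.

```python
import string

from nltk.stem import PorterStemmer

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)
_STEMMER = PorterStemmer()


def clean_review(text):
    """Lowercase a review, strip punctuation, and stem each token."""
    lowered = text.lower().translate(_PUNCT_TABLE)
    return " ".join(_STEMMER.stem(token) for token in lowered.split())


def preprocess(reviews):
    """Drop exact duplicate reviews (keeping first occurrences), then clean the rest."""
    seen = set()
    cleaned = []
    for review in reviews:
        if review in seen:
            continue
        seen.add(review)
        cleaned.append(clean_review(review))
    return cleaned


# Made-up example reviews, for illustration only.
sample = ["Loved it!", "Loved it!", "Terrible, boring plot."]
print(preprocess(sample))
```

Note that stemming can produce tokens that are not dictionary words, which is usually acceptable when the output only feeds a TF-IDF vectorizer.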

Technologies Used

  • Python
  • Scikit-learn
  • XGBoost
  • NLTK
  • Pandas
  • Jupyter Notebook
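As a rough illustration of how scikit-learn from the list above might cover the TF-IDF, tuning, and evaluation steps, the sketch below chains `TfidfVectorizer` and `LogisticRegression` in a `Pipeline`, tunes the regularization strength `C` with `GridSearchCV`, and reports AUC and accuracy on a held-out split. Everything specific here is an assumption rather than the project's setup: the toy reviews are made up, the function name `tune_and_evaluate` is hypothetical, and the parameter grid, split size, and fold count are arbitrary. Random forest and XGBoost are omitted for brevity.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Made-up toy data for illustration only; this is not the project's dataset.
POSITIVE = ["wonderful", "great", "loved", "brilliant", "enjoyable"]
NEGATIVE = ["boring", "terrible", "hated", "dull", "awful"]
texts = [f"a {w} book, review {i}" for i in range(4) for w in POSITIVE + NEGATIVE]
labels = [1 if w in POSITIVE else 0 for i in range(4) for w in POSITIVE + NEGATIVE]


def tune_and_evaluate(texts, labels):
    """Grid-search a TF-IDF + logistic regression pipeline and score it on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, stratify=labels, random_state=0
    )
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    # Assumed grid: only the inverse regularization strength C is tuned.
    param_grid = {"clf__C": [0.1, 1.0, 10.0]}
    search = GridSearchCV(pipeline, param_grid, cv=3, scoring="roc_auc")
    search.fit(X_train, y_train)

    # AUC needs the predicted probability of the positive class.
    positive_proba = search.predict_proba(X_test)[:, 1]
    predictions = search.predict(X_test)
    return {
        "best_params": search.best_params_,
        "auc": roc_auc_score(y_test, positive_proba),
        "accuracy": accuracy_score(y_test, predictions),
    }


results = tune_and_evaluate(texts, labels)
print(results)
```

Keeping the vectorizer inside the pipeline means TF-IDF statistics are refit on each cross-validation training fold, so the validation folds do not leak into the vocabulary or IDF weights.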
