GitHub - paigecaskey/SentimentAnalysis: This project involves performing sentiment analysis on customer book reviews. The steps include data preprocessing, vectorization using TF-IDF, and modeling using logistic regression, random forest, and XGBoost classifiers.

Book Review Sentiment Analysis

This machine learning project performs sentiment analysis on customer book reviews. The pipeline covers data preprocessing, TF-IDF vectorization, and modeling with logistic regression, random forest, and XGBoost classifiers. Logistic regression performed best, with an AUC score of 0.878 and an accuracy of 78.4%.
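As a rough illustration of the preprocessing stage, the sketch below removes duplicate reviews, lowercases the text, and strips punctuation using only the standard library. The function names `clean_review` and `preprocess` and the sample reviews are hypothetical, not taken from the notebook. Lemmatization and stemming are left out here; in the project they would presumably be handled with NLTK.

```python
import re
import string


def clean_review(text: str) -> str:
    """Lowercase a review, strip punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def preprocess(reviews: list[str]) -> list[str]:
    """Drop exact duplicates (keeping the first occurrence), then clean each review."""
    unique = list(dict.fromkeys(reviews))  # dicts preserve insertion order
    return [clean_review(review) for review in unique]


# Hypothetical sample data, used only to show the behaviour.
sample = ["Loved it!", "Loved it!", "Not great, would not recommend."]
cleaned = preprocess(sample)
print(cleaned)
```

Deduplicating before cleaning only catches exact copies; whether the notebook deduplicates raw or cleaned text is not stated in this README.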

Features

Preprocess customer reviews by removing duplicates, converting text to lowercase, removing punctuation, and applying lemmatization and stemming
Vectorize textual data using TF-IDF
Train and evaluate logistic regression, random forest, and XGBoost models
Perform cross-validation and hyperparameter tuning using GridSearchCV
Display model performance metrics including AUC score and accuracy
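The vectorization, tuning, and evaluation steps listed above can be sketched with scikit-learn as follows. This is a minimal example under stated assumptions, not the notebook's actual code: the toy reviews, the `C` grid, the 3-fold split, and the 25% test size are all illustrative choices, and only logistic regression is shown (random forest and XGBoost would slot into the same pipeline).

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Toy data (assumption): the real project reads reviews from bookReviewsData.csv.
pos_adjectives = ["great", "wonderful", "excellent", "fantastic", "delightful"]
neg_adjectives = ["boring", "terrible", "awful", "dull", "disappointing"]
nouns = ["book", "story", "plot", "characters"]
texts = [f"{adj} {noun}" for adj in pos_adjectives + neg_adjectives for noun in nouns]
labels = [1] * 20 + [0] * 20  # 1 = positive review, 0 = negative review

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# TF-IDF features feed directly into the classifier.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Illustrative grid over the regularization strength, scored by AUC.
search = GridSearchCV(pipeline, {"clf__C": [0.1, 1.0, 10.0]}, cv=3, scoring="roc_auc")
search.fit(X_train, y_train)

# Evaluate the best model on the held-out split.
auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])
acc = accuracy_score(y_test, search.predict(X_test))
print(f"best C: {search.best_params_['clf__C']}, AUC: {auc:.3f}, accuracy: {acc:.3f}")
```

Wrapping the vectorizer and the classifier in one `Pipeline` means TF-IDF is refit inside each cross-validation fold, so vocabulary and document frequencies from the validation fold do not leak into training.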

Technologies Used

Python
Scikit-learn
XGBoost
NLTK
Pandas
Jupyter Notebook

Repository Files

README.md
bookReviewsData.csv
sentiment-analysis.ipynb

Commit history: 5 commits
