This project classifies movie reviews from the IMDB dataset as either positive or negative in sentiment. Classification uses the Logistic Regression algorithm, and the review text is preprocessed before training.
Logistic Regression Algorithm
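As a rough illustration of how Logistic Regression learns to separate two classes, here is a minimal sketch in plain Python. It is not the project's training code: the function names, the toy feature vectors, and the hyperparameters (`learning_rate`, `epochs`) are assumptions chosen for the example.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability, in a numerically stable way."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic_regression(features, labels, learning_rate=0.5, epochs=500):
    """Fit weights and a bias with batch gradient descent on the log loss."""
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - y  # gradient of the log loss with respect to the score
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias

def predict(weights, bias, x):
    """Return 1 (positive) if the predicted probability is at least 0.5, else 0."""
    p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return 1 if p >= 0.5 else 0

# Toy data (hypothetical): each row counts the words ["great", "bad"] in a review.
X = [[2, 0], [1, 0], [0, 2], [0, 1]]
y = [1, 1, 0, 0]  # 1 = positive, 0 = negative

weights, bias = train_logistic_regression(X, y)
print(predict(weights, bias, [3, 0]))  # review with "great" three times
print(predict(weights, bias, [0, 3]))  # review with "bad" three times
```

In practice a library implementation would normally be used instead of a hand-written loop. The sketch only shows the idea of a weighted sum passed through a sigmoid and thresholded at 0.5.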
The dataset was preprocessed before training to improve model performance. The steps include text cleaning, tokenization, and feature extraction.
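The three preprocessing steps could look roughly like the sketch below. The exact cleaning rules and feature representation used in this project are not stated here, so the helper names (`clean_text`, `tokenize`, `bag_of_words`), the regular expressions, and the tiny vocabulary are all assumptions for illustration. Bag-of-words counts stand in for whichever feature extraction the project uses.

```python
import re
from collections import Counter

def clean_text(text):
    """Lowercase the text, strip HTML tags, and drop non-letter characters."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)      # remove HTML tags such as <br />
    text = re.sub(r"[^a-z\s]", " ", text)     # keep letters and whitespace only
    return re.sub(r"\s+", " ", text).strip()  # collapse repeated whitespace

def tokenize(text):
    """Split cleaned text into word tokens."""
    return clean_text(text).split()

def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in the token list."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

# Hypothetical review and vocabulary, for illustration only.
review = "A GREAT movie!<br />Great cast, great story."
tokens = tokenize(review)
vocabulary = ["great", "bad", "movie"]
features = bag_of_words(tokens, vocabulary)

print(tokens)    # ['a', 'great', 'movie', 'great', 'cast', 'great', 'story']
print(features)  # [3, 0, 1]
```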
- Model Accuracy: 89.40%
- Training Accuracy: 90.16%
- Testing Accuracy: 88.65%
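For reference, accuracy here means the fraction of reviews whose predicted label matches the true label. A minimal sketch, with a hypothetical `accuracy` helper and made-up labels rather than the project's actual predictions:

```python
def accuracy(true_labels, predicted_labels):
    """Return the fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Made-up labels for illustration: 1 = positive, 0 = negative.
true_labels = [1, 0, 1, 1, 0]
predicted_labels = [1, 0, 0, 1, 0]

print(f"{accuracy(true_labels, predicted_labels):.2%}")  # 80.00%
```

Computing this on the training set and on the held-out testing set gives the two separate figures.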
The dataset comes from IMDB and contains movie reviews, each labeled as positive or negative.
The dataset was split into 70% training data and 30% testing data.
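A 70/30 split can be sketched as follows. Whether the project shuffles before splitting, and which seed it uses, is not stated, so the `split_dataset` helper, the shuffling, and `seed=42` are assumptions for the example.

```python
import random

def split_dataset(samples, train_fraction=0.7, seed=42):
    """Shuffle a copy of the samples and split it into training and testing parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Placeholder data: (review text, label) pairs.
reviews = [(f"review {i}", i % 2) for i in range(10)]
train_set, test_set = split_dataset(reviews)

print(len(train_set), len(test_set))  # 7 3
```

Shuffling before the cut avoids a split in which one sentiment class dominates either part when the source file is ordered by label.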
- Clone the repository to your local machine.
- Ensure you have the required libraries installed.
- Run the main script to train and test the model.
- You can also use the trained model for your own classification tasks.
This project is licensed under the MIT License.