This project aims to develop a text classification model using data science techniques in Python to classify SMS messages as either spam or non-spam.
The classifier uses machine learning to distinguish spam from non-spam messages. It is designed to identify unwanted messages reliably so that users can filter out potential spam.
- Data Cleaning & Exploration: The project begins with data cleaning and exploratory data analysis (EDA) to understand the SMS dataset.
- Text Preprocessing: Using the Natural Language Toolkit (NLTK), each message is tokenized, stripped of stopwords and punctuation, and stemmed, which puts the text into a form suitable for machine learning.
- Model Building & Evaluation: Several classification algorithms are trained and compared, and each model's performance is evaluated to check that it reliably separates spam from legitimate messages.
- Streamlit-based Web Application: The project includes a web interface built with Streamlit. Users enter a message, and the model predicts in real time whether it is spam.
- Deployment: Once trained and validated, the model is deployed as a website where users can check messages for spam.
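The preprocessing and model-building steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the toy messages, the small `STOPWORDS` set, the function name `transform_text`, and the choice of `TfidfVectorizer` with `MultinomialNB` are all assumptions. The project itself would more likely use the full NLTK stopword list (`nltk.corpus.stopwords`, which needs a one-time download) and compare several classifiers on the real dataset.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, precision_score
from sklearn.naive_bayes import MultinomialNB

# Assumption: a tiny stand-in stopword set so the sketch runs without
# downloading NLTK corpora.
STOPWORDS = {"a", "an", "the", "is", "are", "you", "your", "to", "for",
             "at", "in", "on", "i", "me", "we", "and", "of"}

stemmer = PorterStemmer()


def transform_text(text: str) -> str:
    """Lowercase, tokenize, drop stopwords and punctuation, then stem."""
    tokens = wordpunct_tokenize(text.lower())
    kept = [t for t in tokens if t.isalnum() and t not in STOPWORDS]
    return " ".join(stemmer.stem(t) for t in kept)


# Hypothetical toy data; 1 = spam, 0 = not spam.
train_messages = [
    "WINNER! You have won a free prize, call now to claim",
    "Free entry in a weekly competition, text WIN to 80086",
    "URGENT: your account has won a cash prize, call now",
    "Claim your free ringtone now, reply YES",
    "Are we still meeting for lunch today?",
    "I'll call you when I get home",
    "Can you send me the notes from class?",
    "Happy birthday! See you at the party tonight",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

test_messages = ["Free prize, call now", "See you at lunch"]
test_labels = [1, 0]

# Turn the cleaned text into TF-IDF features and fit a Naive Bayes model.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(transform_text(m) for m in train_messages)
X_test = vectorizer.transform(transform_text(m) for m in test_messages)

model = MultinomialNB()
model.fit(X_train, train_labels)

# Evaluate on the held-out messages.
predictions = model.predict(X_test)
accuracy = accuracy_score(test_labels, predictions)
precision = precision_score(test_labels, predictions)
print(f"accuracy={accuracy:.2f} precision={precision:.2f}")
```

Precision is reported alongside accuracy because a false positive (a legitimate message flagged as spam) is usually the more costly mistake for a spam filter.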
- Input Message: Enter the SMS message in the provided text area.
- Prediction: Click the 'Check' button to trigger the classification process.
- Result Display: The system promptly displays whether the input message is identified as spam or not.
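The three steps above map onto a short Streamlit script. The sketch below is an assumption about how `app.py` might be organized, not the project's actual source: `classify` and `run_app` are hypothetical names, the widget labels other than the 'Check' button are illustrative, and the label encoding (1 = spam) is assumed. Streamlit is imported inside `run_app` so that `classify` can be reused and tested without a running app.

```python
def classify(message: str, vectorizer, model, preprocess=str.lower) -> str:
    """Return a display label for one SMS message.

    `vectorizer` needs a `transform` method and `model` a `predict` method,
    as scikit-learn objects have. `preprocess` should be the same text
    cleaning function that was used at training time.
    """
    features = vectorizer.transform([preprocess(message)])
    prediction = model.predict(features)[0]
    # Assumption: the model was trained with 1 = spam, 0 = not spam.
    return "Spam" if prediction == 1 else "Not Spam"


def run_app(vectorizer, model) -> None:
    """Render the input box, the 'Check' button, and the result."""
    import streamlit as st  # lazy import: only needed when the app runs

    st.title("SMS Spam Classifier")
    message = st.text_area("Enter the SMS message")

    if st.button("Check"):
        if message.strip():
            st.header(classify(message, vectorizer, model))
        else:
            st.warning("Please enter a message first.")
```

Calling `run_app(vectorizer, model)` at the bottom of `app.py`, with both objects loaded from the `model` directory, would render the page under `streamlit run app.py`.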
- `data`: Contains the dataset used for training and testing.
- `model`: Stores the trained model and vectorizer.
- `src`: Consists of the Python scripts used for data preprocessing, model building, and the Streamlit-based web application.
- `requirements.txt`: Lists the dependencies required to run the project.
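Because the web application relies on the trained model and vectorizer stored in `model`, the training script has to save both and the app has to load them. One possible way to do this, using only the standard library's `pickle`, is sketched below. The file names `model.pkl` and `vectorizer.pkl` and the helper names are hypothetical; the repository may use different ones.

```python
import pickle
from pathlib import Path


def save_artifacts(model, vectorizer, model_dir="model") -> None:
    """Write the fitted model and vectorizer into `model_dir`."""
    out = Path(model_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Hypothetical file names, chosen for this sketch.
    with open(out / "model.pkl", "wb") as f:
        pickle.dump(model, f)
    with open(out / "vectorizer.pkl", "wb") as f:
        pickle.dump(vectorizer, f)


def load_artifacts(model_dir="model"):
    """Read the model and vectorizer back; returns (model, vectorizer)."""
    src = Path(model_dir)
    with open(src / "model.pkl", "rb") as f:
        model = pickle.load(f)
    with open(src / "vectorizer.pkl", "rb") as f:
        vectorizer = pickle.load(f)
    return model, vectorizer
```

The vectorizer must be saved together with the model, since new messages have to be mapped to features with the same vocabulary that was learned during training. Pickle files should only be loaded from a trusted source, because unpickling can execute arbitrary code.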
- Clone the repository.
- Install the necessary dependencies by running `pip install -r requirements.txt`.
- Run the Streamlit app using `streamlit run app.py`.
- Input an SMS message and check if it's classified as spam or not.
This project serves as an efficient tool to identify and filter out spam messages from SMS data, offering a practical solution for users seeking to manage unwanted content effectively.