SMS Spam Classification using Neural Networks

This project implements a neural network-based SMS spam classification system using TensorFlow and Keras. The goal is to classify each SMS message as either "ham" (a normal message) or "spam" (an unwanted advertisement or company message).

Project Overview

The project follows these main steps:

Data Preprocessing
Model Creation
Model Training
Message Prediction

Data Preprocessing

The dataset is loaded from TSV files (train-data.tsv and valid-data.tsv) containing labeled SMS messages.
Text data is cleaned and tokenized using NLTK.
Messages are converted to sequences of word indices and padded to ensure uniform length.
Labels are converted to numerical values (0 for ham, 1 for spam).
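
The preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not the project's code: it swaps in a regular-expression tokenizer for NLTK and pads by hand rather than with a Keras utility. The names `tokenize`, `build_vocab`, and `encode`, the reserved indices, and the sample rows are all assumptions made for the example.

```python
import re
from collections import Counter

PAD, OOV = 0, 1                    # reserved indices (assumed convention)
LABELS = {"ham": 0, "spam": 1}     # label mapping described above

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_vocab(messages, max_words=1000):
    """Map the most frequent tokens to indices starting at 2."""
    counts = Counter(tok for msg in messages for tok in tokenize(msg))
    return {tok: i + 2 for i, (tok, _) in enumerate(counts.most_common(max_words))}

def encode(text, vocab, max_len=10):
    """Convert text to word indices, then truncate or pad to max_len."""
    ids = [vocab.get(tok, OOV) for tok in tokenize(text)][:max_len]
    return ids + [PAD] * (max_len - len(ids))

# Hypothetical (label, message) rows standing in for the TSV contents.
rows = [("ham", "See you at lunch"), ("spam", "WIN a FREE prize now")]
vocab = build_vocab(text for _, text in rows)
X = [encode(text, vocab, max_len=6) for _, text in rows]
y = [LABELS[label] for label, _ in rows]
```

Padding is appended at the end of each sequence here; padding at the front is an equally valid choice, as long as training and prediction use the same convention.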

Key concepts:

Tokenization
Sequence padding
Text normalization

Model Creation

A neural network is built using TensorFlow/Keras with the following architecture:

Embedding layer for learning word representations
LSTM layers for sequence processing
Dense layers with ReLU activation
Dropout for regularization
Final Dense layer with sigmoid activation for binary classification
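
One way the layer stack above might be expressed in Keras is shown below. The hyperparameter values in `CONFIG` and the function name `build_model` are illustrative assumptions, not values taken from the project, and the sketch uses a single LSTM layer for brevity.

```python
# Illustrative hyperparameters; none of these values come from the project.
CONFIG = {
    "vocab_size": 1000,    # number of word indices, including padding
    "embedding_dim": 16,
    "lstm_units": 32,
    "dense_units": 16,
    "dropout_rate": 0.5,
}

def build_model(config=CONFIG):
    """Stack Embedding -> LSTM -> Dense(ReLU) -> Dropout -> Dense(sigmoid)."""
    # Imported here so CONFIG can be inspected without loading TensorFlow.
    import tensorflow as tf

    layers = tf.keras.layers
    model = tf.keras.Sequential([
        layers.Embedding(config["vocab_size"], config["embedding_dim"]),
        layers.LSTM(config["lstm_units"]),
        layers.Dense(config["dense_units"], activation="relu"),
        layers.Dropout(config["dropout_rate"]),
        layers.Dense(1, activation="sigmoid"),  # outputs a spam probability
    ])
    model.compile(loss="binary_crossentropy", optimizer="adam",
                  metrics=["accuracy"])
    return model
```

To stack several LSTM layers, every LSTM except the last needs `return_sequences=True` so that the next layer receives a full sequence rather than a single vector.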

Key concepts:

Word embeddings
Recurrent Neural Networks (LSTM)
Dropout regularization

Model Training

Data is split into training and validation sets.
The model is trained using binary cross-entropy loss and Adam optimizer.
Class weights are applied to handle potential class imbalance.
Training progress is monitored using accuracy and loss metrics.
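
To make the class-weighting step concrete, the sketch below computes "balanced" class weights and applies them to binary cross-entropy in plain Python. The weighting formula (samples divided by classes times class count) is a common heuristic assumed here; the project may compute its weights differently. Function names and the toy labels are hypothetical.

```python
import math
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n = len(labels)
    return {cls: n / (len(counts) * cnt) for cls, cnt in counts.items()}

def weighted_bce(y_true, y_prob, class_weight, eps=1e-7):
    """Mean binary cross-entropy with a per-class weight on each example."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        total += class_weight[y] * loss
    return total / len(y_true)

labels = [0, 0, 0, 0, 1]               # toy data: four ham, one spam
weights = balanced_class_weights(labels)
print(weights)                          # {0: 0.625, 1: 2.5}
```

The rarer spam class receives the larger weight, so mistakes on spam messages contribute more to the loss. In Keras, a dictionary of this shape can be passed to `model.fit` through its `class_weight` argument.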

Key concepts:

Train-validation split
Loss functions
Optimization algorithms
Class weighting

Message Prediction

A predict_message function is implemented to:

Preprocess input messages
Use the trained model for prediction
Return the spam probability and corresponding label
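
A possible shape for such a function is sketched below. The signature is an assumption: `score_fn` is a hypothetical callable standing in for "preprocess the text and run the trained model", and the 0.5 threshold is a conventional default rather than a value confirmed by the project. The keyword scorer exists only so the example runs without a trained model.

```python
def predict_message(text, score_fn, threshold=0.5):
    """Return [spam_probability, label] for a single message."""
    probability = float(score_fn(text))
    label = "spam" if probability >= threshold else "ham"
    return [probability, label]

def keyword_score(text):
    """Placeholder scorer, used only to make the sketch runnable."""
    hits = sum(word in text.lower() for word in ("free", "win", "prize"))
    return min(1.0, hits / 2)

print(predict_message("WIN a free prize", keyword_score))  # [1.0, 'spam']
print(predict_message("see you at lunch", keyword_score))  # [0.0, 'ham']
```

With a real model, `score_fn` would encode the message exactly as the training data was encoded and return the sigmoid output for that single input.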

Evaluation

The model is evaluated on a test set, and performance metrics such as accuracy, precision, recall, and F1-score can be calculated.
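
The metrics named above follow directly from the confusion-matrix counts. The sketch below computes them in plain Python, treating spam (1) as the positive class; the function name and the toy labels are made up for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 with spam (1) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

y_true = [0, 0, 1, 1, 1, 0]   # toy ground-truth labels
y_pred = [0, 1, 1, 1, 0, 0]   # toy predictions
metrics = classification_metrics(y_true, y_pred)
```

On imbalanced data such as spam corpora, precision and recall are usually more informative than accuracy, because a model that always predicts "ham" can still score high accuracy.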

Potential Improvements

Experiment with different model architectures (e.g., CNN, Transformer)
Use advanced text preprocessing techniques (lemmatization, stemming)
Implement data augmentation for text data
Explore ensemble methods

This project demonstrates the application of natural language processing and deep learning techniques to solve a real-world problem of SMS spam classification.

