Email Spam Detection

➲ Project description

This email spam detection system is written in Python and uses Natural Language Processing (NLP), a branch of machine learning, to detect spam. Given a dataset containing a large number of emails, it extracts the important words from each message and then uses a naive classifier to decide whether an email is spam or not.

➲ Prerequisites

The following packages and modules must be installed for the project :

Python3
Pandas
Numpy
Scikit-learn
NLTK

Install all required packages :

 pip install -r requirements.txt

➲ The Dataset

The emails dataset contains about 5728 records, each of which is a sample email,
and a target column "type" that describes whether the email is spam or not.

Dataset features and target :

➲ Coding Sections

In this part we walk through the project code, which is divided into sections as follows:

Section 1 | Data Preprocessing :
In this section we prepare the dataset before training the model on it,
with steps such as :
- Load dataset
- Check for duplicates and remove them
- Check for missing data for each column
- Clean the data by removing punctuation and stopwords, then tokenize it into words (tokens)
- Convert the text into a matrix of token counts
- Split the data into training and testing sets
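The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the code from `email_spam_detection.py`: the toy DataFrame stands in for `emails.csv`, the `text` column name and the 1 = spam label encoding are assumptions, and scikit-learn's built-in English stop word list is swapped in for the NLTK stopwords so the example runs without downloading a corpus.

```python
import string

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, ENGLISH_STOP_WORDS
from sklearn.model_selection import train_test_split

# Toy stand-in for emails.csv. The "text" column name and the
# 1 = spam / 0 = not spam encoding are assumptions for illustration.
df = pd.DataFrame({
    "text": [
        "Win a FREE prize now!!!",
        "Win a FREE prize now!!!",  # exact duplicate of the first row
        "Meeting moved to 3pm, see agenda.",
        "Cheap loans, click here!",
        "Lunch tomorrow?",
        "Claim your free reward today!",
    ],
    "type": [1, 1, 0, 1, 0, 1],
})

# Check for duplicates and remove them.
df = df.drop_duplicates().reset_index(drop=True)

# Check for missing data in each column.
missing_per_column = df.isnull().sum()


def clean_text(message):
    """Strip punctuation, lowercase, drop stop words, return a token list."""
    no_punct = "".join(ch for ch in message if ch not in string.punctuation)
    return [w for w in no_punct.lower().split() if w not in ENGLISH_STOP_WORDS]


# Convert the text into a matrix of token counts, using clean_text as the analyzer.
vectorizer = CountVectorizer(analyzer=clean_text)
X = vectorizer.fit_transform(df["text"])

# Split the data into training and testing sets (80% / 20%).
X_train, X_test, y_train, y_test = train_test_split(
    X, df["type"], test_size=0.2, random_state=0
)
```

Passing `clean_text` as the `analyzer` keeps cleaning and counting in one step, so the same cleaning is applied again whenever `vectorizer.transform` is called on new emails.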
Section 2 | Model Creation :
The dataset is now ready for training, so we create a K-Nearest Neighbors (KNN) model using scikit-learn and then fit it to the data.
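A minimal sketch of this step, assuming a small hand-made corpus in place of the real training split. The example texts, the 1 = spam label encoding, and the choice of `n_neighbors=3` are illustrative assumptions, not values taken from the project.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical training data: 1 = spam, 0 = not spam.
train_texts = [
    "free prize money now",
    "win free money",
    "claim free prize",
    "meeting agenda attached",
    "lunch at noon tomorrow",
    "project report due friday",
]
train_labels = [1, 1, 1, 0, 0, 0]

# Turn the texts into a matrix of token counts.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)

# Create the KNN model and fit it to the training data.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, train_labels)

# Classify unseen messages with the same vectorizer.
new_messages = ["free prize money", "meeting agenda tomorrow"]
predictions = model.predict(vectorizer.transform(new_messages))
print(predictions)
```

New messages must go through `vectorizer.transform`, not `fit_transform`, so that they are mapped onto the vocabulary learned from the training set.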
Section 3 | Model Evaluation :
Finally, we evaluate the model by computing its accuracy, classification report and confusion matrix.
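The three metrics can be computed with `sklearn.metrics`, as in the sketch below. The label arrays are made up for illustration (1 = spam is an assumption); in the project they would be the test labels and the model's predictions on the test set.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Hypothetical true labels and model predictions (1 = spam, 0 = not spam).
y_true = [0, 0, 0, 1, 1, 1, 0, 1]
y_pred = [0, 0, 1, 1, 1, 0, 0, 1]

# Fraction of emails classified correctly.
accuracy = accuracy_score(y_true, y_pred)

# Per-class precision, recall and F1 score as a formatted string.
report = classification_report(y_true, y_pred)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)

print("Accuracy:", accuracy)
print(report)
print(cm)
```

For spam filtering, the confusion matrix is worth reading alongside the accuracy, since it separates legitimate emails flagged as spam from spam that slipped through.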

➲ Installation

Clone the repo

git clone https://github.com/theritik01/Suspicious-Email-Detection.git

Run the code from the command line
```
python email_spam_detection.py
```

➲ Output

Now let's see the project output after running the code :

Dataset head :

Dataset missing data count :

Dataset after cleaning punctuation and tokenizing text :

Classification report, confusion matrix and accuracy :

