Repository dedicated to Term Project of UofT Statistics for Data Science Course

quickheaven/scs-3251-statistics-for-data-science

Spam Email Classification

SCS 3251 Statistics for Data Science Project

Jupyter Notebooks:

Team members:

Name | GitHub Repo
Arjie Cristobal | https://github.com/quickheaven

Introduction

Spambase Dataset

The concept of "spam" is diverse: advertisements for products or web sites, make-money-fast schemes, chain letters, pornography...

Our collection of spam e-mails came from our postmaster and individuals who had filed spam. Our collection of non-spam e-mails came from filed work and personal e-mails, and hence the word 'george' and the area code '650' are indicators of non-spam. These are useful when constructing a personalized spam filter. One would either have to blind such non-spam indicators or get a very wide collection of non-spam to generate a general purpose spam filter.
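One way to picture "blinding" the personalized non-spam indicators is to drop those features before training. The sketch below is illustrative only and is not taken from the project notebooks. It assumes the features are held as plain dictionaries keyed by Spambase-style column names (`word_freq_george`, `word_freq_650`); the helper `blind_features` and the toy values are hypothetical.

```python
# Assumed column names, following the Spambase "word_freq_<token>" convention.
PERSONAL_INDICATORS = ("word_freq_george", "word_freq_650")


def blind_features(rows, blinded=PERSONAL_INDICATORS):
    """Return copies of the feature rows with the blinded columns removed."""
    blinded = set(blinded)
    return [{k: v for k, v in row.items() if k not in blinded} for row in rows]


# Toy rows with made-up values, for illustration only (not real dataset records).
sample = [
    {"word_freq_free": 0.32, "word_freq_george": 0.0, "word_freq_650": 0.0, "is_spam": 1},
    {"word_freq_free": 0.0, "word_freq_george": 1.5, "word_freq_650": 0.8, "is_spam": 0},
]

# The original rows are left untouched; the blinded copies lack the two indicators.
general_purpose = blind_features(sample)
print(sorted(general_purpose[0]))
```

A classifier trained on the blinded rows cannot lean on the name or area code of one particular mailbox owner. That is the trade-off the paragraph above describes: a personalized filter would keep those features, a general-purpose one would not.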

Dataset Source:

Link: Spambase

Presentation
