Spam Dataset

This dataset consists of 4601 email observations, each labelled as spam (1) or not spam (0). There are 57 predictors: most are the relative frequencies of the most commonly occurring words and symbols in the email, and the remainder summarise runs of capital letters.
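
A minimal sketch of loading the data in R follows. The helper `load_spam` and the column names `x1`..`x57` are hypothetical and not taken from `code.R`. The sketch assumes `data.txt` has no header row and holds the 57 predictors followed by the 0/1 label, separated by whitespace; pass `sep = ","` if the file turns out to be comma-separated.

```r
# Hypothetical helper (not from code.R): read the spam data and label the response.
# Assumption: no header row, 57 predictor columns followed by the 0/1 label.
load_spam <- function(path, sep = "") {
  spam <- read.table(path, sep = sep, header = FALSE)
  stopifnot(ncol(spam) == 58)                      # 57 predictors + 1 label
  names(spam) <- c(paste0("x", 1:57), "spam")      # placeholder column names
  spam$spam <- factor(spam$spam, levels = c(0, 1),
                      labels = c("not_spam", "spam"))
  spam
}

# Usage, assuming data.txt sits in the working directory:
# spam <- load_spam("data.txt")
# dim(spam)          # rows and columns actually read
# table(spam$spam)   # class counts
```

Turning the label into a factor up front keeps classification functions from treating it as a numeric response.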

We use gradient boosting in R, together with model blending techniques, to improve classification accuracy. We also fit decision trees and demonstrate how R can create tree plots.
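
The sketch below illustrates the three ideas on a small synthetic data set so that it runs without `data.txt`. It is not the code in `code.R`: the choice of the `rpart` and `gbm` packages, the tuning values, and blending by a simple average of predicted probabilities are all assumptions made for illustration.

```r
library(rpart)  # decision trees (a recommended package shipped with R)
library(gbm)    # gradient boosting; assumed installed via install.packages("gbm")

set.seed(1)

# Toy stand-in for the spam data: two numeric predictors and a 0/1 label.
n <- 400
toy <- data.frame(x1 = runif(n), x2 = runif(n))
toy$y <- as.integer(toy$x1 + toy$x2 + rnorm(n, sd = 0.2) > 1)
train <- toy[1:300, ]
test  <- toy[301:400, ]

# Decision tree, then a base-graphics tree plot.
tree_fit <- rpart(factor(y) ~ x1 + x2, data = train, method = "class")
plot(tree_fit, margin = 0.1)
text(tree_fit, use.n = TRUE)
p_tree <- predict(tree_fit, test, type = "prob")[, "1"]

# Gradient boosting with a Bernoulli loss for the 0/1 response.
# The tuning values here are illustrative, not tuned.
gbm_fit <- gbm(y ~ x1 + x2, data = train, distribution = "bernoulli",
               n.trees = 200, interaction.depth = 2, shrinkage = 0.05)
p_gbm <- predict(gbm_fit, test, n.trees = 200, type = "response")

# Blend: unweighted average of the two predicted probabilities (an assumption).
p_blend <- (p_tree + p_gbm) / 2

# Test-set accuracy at a 0.5 threshold for each model.
acc <- function(p) mean((p > 0.5) == test$y)
print(c(tree = acc(p_tree), gbm = acc(p_gbm), blend = acc(p_blend)))
```

On the real data, the same calls would take the spam data frame in place of `toy`, with the label as the response and all 57 predictors on the right-hand side of the formula. Blend weights are usually chosen on a held-out validation set rather than fixed in advance.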

This dataset is discussed in "The Elements of Statistical Learning", 2nd edition. The data is also available at ftp.ics.uci.edu.

Robby955/spam
