From 2890bfabb0af1b84797dff2f2ce9ac10450c9b82 Mon Sep 17 00:00:00 2001
From: Vanessa Liang
Date: Wed, 11 Nov 2020 17:05:21 -0800
Subject: [PATCH] Update README.md

---
 prediction/spam-detection/README.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/prediction/spam-detection/README.md b/prediction/spam-detection/README.md
index fe72ace..a2bd0c7 100644
--- a/prediction/spam-detection/README.md
+++ b/prediction/spam-detection/README.md
@@ -1,8 +1,12 @@
 # Spam Email Detection
 _Create models to predict if an email message is spam or not_
 
+## Data
+- The dataset contains 5572 messages, with 13.41% marked as spam message
+
+
+
 ## Data Exploration
-- Percentage of spam in the data is 13.41%
 - Finding 1: Spam message tends to be longer
 - Finding 2: Spam message tends to have more digits
 - Finding 3: Spam message tends to have more non-word characters