---
title: "R Notebook"
output: html_notebook
---
```{r}
#Load Libraries
library(naivebayes)
library(caret)
```
```{r}
#Getting the dataset
df <- read.csv("amirsecretfile.csv")
head(df)
```
```{r}
#Setting gender (the outcome) and City as categorical
#Recode gender: "F" becomes 0, any other non-missing value becomes 1
df$gender <- ifelse(df$gender == "F", 0, 1)
df$City <- as.factor(df$City)
df$gender <- as.factor(df$gender)
```
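The recoding above turns `gender` into a two-level factor with the levels `0` and `1`. As a minimal sketch on a made-up vector (`toy_gender` and its values are hypothetical, not taken from the dataset), the same encoding can also be produced in one step with `factor()`, which keeps readable labels:

```{r}
# Hypothetical toy values, for illustration only
toy_gender <- c("F", "M", "M", "F")

# Two-step recoding, mirroring the chunk above
two_step <- as.factor(ifelse(toy_gender == "F", 0, 1))

# One-step alternative that keeps descriptive labels
one_step <- factor(toy_gender, levels = c("F", "M"),
                   labels = c("female", "male"))

levels(two_step)
levels(one_step)

# Both versions assign the same underlying level to every element
all(as.integer(two_step) == as.integer(one_step))
```

Either form gives `naive_bayes()` a factor outcome; the labelled version only makes the later output easier to read.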
```{r}
#Studying the structure of the data
str(df)
```
```{r}
#checking for missing values
sum(is.na(df))
```
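`sum(is.na(df))` returns a single total for the whole data frame. If that total is not zero, a per-column count shows where the gaps are. A minimal sketch on a hypothetical toy data frame (`toy_na` and its values are made up; only the column names are borrowed from the model formula):

```{r}
# Hypothetical data with one missing value per column
toy_na <- data.frame(
  EdYears = c(12, NA, 16, 14),
  City    = c("A", "B", NA, "A"),
  Grocery = c(100, 250, 80, NA)
)

# Total number of missing cells
total_missing <- sum(is.na(toy_na))

# Missing cells per column
missing_by_column <- colSums(is.na(toy_na))

# Number of rows with no missing value at all
complete_rows <- sum(complete.cases(toy_na))

total_missing
missing_by_column
complete_rows
```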
```{r}
#train and test data sets
set.seed(1234)
ind <- sample(2, nrow(df), replace = TRUE, prob = c(0.8, 0.2))
train <- df[ind == 1,]
test <- df[ind == 2,]
```
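Because `sample()` assigns each row to a group independently, the split is only approximately 80/20, and the share of each gender can differ between the two sets. A minimal sketch for checking both, on hypothetical toy data (`toy_split`, `toy_train`, and `toy_test` are made-up names so the real `train` and `test` objects are not overwritten):

```{r}
set.seed(1234)

# Hypothetical data: a two-level outcome and one numeric feature
toy_split <- data.frame(
  gender = factor(sample(c(0, 1), 200, replace = TRUE)),
  x      = rnorm(200)
)

# Same splitting technique as above
toy_ind   <- sample(2, nrow(toy_split), replace = TRUE, prob = c(0.8, 0.2))
toy_train <- toy_split[toy_ind == 1, ]
toy_test  <- toy_split[toy_ind == 2, ]

# Realized share of rows in the training set (close to, but rarely exactly, 0.8)
train_share <- nrow(toy_train) / nrow(toy_split)
train_share

# Class balance in each set
prop.table(table(toy_train$gender))
prop.table(table(toy_test$gender))
```

If the class balance differs noticeably between the sets, a stratified split such as `caret::createDataPartition()` is one alternative.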
```{r}
#Model Building
model <- naive_bayes(gender ~ FamilyIncome + EdYears + Grocery + Cosmatics + MF +
                       BoughtCosmatics + Response + City,
                     data = train, usekernel = TRUE)
summary(model)
```
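Naive Bayes scores each class by multiplying its prior probability by the class-conditional density of every feature, then normalizing so the scores sum to 1. With `usekernel = TRUE`, the densities of numeric features are kernel density estimates instead of fitted Gaussians. The following is a minimal base R sketch of that idea with a single numeric feature; the data, the equal priors, and the helper `kde_at()` are all hypothetical, and this is not the `naivebayes` package's internal code:

```{r}
set.seed(1)

# Hypothetical feature values for the two classes
income_f <- rnorm(100, mean = 40, sd = 5)
income_m <- rnorm(100, mean = 60, sd = 5)

# Assumed equal class priors
prior <- c(f = 0.5, m = 0.5)

# Hypothetical helper: kernel density estimate of a sample, evaluated at x
# by interpolating the grid that density() returns
kde_at <- function(sample_values, x) {
  d <- density(sample_values)
  approx(d$x, d$y, xout = x, rule = 2)$y
}

# Class-conditional densities at a new observation
new_income <- 45
likelihood <- c(f = kde_at(income_f, new_income),
                m = kde_at(income_m, new_income))

# Bayes' rule: prior times likelihood, normalized over the classes
posterior <- prior * likelihood / sum(prior * likelihood)
posterior

# Predicted class is the one with the largest posterior
predicted_class <- names(which.max(posterior))
predicted_class
```

With several features, the per-feature densities are multiplied together for each class, which is the "naive" independence assumption.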
```{r}
#Prediction
Predict <- predict(model, newdata = test)
#create confusion matrix (predictions first, true labels as the reference)
confusionMatrix(Predict, test$gender)
```
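The accuracy that `confusionMatrix()` reports is the share of test cases that fall on the diagonal of the predicted-versus-actual table. A minimal sketch of that calculation on hypothetical labels (`toy_actual` and `toy_predicted` are made up):

```{r}
# Hypothetical true and predicted labels
toy_actual    <- factor(c(0, 0, 1, 1, 1, 0, 1, 0), levels = c(0, 1))
toy_predicted <- factor(c(0, 1, 1, 1, 1, 0, 0, 0), levels = c(0, 1))

# Cross-tabulate predictions against the true labels
toy_table <- table(Prediction = toy_predicted, Reference = toy_actual)
toy_table

# Accuracy: correctly classified cases divided by all cases
toy_accuracy <- sum(diag(toy_table)) / sum(toy_table)
toy_accuracy  # 0.75

# Baseline for comparison: always predicting the most frequent class
toy_baseline <- max(prop.table(table(toy_actual)))
toy_baseline  # 0.5
```

Comparing accuracy with this baseline shows how much the model adds over always guessing the majority class.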
The final output shows that the Naive Bayes classifier I built predicts whether a person in the test set is male or female with an accuracy of approximately 99%.