
Naïve Bayes

Ishani Kathuria edited this page May 20, 2023 · 5 revisions

Overview

Naive Bayes methods are a set of supervised learning algorithms based on applying Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features given the value of the class variable.

Assumptions:

  1. Conditional independence among the features $x_i$ when the class is given
  2. The features $x_i$ make an equal contribution to the probability of the class

where $y$ is the class, $x_i$ are the features and $P(x_i|y)$ is the probability of the feature $x_i$ given the class $y$. Under these assumptions, the likelihood factorizes as $$P(x_1,x_2,...,x_n|y)=P(x_1|y)\times P(x_2|y)\times ... \times P(x_n|y)$$ and each $P(x_i|y_j)$ can be estimated from the training data for every feature $x_i$ and class $y_j$.

Formula

The posterior probability $P(y|x_1,x_2,...,x_n)$ for each value of $y$ can be calculated using Bayes' theorem, $$P(y|x_1,x_2,...,x_n)=\frac{P(x_1,x_2,...,x_n|y)\times P(y)}{P(x_1,x_2,...,x_n)}$$ Since the denominator $P(x_1,x_2,...,x_n)$ is the same for every class, the predicted class is the one that maximizes the numerator $P(x_1,x_2,...,x_n|y)\times P(y)$.
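This computation can be sketched from scratch on a hypothetical toy dataset (the data, function name, and Laplace smoothing below are illustrative, not from this page):

```python
import numpy as np

# Hypothetical toy data: two binary features and a binary class label y.
X = np.array([[1, 0], [1, 1], [0, 1], [0, 0], [1, 1], [0, 1]])
y = np.array([1, 1, 0, 0, 1, 0])

def predict_proba(x, X, y, alpha=1.0):
    """Posterior P(y | x) via Bayes' theorem with the naive
    independence assumption and Laplace smoothing (alpha)."""
    classes = np.unique(y)
    joint = []
    for c in classes:
        Xc = X[y == c]
        prior = len(Xc) / len(X)                        # P(y = c)
        # P(x_i = v | y = c), smoothed over the 2 possible feature values
        likelihoods = [
            (np.sum(Xc[:, i] == v) + alpha) / (len(Xc) + 2 * alpha)
            for i, v in enumerate(x)
        ]
        joint.append(prior * np.prod(likelihoods))      # numerator of Bayes' rule
    joint = np.array(joint)
    return classes, joint / joint.sum()                 # normalise by the evidence

classes, probs = predict_proba([1, 1], X, y)
print(dict(zip(classes.tolist(), probs.round(3).tolist())))  # → {0: 0.2, 1: 0.8}
```

Note that the evidence $P(x_1,x_2,...,x_n)$ is never modelled directly; it is recovered by normalising the per-class joint probabilities so they sum to 1.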

Step-by-Step Implementation

A drugs dataset (Kaggle) was used, with the following columns:

  • Age
  • Sex
  • BP
  • Cholesterol
  • Na_to_K
  • Drug class (y – dependent variable)

The dataset can be used to classify which drug a person is on. There are 5 classes of drugs: drugY, drugC, drugX, drugA, drugB.
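A classifier for this kind of data might be fit with scikit-learn roughly as follows. This is a sketch, not the page's notebook: the sample rows are made up to mimic the dataset's shape, and the choices of `OrdinalEncoder` for the categorical columns and `GaussianNB` as the model are assumptions for illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical rows in the shape of the Kaggle drugs dataset; in practice
# you would load the real file, e.g. df = pd.read_csv("drug200.csv").
df = pd.DataFrame({
    "Age":         [23, 47, 47, 28, 61, 22, 49, 41],
    "Sex":         ["F", "M", "M", "F", "F", "F", "M", "M"],
    "BP":          ["HIGH", "LOW", "LOW", "NORMAL", "LOW", "NORMAL", "NORMAL", "LOW"],
    "Cholesterol": ["HIGH", "HIGH", "HIGH", "HIGH", "HIGH", "HIGH", "HIGH", "HIGH"],
    "Na_to_K":     [25.4, 13.1, 10.1, 7.8, 18.0, 8.6, 16.3, 11.0],
    "Drug":        ["drugY", "drugC", "drugC", "drugX", "drugY", "drugX", "drugY", "drugC"],
})

# Encode the categorical features as integers so GaussianNB can consume them.
cat_cols = ["Sex", "BP", "Cholesterol"]
df[cat_cols] = OrdinalEncoder().fit_transform(df[cat_cols])

X, y = df.drop(columns="Drug"), df["Drug"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

model = GaussianNB().fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))
```

Treating ordinal-encoded categoricals as Gaussian is a common shortcut; an alternative is `CategoricalNB` for the discrete columns, which models each feature with a per-category distribution instead.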

See implementation in Jupyter Notebook