Code for Bayesian classification of randomly generated samples of multivariate Gaussian data, created for a pattern recognition class assignment.
The core of any Bayes classifier is Bayes formula, which states:
\begin{equation}
P(\omega_j \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid \omega_j)\, P(\omega_j)}{p(\mathbf{x})}
\end{equation}
Here, $P(\omega_j \mid \mathbf{x})$ is the posterior probability of class $\omega_j$ given the feature vector $\mathbf{x}$, $p(\mathbf{x} \mid \omega_j)$ is the class-conditional likelihood, $P(\omega_j)$ is the prior probability of class $\omega_j$, and $p(\mathbf{x}) = \sum_j p(\mathbf{x} \mid \omega_j)\, P(\omega_j)$ is the evidence, which acts only as a normalizing factor.
In designing a Bayes classifier for multivariate data, this formula will become the basis of our discriminant functions, which are a set of equations $g_i(\mathbf{x})$, $i = 1, \dots, c$, one per class; a sample $\mathbf{x}$ is assigned to class $\omega_i$ whenever $g_i(\mathbf{x}) > g_j(\mathbf{x})$ for all $j \neq i$. Since the evidence $p(\mathbf{x})$ is identical for every class, it can be dropped, leaving $g_i(\mathbf{x}) = p(\mathbf{x} \mid \omega_i)\, P(\omega_i)$.
Then, taking the natural log of this equation gives
\begin{equation}\label{log_discriminant}
g_i(\mathbf{x}) = \ln p(\mathbf{x} \mid \omega_i) + \ln P(\omega_i)
\end{equation}
This manipulation makes the expression easier to evaluate by turning products into sums, and it does not change the classification decision, since the natural log is a monotonically increasing function.
In this case, we will choose the class-conditional densities to be multivariate Gaussian, since the samples in this assignment are drawn from multivariate normal distributions. Now, if we assume $p(\mathbf{x} \mid \omega_i) \sim N(\boldsymbol{\mu}_i, \Sigma_i)$, the density takes the form
\begin{equation}
p(\mathbf{x} \mid \omega_i) = \frac{1}{(2\pi)^{d/2} \lvert \Sigma_i \rvert^{1/2}} \exp\!\left[ -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^T \Sigma_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i) \right]
\end{equation}
where $d$ is the dimensionality of the feature vector, $\boldsymbol{\mu}_i$ is the mean vector, and $\Sigma_i$ is the covariance matrix of class $\omega_i$.
Substituting this expression into Equation \ref{log_discriminant} gives the discriminant function:
\begin{equation}
g_i(\mathbf{x}) = -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^T \Sigma_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i) - \frac{d}{2} \ln 2\pi - \frac{1}{2} \ln \lvert \Sigma_i \rvert + \ln P(\omega_i)
\end{equation}
This equation is the general discriminant function for multivariate Gaussian data, and it is what will be used to classify the data generated in this assignment. As discussed in class, there are three scenarios in which this equation can be simplified further: case I ($\Sigma_i = \sigma^2 I$), case II ($\Sigma_i = \Sigma$, equal but arbitrary covariances), and case III ($\Sigma_i$ arbitrary).
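As a minimal sketch, the general discriminant above could be evaluated numerically as follows. NumPy is assumed, the function names (`gaussian_discriminant`, `classify`) are hypothetical helpers, and the class parameters in the usage example are illustrative, not the assignment's actual values:

```python
import numpy as np

def gaussian_discriminant(x, mu, sigma, prior):
    """Evaluate g_i(x) = -1/2 (x-mu)^T Sigma^{-1} (x-mu)
    - d/2 ln(2*pi) - 1/2 ln|Sigma| + ln P(w_i)."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    d = mu.size
    diff = x - mu
    maha = diff @ np.linalg.inv(sigma) @ diff  # Mahalanobis (quadratic) term
    return (-0.5 * maha
            - 0.5 * d * np.log(2 * np.pi)
            - 0.5 * np.log(np.linalg.det(sigma))
            + np.log(prior))

def classify(x, params):
    """Assign x to the class with the largest discriminant value.
    `params` is a list of (mu, Sigma, prior) tuples, one per class."""
    scores = [gaussian_discriminant(x, mu, S, p) for mu, S, p in params]
    return int(np.argmax(scores))

# Illustrative two-class example (placeholder parameters)
params = [(np.array([0.0, 0.0]), np.eye(2), 0.5),
          (np.array([4.0, 4.0]), np.eye(2), 0.5)]
```

A point near one class mean, e.g. `classify([0.1, -0.2], params)`, is assigned to that class.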
In the simplest case (i.e., case I), we assume the features are statistically independent and share the same variance $\sigma^2$, so that $\Sigma_i = \sigma^2 I$. The quadratic term then reduces to $\lVert \mathbf{x} - \boldsymbol{\mu}_i \rVert^2 / \sigma^2$, and the terms $\frac{d}{2} \ln 2\pi$ and $\frac{1}{2} \ln \lvert \Sigma_i \rvert$ are identical for every class and can be dropped.
By expanding and further simplifying this equation, we can see that the discriminant function for case I is linear, with
\begin{equation}
g_i(\mathbf{x}) = \mathbf{w}_i^T \mathbf{x} + w_{i0}
\end{equation}
where:
\begin{equation}
\mathbf{w}_i = \frac{1}{\sigma^2} \boldsymbol{\mu}_i, \qquad w_{i0} = -\frac{1}{2\sigma^2} \boldsymbol{\mu}_i^T \boldsymbol{\mu}_i + \ln P(\omega_i)
\end{equation}
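The case I weights could be computed as sketched below, assuming NumPy; `case1_weights` is a hypothetical helper name:

```python
import numpy as np

def case1_weights(mu, sigma_sq, prior):
    """Linear discriminant parameters for case I (Sigma_i = sigma^2 I):
    w_i = mu_i / sigma^2,  w_i0 = -mu_i^T mu_i / (2 sigma^2) + ln P(w_i)."""
    mu = np.asarray(mu, dtype=float)
    w = mu / sigma_sq
    w0 = -(mu @ mu) / (2.0 * sigma_sq) + np.log(prior)
    return w, w0

# Illustrative parameters: two classes, sigma^2 = 1, equal priors
w1, b1 = case1_weights([0.0, 0.0], 1.0, 0.5)
w2, b2 = case1_weights([4.0, 4.0], 1.0, 0.5)
```

Because the dropped $-\mathbf{x}^T\mathbf{x}/(2\sigma^2)$ term is the same for every class, differences $g_i(\mathbf{x}) - g_j(\mathbf{x})$ computed from these linear weights match the full quadratic discriminant.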
For the data in sample set A, the class covariance matrices are equal and of the form $\sigma^2 I$, so the case I linear discriminant applies. By calculating the values of $\mathbf{w}_i$ and $w_{i0}$ for each class, the decision boundary between any two classes is found by setting $g_i(\mathbf{x}) = g_j(\mathbf{x})$, which for case I is a hyperplane.
In the case of sample set B, the covariance matrices are not equal. Since $\Sigma_i$ differs from class to class, the data fall under case III, and the full quadratic discriminant function must be evaluated for each class.
It can be noted that for case III, we no longer assume that the features are uncorrelated (i.e., the covariance matrices need not be diagonal). This is also true for case II, though we will not discuss case II any further since neither the data from sample set A nor sample set B fall into this category. For dataset B, the priors, misclassification rates, and probability of error are computed in the same way as described in Part 1 above.
For both sample sets A and B, we are given the instruction to produce $n$ samples from each class-conditional Gaussian distribution, using the specified mean vectors and covariance matrices. In the case that the priors are unequal, the number of samples drawn from each class should be proportional to that class's prior probability.
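Sample generation could be sketched as follows, assuming NumPy's `Generator.multivariate_normal`; the means, covariances, and $n = 100$ below are illustrative placeholders, not the assignment's actual parameters:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility

def sample_class(mu, sigma, n):
    """Draw n samples from the class-conditional density N(mu, Sigma)."""
    return rng.multivariate_normal(np.asarray(mu, dtype=float),
                                   np.asarray(sigma, dtype=float), size=n)

# Illustrative parameters: 100 samples per class
X1 = sample_class([0.0, 0.0], np.eye(2), 100)
X2 = sample_class([3.0, 3.0], [[2.0, 0.0], [0.0, 2.0]], 100)
```

Each returned array has one row per sample and one column per feature.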
For this to be an optimum classifier (i.e., one that minimizes the probability of error), all priors must be equal, and the covariance matrices for all classes must be equal ($\Sigma_i = \Sigma$ for every $i$).
This assignment asks us to report the misclassification rate of each class and the total misclassification rate. To determine the misclassification rates, we compare the true class that each sample came from with the predicted class for each sample (determined by assigning each sample to the class whose discriminant function is largest). The misclassification rate for a class is the fraction of that class's samples assigned to any other class, and the total misclassification rate is the fraction of all samples that are misclassified.
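The comparison of true and predicted labels described above could be sketched as follows (NumPy assumed; `misclassification_rates` is a hypothetical helper name):

```python
import numpy as np

def misclassification_rates(y_true, y_pred, n_classes):
    """Return (per-class rates, total rate) from true vs. predicted labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    per_class = []
    for c in range(n_classes):
        mask = y_true == c                     # samples truly from class c
        per_class.append(float(np.mean(y_pred[mask] != c)))
    total = float(np.mean(y_pred != y_true))   # overall error rate
    return per_class, total

# Illustrative labels: one class-0 sample misclassified out of five total
per_class, total = misclassification_rates([0, 0, 0, 1, 1],
                                           [0, 1, 0, 1, 1], n_classes=2)
```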
The theoretical probability of error is often bounded using the Bhattacharyya error bound, which for two classes is defined as
\begin{equation}
P(\text{error}) \leq P(\omega_1)^{\beta} P(\omega_2)^{1-\beta}\, e^{-k(\beta)}
\end{equation}
where
\begin{equation}
k(\beta) = \frac{\beta(1-\beta)}{2} (\boldsymbol{\mu}_2 - \boldsymbol{\mu}_1)^T \left[ \beta \Sigma_1 + (1-\beta) \Sigma_2 \right]^{-1} (\boldsymbol{\mu}_2 - \boldsymbol{\mu}_1) + \frac{1}{2} \ln \frac{\left\lvert \beta \Sigma_1 + (1-\beta) \Sigma_2 \right\rvert}{\lvert \Sigma_1 \rvert^{\beta} \lvert \Sigma_2 \rvert^{1-\beta}}
\end{equation}
For the Bhattacharyya error bound, $\beta = 1/2$, which gives
\begin{equation}
P(\text{error}) \leq \sqrt{P(\omega_1) P(\omega_2)}\, e^{-k(1/2)}, \qquad k(1/2) = \frac{1}{8} (\boldsymbol{\mu}_2 - \boldsymbol{\mu}_1)^T \left[ \frac{\Sigma_1 + \Sigma_2}{2} \right]^{-1} (\boldsymbol{\mu}_2 - \boldsymbol{\mu}_1) + \frac{1}{2} \ln \frac{\left\lvert \frac{\Sigma_1 + \Sigma_2}{2} \right\rvert}{\sqrt{\lvert \Sigma_1 \rvert \lvert \Sigma_2 \rvert}}
\end{equation}
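A sketch of how the Bhattacharyya bound could be computed for two classes, assuming NumPy; the function name and the parameters in the usage example are illustrative:

```python
import numpy as np

def bhattacharyya_bound(mu1, sigma1, p1, mu2, sigma2, p2):
    """Bhattacharyya bound on the two-class Bayes error:
    P(error) <= sqrt(P1*P2) * exp(-k(1/2)), with
    k(1/2) = 1/8 (mu2-mu1)^T [(S1+S2)/2]^{-1} (mu2-mu1)
             + 1/2 ln( |(S1+S2)/2| / sqrt(|S1| |S2|) )."""
    mu1, mu2 = np.asarray(mu1, dtype=float), np.asarray(mu2, dtype=float)
    s1, s2 = np.asarray(sigma1, dtype=float), np.asarray(sigma2, dtype=float)
    s_avg = (s1 + s2) / 2.0
    diff = mu2 - mu1
    k = (diff @ np.linalg.inv(s_avg) @ diff / 8.0
         + 0.5 * np.log(np.linalg.det(s_avg)
                        / np.sqrt(np.linalg.det(s1) * np.linalg.det(s2))))
    return float(np.sqrt(p1 * p2) * np.exp(-k))
```

When the two class distributions coincide, $k(1/2) = 0$ and the bound reduces to $\sqrt{P(\omega_1) P(\omega_2)}$; it shrinks as the class means separate.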