This project applies the Gaussian Naive Bayes algorithm to classify individuals into gender categories based on their physical measurements. The project evaluates the classifier's accuracy using various feature sets and compares the effectiveness of different Gaussian Naive Bayes configurations.
- Implement the Gaussian Naive Bayes classifier.
- Analyze the impact of different features on classification accuracy.
- Assess the classifier's performance with cross-validation.
- Determine the effect of removing the 'Age' feature on accuracy.
- Programming Language: Python
- Libraries:
  - `numpy` for numerical operations
  - `math` for mathematical functions
- Training Data: Includes features like height, weight, age, and gender.
- Test Data: Consists of similar features without labels; used to evaluate the classifier.
- Script: `Project1_Q2(a)_Alishbah_Fahad.py`
- Processes raw data, converting gender labels from characters to binary classes (`1` for female, `0` for male).
- Script: `Project1_Q2(b)_Alishbah_Fahad.py`
- Implements the classifier, calculating probabilities using Gaussian distribution.
- Script: `Project1_Q2(c)_Alishbah_Fahad.py`
- Applies k-fold cross-validation to assess model performance.
- Script: `Project1_Q2(d)_Alishbah_Fahad.py`
- Analyzes the impact of removing the 'Age' feature on the classifier's accuracy.
- Document: `Project1_Q2(e)_Alishbah_Fahad.pdf`
- Discusses the findings, noting that the k-nearest neighbors (KNN) model generally outperforms Gaussian Naive Bayes unless the 'Age' feature is removed.
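
For readers who want a feel for the pipeline the scripts describe, here is a minimal sketch of the main steps: label encoding, a Gaussian Naive Bayes fit and predict, k-fold cross-validation, and a comparison with the 'Age' column dropped. It is not the code from the project scripts. The function names (`fit_gnb`, `predict_gnb`, `k_fold_accuracy`), the `"F"`/`"M"` label characters, the column order, and the tiny data set are all assumptions made up for illustration.

```python
import numpy as np

def fit_gnb(X, y):
    """Estimate the prior, per-feature mean, and per-feature variance of each class."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[int(c)] = {
            "prior": Xc.shape[0] / X.shape[0],
            "mean": Xc.mean(axis=0),
            "var": Xc.var(axis=0) + 1e-9,  # small constant guards against zero variance
        }
    return params

def predict_gnb(params, X):
    """Return the class with the highest log-posterior for each row of X."""
    classes = sorted(params)
    scores = []
    for c in classes:
        p = params[c]
        # Log of the Gaussian density, summed over features (the "naive" independence assumption).
        log_lik = -0.5 * (np.log(2 * np.pi * p["var"]) + (X - p["mean"]) ** 2 / p["var"])
        scores.append(np.log(p["prior"]) + log_lik.sum(axis=1))
    return np.array(classes)[np.argmax(np.vstack(scores), axis=0)]

def k_fold_accuracy(X, y, k=4, seed=0):
    """Mean accuracy over k shuffled folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accuracies = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        params = fit_gnb(X[train_idx], y[train_idx])
        accuracies.append(np.mean(predict_gnb(params, X[test_idx]) == y[test_idx]))
    return float(np.mean(accuracies))

# Made-up rows: height (cm), weight (kg), age (years). Not the project's data.
X = np.array([
    [160, 55, 30], [162, 58, 41], [158, 52, 25], [165, 60, 35],
    [180, 85, 28], [178, 80, 45], [183, 90, 33], [176, 78, 39],
], dtype=float)
labels = np.array(["F", "F", "F", "F", "M", "M", "M", "M"])  # assumed label characters

# Binary encoding as described above: 1 for female, 0 for male.
y = (labels == "F").astype(int)

acc_all = k_fold_accuracy(X, y)
acc_no_age = k_fold_accuracy(np.delete(X, 2, axis=1), y)  # drop the assumed 'Age' column
print("accuracy with all features:", acc_all)
print("accuracy without age:", acc_no_age)
```

Working in log space avoids numerical underflow when many small densities are multiplied, and the small constant added to the variance keeps the sketch stable if a feature happens to be constant within a class.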