This notebook covers the initialization of four common classification models, training on the iris dataset, evaluation of the models, and hyperparameter tuning to optimise three of those models.
The iris dataset is a small, classic classification dataset, perfect for beginner-friendly classification tasks. It contains 150 samples of iris flowers belonging to three species:
- Setosa
- Versicolor
- Virginica
Each sample has 4 features (all numerical and continuous):
| Feature | Description |
|---|---|
| sepal length (cm) | Length of the sepal |
| sepal width (cm) | Width of the sepal |
| petal length (cm) | Length of the petal |
| petal width (cm) | Width of the petal |
Each flower (row) is labeled with a target value:
0 = setosa, 1 = versicolor, 2 = virginica
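As a minimal sketch of loading the data (assuming scikit-learn is installed), the dataset ships with the library:

```python
from sklearn.datasets import load_iris

# Load the built-in iris dataset
iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)              # (150, 4) -- 150 samples, 4 features
print(iris.feature_names)   # ['sepal length (cm)', 'sepal width (cm)', ...]
print(iris.target_names)    # ['setosa' 'versicolor' 'virginica']
```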
The four classification models used (initialized in the sketch after this list) are:

- Random Forest: A random forest classifier builds multiple decision trees during training and combines their predictions to classify data. It is an ensemble method, meaning it leverages the collective predictions of multiple models (the decision trees) to make more accurate and reliable predictions than a single decision tree could achieve.
- Support Vector Machine (SVM): An SVM is a supervised machine learning algorithm used for classification and regression. It works by finding the optimal hyperplane that separates data points into different classes, maximizing the margin between them.
- K-Nearest Neighbors (KNN): KNN is a non-parametric, supervised learning classifier that uses proximity to make classifications or predictions about the grouping of an individual data point. It is one of the simplest and most popular supervised algorithms, used for both classification and regression.
- Naïve Bayes: A Naïve Bayes classifier calculates the probability of a given instance belonging to a particular class based on the probabilities of its features. It assumes that the presence or absence of each feature is independent of the other features, which simplifies the calculations.
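A minimal sketch of initializing these four classifiers with scikit-learn, mostly with default parameters. The `GaussianNB` choice for Naïve Bayes is an assumption that suits the continuous features, and `random_state=42` mirrors the values reported later in this notebook:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB

# One instance of each classifier; random_state fixed for reproducibility
models = {
    "Random Forest": RandomForestClassifier(random_state=42),
    "SVM": SVC(random_state=42),
    "KNN": KNeighborsClassifier(),
    "Naive Bayes": GaussianNB(),  # Gaussian variant: all features are continuous
}
```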
We saw in the notebook that all the models gave good results with the default parameters provided by scikit-learn, without any hyperparameter tuning. Since the iris dataset is small (150 records), all four models reached an accuracy of 97% on the test set, which had 30 samples.
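A sketch of the train-and-evaluate loop, assuming the `models` dict from the previous sketch; an 80/20 split is what leaves 30 of the 150 samples for testing, though the exact split settings here are assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# An 80/20 split keeps 30 of the 150 samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit each model and report its accuracy on the held-out test set
for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {acc:.2%}")
```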
- Hyperparameter tuning is the process of finding the optimal set of hyperparameters for a machine learning model. Hyperparameters are settings chosen before the learning process begins that influence how the model learns.
- To tune the Random Forest, SVM, and KNN models, we used `GridSearchCV` and `RandomizedSearchCV` to select the optimal hyperparameters for this dataset (a sketch of the setup appears after this list).
- For Random Forest, the best hyperparameters were: `RandomForestClassifier(n_estimators=150, random_state=42)`
- For SVM, the optimal hyperparameters were: `SVC(C=5.908361216819946, gamma='auto', kernel='linear', probability=True, random_state=42)`
- For KNN, the best hyperparameters were: `KNeighborsClassifier(metric='euclidean', n_neighbors=9, weights='distance')`

The difference hyperparameter tuning makes would be more visible on a large, noisy dataset. Since the iris dataset is small and not noisy, the default parameters and the optimized parameters yielded the same results.
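For reference, a hedged sketch of how the two kinds of search might be set up; the parameter grids and distributions below are illustrative assumptions, not the exact ones from the notebook:

```python
from scipy.stats import loguniform
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Exhaustive grid search over a small Random Forest grid
rf_grid = {"n_estimators": [50, 100, 150, 200], "max_depth": [None, 3, 5]}
rf_search = GridSearchCV(
    RandomForestClassifier(random_state=42), rf_grid, cv=5, scoring="accuracy"
)
rf_search.fit(X_train, y_train)
print(rf_search.best_estimator_)

# Randomized search for SVM: C sampled from a log-uniform distribution,
# which is how a non-round value like C=5.908... can be found
svm_dist = {
    "C": loguniform(1e-1, 1e2),
    "kernel": ["linear", "rbf"],
    "gamma": ["scale", "auto"],
}
svm_search = RandomizedSearchCV(
    SVC(probability=True, random_state=42), svm_dist,
    n_iter=20, cv=5, scoring="accuracy", random_state=42,
)
svm_search.fit(X_train, y_train)
print(svm_search.best_estimator_)
```

`GridSearchCV` tries every combination in the grid, which is feasible for small grids; `RandomizedSearchCV` samples a fixed number of candidates from distributions, which scales better when a hyperparameter like `C` is continuous.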