Skip to content

The project implements a complete machine learning pipeline to classify iris flower species using the classic Iris dataset. The workflow is built and tested on Amazon SageMaker Studio Lab, simulating an end-to-end ML project lifecycle from preprocessing to evaluation.

License

Notifications You must be signed in to change notification settings

akhil-b-26/iris_classification_sagemaker

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

3 Commits
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

๐ŸŒธ Iris Classification using Random Forest โ€“ SageMaker Studio Lab

This project implements a complete machine learning pipeline to classify iris flower species using the classic Iris dataset. The workflow is built and tested on Amazon SageMaker Studio Lab, simulating an end-to-end ML project lifecycle from preprocessing to evaluation.


๐Ÿ“Œ Project Overview

  • Objective: Predict the iris species (Setosa, Versicolor, Virginica) based on sepal and petal dimensions.
  • Algorithm Used: Random Forest Classifier
  • Platform: SageMaker Studio Lab (CPU instance)
  • Dataset: Iris dataset (150 samples, 4 features, 3 classes)

๐Ÿ”ง Tools & Libraries

  • Python 3
  • Scikit-learn
  • Pandas
  • NumPy
  • Matplotlib / Seaborn (optional for visualization)
  • SageMaker Studio Lab (environment)

๐Ÿงช Pipeline Steps

  1. Import Dataset

  2. Data Exploration & Visualization

  3. Preprocessing & Feature Selection

  4. Train-Test Split

  5. Model Training (Random Forest)

  6. Prediction & Evaluation

    • Accuracy
    • Confusion Matrix
    • Classification Report

๐ŸŽฏ Results

  • Model Used: Random Forest Classifier
  • Accuracy Achieved:ย 90%
  • Training Time: Under 1 second (due to small dataset)

๐Ÿ“ Folder Structure

 โ”œ๏ธ ๐Ÿ“„ iris_classifier_sagemaker.ipynb
 โ”œ๏ธ ๐Ÿ“„ README.md

๐Ÿš€ How to Run

  1. Open SageMaker Studio Lab
  2. Upload iris_classifier.ipynb
  3. Choose CPU runtime
  4. Run each cell to execute the pipeline

๐Ÿง Learning Outcomes

  • End-to-end pipeline construction using Scikit-learn
  • Using Random Forest for multi-class classification
  • Model evaluation techniques and interpretation
  • Hands-on with SageMaker Studio Lab (AWS-hosted Jupyter environment)

๐Ÿ”ฎ Future Work

  • Integrate with AWS SageMaker SDK for deployment
  • Compare with other models like SVM, XGBoost, KNN
  • Add interactive dashboard using Gradio or Streamlit
  • Perform hyperparameter tuning for improved accuracy
  • Expand to more complex, real-world datasets

About

The project implements a complete machine learning pipeline to classify iris flower species using the classic Iris dataset. The workflow is built and tested on Amazon SageMaker Studio Lab, simulating an end-to-end ML project lifecycle from preprocessing to evaluation.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published