WaterPurity: A water potability predictor

WaterPurity is a data science project that aims to predict the potability (drinkability) of water based on various chemical properties. This project follows an end-to-end machine learning pipeline, including data ingestion, preprocessing, model training, evaluation, and deployment.

Project Overview

The project consists of the following main components:

Data Ingestion: Retrieves the water quality dataset from a specified URL.
Data Validation: Validates the ingested data against a predefined schema.
Data Preprocessing: Performs necessary preprocessing steps on the validated data, such as handling missing values, scaling, and feature engineering.
Model Training: Trains multiple machine learning models (Random Forest, SVM, etc.) on the preprocessed data.
Model Evaluation: Evaluates the trained models using metrics like Mean Absolute Error (MAE), R-squared, and Root Mean Squared Error (RMSE). The evaluation results are logged using MLflow for monitoring purposes.
Web Application: A Flask-based web application that allows users to input water quality parameters and receive predictions on water potability.
CI/CD Pipeline: A GitHub Actions workflow that automates the build, testing, and deployment of the project to Amazon Elastic Container Registry (ECR) and an Amazon Elastic Compute Cloud (EC2) instance.
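
As a rough illustration of the validation, preprocessing, and evaluation components listed above, here is a minimal standard-library sketch. It is not the project's actual code: the column names (`ph`, `Hardness`, `Potability`), the `SCHEMA` dictionary, and the helper functions `validate_rows`, `impute_mean`, and `regression_metrics` are all hypothetical, and the real schema lives in `schema.yaml`. The sample rows and predictions are made up for the example.

```python
import math

# Hypothetical schema: column name -> expected Python type.
# The project's real column list is defined in schema.yaml and is not shown here.
SCHEMA = {"ph": float, "Hardness": float, "Potability": int}


def validate_rows(rows, schema):
    """Return True when every row has exactly the schema's columns and
    each non-missing value has the expected type."""
    for row in rows:
        if set(row) != set(schema):
            return False
        for column, expected in schema.items():
            value = row[column]
            if value is not None and not isinstance(value, expected):
                return False
    return True


def impute_mean(rows, column):
    """Replace missing (None) values in one column with that column's mean."""
    present = [r[column] for r in rows if r[column] is not None]
    mean = sum(present) / len(present)
    return [{**r, column: mean if r[column] is None else r[column]} for r in rows]


def regression_metrics(y_true, y_pred):
    """Compute MAE, RMSE, and R-squared for paired label/prediction lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Made-up sample rows; the second row has a missing pH reading.
rows = [
    {"ph": 7.0, "Hardness": 200.0, "Potability": 1},
    {"ph": None, "Hardness": 180.0, "Potability": 0},
    {"ph": 9.0, "Hardness": 220.0, "Potability": 1},
]

is_valid = validate_rows(rows, SCHEMA)
clean_rows = impute_mean(rows, "ph")

# Made-up labels and predictions, only to exercise the metric helper.
metrics = regression_metrics([1, 0, 1, 0], [1, 0, 0, 0])
print(is_valid, clean_rows[1]["ph"], metrics)
```

In the project itself, the resulting metrics would be logged to MLflow rather than printed. The sketch leaves that step out to stay dependency-free.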

Technologies Used

Tech Stack:

WaterPurity pipeline simplified schema

WaterPurity Cloud

Video Demo:

MLflow:

                                         __                                             __                
                                        /\ \__                                       __/\ \__             
                     __  __  __     __  \ \ ,_\    __   _ __   _____   __  __  _ __ /\_\ \ ,_\  __  __    
                    /\ \/\ \/\ \  /'__`\ \ \ \/  /'__`\/\`'__\/\ '__`\/\ \/\ \/\`'__\/\ \ \ \/ /\ \/\ \   
                    \ \ \_/ \_/ \/\ \L\.\_\ \ \_/\  __/\ \ \/ \ \ \L\ \ \ \_\ \ \ \/ \ \ \ \ \_\ \ \_\ \  
                     \ \___x___/'\ \__/.\_\\ \__\ \____\\ \_\  \ \ ,__/\ \____/\ \_\  \ \_\ \__\\/`____ \ 
                      \/__//__/   \/__/\/_/ \/__/\/____/ \/_/   \ \ \/  \/___/  \/_/   \/_/\/__/ `/___/  \
                                                                 \ \_\                              /\___/
                                                                  \/_/                              \/__/
