
This project harnesses the power of cutting-edge technologies including H2O AutoML, MLflow, FastAPI, and Streamlit to enhance cross-selling campaigns and boost efficiency.


tushar2704/Insurance-Cross-Sell


Insurance Cross-Sell Prediction Project

Tools: Python, MLflow, Streamlit, FastAPI, Microsoft Excel, Microsoft PowerPoint, Visual Studio Code, Windows Terminal, Pandas, GitHub

Welcome to the Insurance Cross-Sell Prediction Project, an end-to-end AutoML solution for predicting cross-sell interest among insurance customers. It combines H2O AutoML, MLflow, FastAPI, and Streamlit to make cross-selling campaigns more targeted and efficient.


Business Overview

Cross-selling in the insurance sector is all about offering complementary products to existing policyholders. It aims to provide customers with comprehensive protection at an attractive bundled cost while helping insurers increase their revenue through enhanced policy conversions. Our project focuses on identifying health insurance customers who are likely to be interested in purchasing additional vehicle insurance, making cross-sell campaigns more efficient and targeted.

Technical Insight

Traditionally, developing machine learning models for such tasks has been a time-consuming and resource-intensive process that demands significant technical expertise. However, with the advent of Automated Machine Learning (AutoML), the process has become streamlined and efficient. Beyond model development, there are various components in a production-ready ML system. This project demonstrates how to set up, train, and serve such a system using the following technologies:

  • H2O AutoML: To automate the model selection and tuning process.
  • MLflow: For tracking and managing machine learning experiments.
  • FastAPI: To deploy the best H2O model for real-time inference.
  • Streamlit: For creating an intuitive user interface to interact with the model.
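
As a rough illustration of how the first two tools can fit together, the sketch below trains H2O AutoML on a CSV file and logs the leading model to MLflow. It is not the project's train.py: the function names, the `Response` target column, and the parameter values are assumptions made for this example.

```python
from typing import List


def feature_columns(columns: List[str], target: str) -> List[str]:
    """Return every column name except the target (hypothetical helper)."""
    return [c for c in columns if c != target]


def train_automl(csv_path: str, target: str = "Response",
                 max_models: int = 10, seed: int = 42):
    """Train H2O AutoML and log the leader model to MLflow (sketch only)."""
    # Third-party imports stay inside the function so that the helper above
    # can be used without H2O (which also needs a Java runtime) installed.
    import h2o
    import mlflow
    import mlflow.h2o
    from h2o.automl import H2OAutoML

    h2o.init()
    frame = h2o.import_file(csv_path)
    # Treat the target as a class label so AutoML runs classification.
    frame[target] = frame[target].asfactor()
    features = feature_columns(frame.columns, target)

    with mlflow.start_run():
        aml = H2OAutoML(max_models=max_models, seed=seed)
        aml.train(x=features, y=target, training_frame=frame)
        mlflow.log_params({"max_models": max_models, "seed": seed})
        # AUC of the leader model on the training data.
        mlflow.log_metric("train_auc", aml.leader.auc())
        mlflow.h2o.log_model(aml.leader, "model")
    return aml.leader
```

Calling `train_automl("path/to/train.csv")` (a placeholder path) would start a local H2O cluster, so it is best run as a script rather than on import.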

Objective

Our main objective is to make cross-selling more efficient and targeted by building a predictive ML pipeline. The pipeline identifies health insurance customers who are likely to be interested in purchasing additional vehicle insurance, so that insurance companies can focus their cross-sell efforts on them.
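
One simple way to turn model output into a targeted campaign is to rank customers by predicted interest and contact only the top of the list. The sketch below assumes the model yields one probability per customer; the function name, the customer IDs, and the scores are made up for illustration.

```python
from typing import Dict, List


def select_targets(scores: Dict[str, float], top_k: int) -> List[str]:
    """Return the IDs of the top_k customers by predicted probability."""
    if top_k < 0:
        raise ValueError("top_k must be non-negative")
    # Sort by probability, highest first; ties keep their input order.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [customer_id for customer_id, _ in ranked[:top_k]]


# Illustrative scores only, not real model output.
example_scores = {"c1": 0.10, "c2": 0.90, "c3": 0.50}
campaign_list = select_targets(example_scores, top_k=2)
```

How many customers to contact is a business decision (for example, campaign budget), so `top_k` is left as a parameter here.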

Pipeline Components

  1. Data Acquisition and Preprocessing: Every ML project starts with data. We acquire the raw data and preprocess it so that it is ready for modeling.

  2. H2O AutoML Training with MLflow Tracking: We leverage H2O AutoML to automatically explore and build the best machine learning model, tracking each step with MLflow.

  3. Deployment of Best H2O Model via FastAPI: Once the best model has been selected, it is deployed with FastAPI, which exposes an API for real-time predictions.

  4. Streamlit User Interface: We create an intuitive Streamlit user interface that allows users to input data and receive predictions seamlessly.
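
The serving step (item 3) might be sketched as follows. This is not the project's main.py: the `/predict` route, the `build_app` and `score_records` names, and the `predict_fn` callable standing in for the H2O model are all assumptions made for this example.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def score_records(records: List[Record],
                  predict_fn: Callable[[Record], float]) -> List[Dict[str, float]]:
    """Apply a prediction function to each record (hypothetical helper)."""
    return [{"probability": float(predict_fn(record))} for record in records]


def build_app(predict_fn: Callable[[Record], float]):
    """Create a FastAPI app exposing a single prediction endpoint."""
    # Imported here so score_records can be used without FastAPI installed.
    from fastapi import Body, FastAPI

    app = FastAPI(title="Cross-sell inference (sketch)")

    @app.post("/predict")
    def predict(records: List[Record] = Body(...)):
        # The request body is a JSON list of customer records.
        return score_records(records, predict_fn)

    return app
```

In a real deployment, `predict_fn` would wrap the trained model's prediction call, and the app returned by `build_app` would be served by an ASGI server such as uvicorn. A Streamlit front end (item 4) could then send the user's input to the endpoint as JSON and display the returned probability.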

User Interface Demo

[Figure: UI Demo, a recording of the Streamlit interface; see the /demo folder]

Project Structure

  • /backend: Contains files for setting up the backend, including the H2O ML model and FastAPI instance.
      • /data: Houses raw data, processed data, and output data in JSON format.
      • /mlruns: Stores artifacts from ML training experiments.
      • /utils: Contains Python scripts with helper functions.
      • main.py: The script for selecting the best H2O model and deploying it as a FastAPI endpoint.
      • Dockerfile: Used to build the backend service as a Docker container.
      • train.py: The script for executing H2O AutoML training with MLflow tracking.
  • /frontend: Contains the frontend user interface (UI), built with Streamlit.
      • app.py: The script for the Streamlit web app, connected to the FastAPI endpoint for model inference.
      • Dockerfile: Used to build the frontend service as a Docker container.
  • /demo: Includes GIF and webm files showcasing the Streamlit UI demo.
  • /notebooks: Contains Jupyter notebooks for EDA, XGBoost baseline, and H2O AutoML experiments.
  • /submissions: Holds CSV files for Kaggle submission to retrieve model accuracy scores.


This project aims to make insurance cross-selling smarter and more efficient. We hope you find it informative and useful for your own machine learning work!
