Customer Churn Predictor: A machine learning model that predicts whether a customer is likely to churn or not.

Dhanush-Raj1/Customer-Churn-Project

Customer Churn Predictor

End-to-End Machine Learning Project: "Predicting Customer Churn in a Telecommunication Company"


📌 Churn Predictor

  • Customer churn rate (also known as attrition rate) is the percentage of customers who stop doing business with a company over a given period. It is a key metric for measuring customer retention and business performance.
  • This project uses machine learning algorithms to predict in advance which customers are likely to churn. By analyzing historical customer data and the factors that influence churn, the model helps businesses take preventive action to reduce it.
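
As a quick illustration of the definition above, the churn rate for a period is the number of customers lost divided by the number of customers at the start of the period. The function name and the figures below are illustrative and are not taken from the project.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Return the churn rate for a period as a percentage."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return 100.0 * customers_lost / customers_at_start


# Hypothetical figures: 1,000 customers at the start of the quarter, 50 lost.
print(churn_rate(1000, 50))  # 5.0
```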

🧱 Project Overview

  • Developed a machine learning model to predict whether a customer of a telecommunication company will churn.
  • Followed a modular structure for the entire project.
  • Utilized data of over 7000 records to train and develop the model.
  • Cleaned and preprocessed the raw data.
  • Performed feature transformation, scaled the numerical features and handled imbalance in the dataset.
  • Trained models using various ML algorithms and selected the one with the highest accuracy.
  • Deployed the model using a Flask web application for real-time predictions.
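
The overview does not say how the class imbalance was handled. One common option is random oversampling of the minority class, applied to the training split only. The sketch below shows that idea with the standard library; the function name, the `Churn` key, and the toy rows are assumptions for illustration, not the project's code.

```python
import random
from collections import Counter


def random_oversample(rows, label_key="Churn", seed=42):
    """Duplicate randomly chosen minority-class rows until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)

    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap to the majority-class size.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Toy training split: 6 customers stayed (0), 2 churned (1).
train_rows = [{"tenure": t, "Churn": 0} for t in (5, 12, 24, 36, 48, 60)]
train_rows += [{"tenure": t, "Churn": 1} for t in (1, 3)]

balanced_rows = random_oversample(train_rows)
print(Counter(row["Churn"] for row in balanced_rows))
```

Oversampling only the training split keeps duplicated rows out of the test set, so the evaluation still reflects the original class balance.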

📌 Project Workflow

1. Data Collection:

  • Used the company's historical data of over 7,000 records, which includes demographic details, subscribed services, and account information.
  • For each customer the following information is available:
    • Gender
    • Senior Citizen
    • Partner
    • Dependents
    • Tenure
    • Phone Service
    • Multiple Lines
    • Internet Service
    • Online Security
    • Online Backup
    • Device Protection
    • Tech Support
    • Streaming TV
    • Streaming Movies
    • Contract Type
    • Paperless Billing
    • Payment Method
    • Monthly Charges
    • Total Charges

2. Data Cleaning & Preprocessing:

  • Cleaned and preprocessed the raw data:
    • Handled missing values.
    • Removed duplicate records.
    • Removed outliers using z-scores to reduce the risk of overfitting.
    • Replaced boolean values with numerical values.
    • Binned the tenure column into 12-month ranges to make it easier to interpret.
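
The cleaning steps above could look roughly like this in pandas. This is a minimal sketch: the column names (`tenure`, `Partner`, `MonthlyCharges`, `TotalCharges`, `Churn`), the z-score threshold of 3, and the sample rows are assumptions, and the project's `data_cleaning.py` may differ.

```python
import numpy as np
import pandas as pd


def clean_churn_data(df: pd.DataFrame, z_threshold: float = 3.0) -> pd.DataFrame:
    """Apply the cleaning steps listed above to a raw customer table."""
    df = df.copy()

    # Coerce charges to numbers so unparseable entries become NaN, then drop
    # incomplete rows and exact duplicates.
    df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")
    df = df.dropna().drop_duplicates()

    # Keep rows whose numeric values lie within z_threshold standard deviations of the mean.
    numeric_cols = ["MonthlyCharges", "TotalCharges"]
    z_scores = (df[numeric_cols] - df[numeric_cols].mean()) / df[numeric_cols].std(ddof=0)
    df = df[(z_scores.abs() <= z_threshold).all(axis=1)].copy()

    # Replace Yes/No flags with 1/0.
    for col in ["Partner", "Churn"]:
        df[col] = df[col].map({"Yes": 1, "No": 0})

    # Bin tenure (in months) into 12-month ranges: 1-12, 13-24, ..., 61-72.
    edges = np.arange(0, 73, 12)
    labels = [f"{lo + 1}-{lo + 12}" for lo in edges[:-1]]
    df["tenure_group"] = pd.cut(df["tenure"], bins=edges, labels=labels)
    return df.reset_index(drop=True)


# Illustrative rows: one exact duplicate and one blank TotalCharges value.
raw = pd.DataFrame({
    "tenure": [1, 1, 14, 30, 72, 0],
    "Partner": ["Yes", "Yes", "No", "No", "Yes", "No"],
    "MonthlyCharges": [29.85, 29.85, 56.95, 53.85, 105.5, 20.0],
    "TotalCharges": ["29.85", "29.85", "797.3", "1615.5", "7596.0", " "],
    "Churn": ["No", "No", "Yes", "No", "No", "No"],
})
cleaned = clean_churn_data(raw)
print(cleaned[["tenure", "tenure_group", "Partner", "Churn"]])
```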

3. Exploratory Data Analysis and Feature Engineering:

  • Once the data was cleaned and preprocessed, I analyzed it to identify hidden patterns and relationships between features.

  • Performed both single-feature and cross-feature analysis to find relationships between features.

  • Analyzed and visualized each feature's values and value counts to gauge its overall importance.

  • Some of the major findings:

    • Around 16% of the customer base are senior citizens.
    • Customers who are more likely to churn have lower monthly and total charges.
    • Senior citizens have higher churn rates than non-senior customers.
    • The longer a customer stays with the business, the lower the chances of churning.
    • Customers with a tenure of up to 1 year are about equally likely to churn or stay.
    • Customers on month-to-month contracts have left the business more often.
  • Visualizations:

  • Distribution of tenure:

  • Imbalance in churn:

  • Monthly and Total Charges by churn:
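
The single-feature and cross-feature analysis described in this step can be sketched with a few pandas aggregations. The data below is a toy example with assumed column names; it does not reproduce the project's findings.

```python
import pandas as pd

# Toy data; column names and values are illustrative, not the project's dataset.
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "One year", "Two year",
                 "Month-to-month", "Two year", "One year", "Month-to-month"],
    "SeniorCitizen": [0, 1, 0, 0, 1, 0, 1, 0],
    "Churn": [1, 1, 0, 0, 1, 0, 0, 0],
})

# Single-feature view: share of each contract type in the customer base.
contract_share = df["Contract"].value_counts(normalize=True)

# Cross-feature view: churn rate within each contract type.
churn_by_contract = df.groupby("Contract")["Churn"].mean()

# Two-feature breakdown: churn rate by contract type and senior-citizen flag.
churn_table = pd.pivot_table(df, values="Churn", index="Contract",
                             columns="SeniorCitizen", aggfunc="mean")

print(contract_share, churn_by_contract, churn_table, sep="\n\n")
```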

4. Model Building:

  • Used different classification algorithms to train the model.
    • Logistic Regression
    • Naive Bayes
    • KNN Classifier
    • Decision Tree
    • Random Forest
    • AdaBoost Classifier
    • XGBoost Classifier
    • Support Vector Classifier
  • Performed hyperparameter tuning using GridSearchCV to improve model performance.
  • Evaluated the models with accuracy score, confusion matrix, precision, recall, and F1 score, and selected the model with the highest accuracy.
  • Of all the algorithms used, the XGBoost classifier had the highest accuracy, at 81%.
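
The tune-then-compare workflow above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions: it uses synthetic data, only two of the listed algorithms, and small made-up parameter grids. It leaves out XGBoost to avoid an extra dependency and does not reproduce the 81% result.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced stand-in for the preprocessed churn data.
X, y = make_classification(n_samples=600, n_features=10,
                           weights=[0.73, 0.27], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Candidate models with small, illustrative hyperparameter grids.
candidates = {
    "logistic_regression": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [3, 6]},
    ),
}

# Tune each candidate with cross-validated accuracy.
searches = {}
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, scoring="accuracy", cv=3)
    searches[name] = search.fit(X_train, y_train)

# Select the candidate with the best cross-validated accuracy.
best_name = max(searches, key=lambda name: searches[name].best_score_)
best_model = searches[best_name].best_estimator_

# Evaluate the selected model once on the held-out test split.
y_pred = best_model.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print(best_name, searches[best_name].best_params_)
print(f"test accuracy: {test_accuracy:.3f}")
print(cm)
print(classification_report(y_test, y_pred))
```

Because churn data is imbalanced, the precision and recall for the churn class in the classification report are worth reading alongside accuracy.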

5. Deployment:

  • Developed a Flask web application to deploy the model for real-time predictions.
  • Built both front-end and back-end components for the web app.
  • Created a custom website where users can enter customer data and receive predictions from the model.
  • Deployed the Flask app on a localhost server for easy access.
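
A prediction endpoint of this kind could look like the minimal sketch below. The route name, the form fields, and the `predict_churn` placeholder rule are assumptions for illustration. In the actual app, predictions would presumably come from the saved `model.pkl` and `preprocessor.pkl`, and the pages would be rendered from the HTML templates.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_churn(features: dict) -> int:
    """Placeholder for the trained model: flags short-tenure month-to-month customers."""
    return int(features["contract"] == "Month-to-month" and features["tenure"] < 12)


@app.route("/predict", methods=["POST"])
def predict():
    # Read and validate the submitted form fields.
    try:
        features = {
            "tenure": int(request.form["tenure"]),
            "contract": request.form["contract"],
            "monthly_charges": float(request.form["monthly_charges"]),
        }
    except (KeyError, ValueError):
        return jsonify(error="missing or invalid field"), 400
    return jsonify(churn=predict_churn(features))


# Exercise the endpoint without starting a server.
# Calling app.run() instead would serve it at http://127.0.0.1:5000/.
client = app.test_client()
response = client.post(
    "/predict",
    data={"tenure": "3", "contract": "Month-to-month", "monthly_charges": "70.5"},
)
print(response.get_json())
```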

🛠 Tech Stack

Technology    Description
Python        Programming language used
Flask         Web framework for UI and API integration
HTML & CSS    Frontend design and styling
Pandas        Cleaning and preprocessing the data
NumPy         Performing numerical operations
Matplotlib    Visualization of the data

📂 Project Structure

/📂Customer-Churn-Project
│── /📂artifacts                      # CSV and pickle files
│   ├── data_cleaned.csv
│   ├── test.csv
│   ├── train.csv
│   ├── model.pkl
│   ├── preprocessor.pkl
│── /📂Data
│   ├── data.csv                      # Raw data
│   ├── data_eda.csv                  # Cleaned, preprocessed data
│── /📂eda_images                     # Images of exploratory analysis
│   ├── tenure.png
│   ├── churn.png
│   ├── charges by churn.png
│── /📂notebook                       # Research ipynb notebook
│── /📂src                            # Source files (core files of the project)
│   ├── exception_handling.py         # Custom exception handling
│   ├── logger.py                     # Logging messages
│   ├── utils.py                      # Helper and utility functions
│   │── /📂components                 # Main component files
│   │   │── data_cleaning.py
│   │   │── data_ingestion.py
│   │   │── data_transformation.py
│   │── /📂pipelines                  # Pipeline files
│   │   │── predict_pipeline.py
│   │   │── train_pipeline.py
│── /📂static                         # Static folder
│   │── /📂css                        # CSS files
│   │   │── hp_style.css              # Home page styles
│   │   │── pp_style.css              # Predict page styles
│   │── /📂images                     # Website images
│── /📂templates                      # Templates (HTML files)
│   │── home_page.html
│   │── predict_page.html
│── .gitignore
│── README.md
│── app.py                            # Flask backend
│── requirements.txt                  # Python dependencies
│── setup.py                          # Setup

🚀 Installation & Setup

1️⃣ Clone the Repository

git clone https://github.com/Dhanush-Raj1/Customer-Churn-Project.git
cd Customer-Churn-Project

2️⃣ Create a Virtual Environment

conda create -p envi python=3.9 -y
conda activate ./envi   # Activate the environment created above

3️⃣ Install Dependencies

pip install -r requirements.txt

4️⃣ Run the Flask App

python app.py

The app will be available at: http://127.0.0.1:5000/


🌐 Usage Guide

1️⃣ Open the web app in your browser.
2️⃣ Click the Predict button on the home page of the web app.
3️⃣ Enter the customer details in the respective dropdowns.
4️⃣ Click the Predict button and the predicted result will appear.


📸 Screenshots

🟠 Home Page


🔵 Predict Page


🟢 Results


🎯 Future Enhancements

✅ Improve model accuracy with more advanced fine-tuning
✅ Real-time prediction system
✅ Automated retraining pipeline
✅ Improve the UI with a more interactive design
✅ Customer retention strategy recommender
✅ Anomaly detection for unexpected churn


🀝 Contributing

πŸ’‘ Contributions, issues, and pull requests are welcome! Feel free to open an issue or submit a PR to improve this project. πŸš€

πŸ“„ License

This project is licensed under the MIT License – see the LICENSE file for details.
