d2k-tech/datascience-assignment-level-1

Data Science Assignment: Credit Card Customer Analysis

Objective

Your task is to conduct a comprehensive analysis of credit card customers using the provided dataset. This assignment is designed to assess your proficiency in data manipulation, visualization, customer segmentation, predictive modeling, and your ability to use Git for version control.

Dataset

You will be working with the "Credit Card Customers" dataset available on Kaggle. This dataset includes information on customers' age, salary, marital status, credit card limit, credit card category, and more.

Dataset Link: Credit Card Customers

Please download the dataset directly from Kaggle and include it in your project repository in a dedicated data folder.

Tasks

Part 1: Data Exploration and Preprocessing

Data Understanding: Load the dataset and perform an initial exploration to understand its structure, identify missing values, and gather basic statistics.
Preprocessing: Clean the dataset by handling missing values, outliers, and any erroneous data points. Document your decisions.
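
As a starting point for Part 1, the following is a minimal sketch of an exploration and cleaning pass. It runs on a small made-up frame rather than the real file. The column names (`Customer_Age`, `Credit_Limit`, `Marital_Status`) are assumptions about the Kaggle CSV, and median imputation plus IQR clipping is only one of several defensible choices, so document whichever you pick.

```python
import numpy as np
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column dtype, missing-value count, and number of distinct values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
        "n_unique": df.nunique(),
    })


def clip_outliers_iqr(s: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return s.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate rows, impute missing values, and clip numeric outliers."""
    out = df.drop_duplicates().copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
            out[col] = clip_outliers_iqr(out[col])
        else:
            # Assumption: a literal "Unknown" label is acceptable for categories.
            out[col] = out[col].fillna("Unknown")
    return out


# Made-up rows for illustration only; column names are assumed, not confirmed.
toy = pd.DataFrame({
    "Customer_Age": [45, 38, np.nan, 51, 40, 120],
    "Credit_Limit": [12000.0, 3500.0, 8000.0, np.nan, 4200.0, 5000.0],
    "Marital_Status": ["Married", "Single", None, "Married", "Single", "Married"],
})

print(summarize(toy))
clean = preprocess(toy)
print(clean)
```

On the real data, `toy` would be replaced by something like `pd.read_csv(...)` pointed at the file in your data folder. The file name depends on what you download, so it is left out here.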

Part 2: Data Analysis and Visualization

Customer Demographics Analysis: Analyze the demographics of the credit card holders (e.g., age, salary, marital status) and visualize the distributions.
Credit Usage Analysis: Explore how different demographics correlate with credit card limit, balance, and category. Identify any interesting patterns.
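
For Part 2, one possible way to relate demographics to credit usage is with grouped summaries and cross-tabulations, sketched below on made-up rows. The column names (`Customer_Age`, `Marital_Status`, `Card_Category`, `Credit_Limit`) and the age-band edges are illustrative assumptions, not taken from the assignment.

```python
import pandas as pd


def credit_by_group(df: pd.DataFrame, group_col: str,
                    value_col: str = "Credit_Limit") -> pd.DataFrame:
    """Count, mean, and median of value_col per group, highest mean first."""
    return (
        df.groupby(group_col)[value_col]
        .agg(["count", "mean", "median"])
        .sort_values("mean", ascending=False)
    )


def age_bands(df: pd.DataFrame, age_col: str = "Customer_Age") -> pd.Series:
    """Bucket ages into right-inclusive bands; the edges are arbitrary choices."""
    bins = [0, 30, 40, 50, 60, 120]
    labels = ["<=30", "31-40", "41-50", "51-60", "61+"]
    return pd.cut(df[age_col], bins=bins, labels=labels)


# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Customer_Age": [28, 35, 44, 52, 61, 47],
    "Marital_Status": ["Single", "Married", "Married", "Single", "Married", "Single"],
    "Card_Category": ["Blue", "Blue", "Silver", "Blue", "Gold", "Blue"],
    "Credit_Limit": [2000.0, 5000.0, 12000.0, 4000.0, 30000.0, 6000.0],
})

# Credit limit summary per marital status.
by_marital = credit_by_group(toy, "Marital_Status")

# Share of each card category within each age band (each row sums to 1).
card_mix = pd.crosstab(age_bands(toy), toy["Card_Category"], normalize="index")

# Pearson correlation between age and credit limit.
age_limit_corr = toy["Customer_Age"].corr(toy["Credit_Limit"])

print(by_marital)
print(card_mix)
print(round(age_limit_corr, 3))
```

If matplotlib is installed, tables like `by_marital["mean"]` can be drawn with pandas' built-in `.plot(kind="bar")`, and a numeric column with `.hist()`.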

Part 3: Customer Segmentation

Segmentation Model: Use clustering techniques (e.g., K-Means) to segment the customers based on their credit card usage and demographic data. Determine the optimal number of clusters.
Segment Analysis: Analyze each customer segment to identify unique behaviors and characteristics. Provide actionable insights for targeted marketing strategies.
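
For Part 3, the sketch below shows one common way to pick the number of clusters: fit K-Means for several values of k on standardized features and keep the k with the highest silhouette score. The data is synthetic, and the two feature names (`Credit_Limit`, `Total_Trans_Amt`) are assumptions about the dataset. The elbow method on inertia is an equally reasonable alternative.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def choose_k(X, k_values=range(2, 7), seed=0):
    """Return the k with the highest silhouette score, plus all scores."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Synthetic customers drawn from three made-up groups, for illustration only.
rng = np.random.default_rng(0)
groups = [  # ((mean limit, sd), (mean spend, sd))
    ((3000, 300), (1500, 150)),
    ((12000, 500), (4500, 300)),
    ((30000, 800), (2000, 200)),
]
frames = []
for (lim_mu, lim_sd), (amt_mu, amt_sd) in groups:
    frames.append(pd.DataFrame({
        "Credit_Limit": rng.normal(lim_mu, lim_sd, 50),
        "Total_Trans_Amt": rng.normal(amt_mu, amt_sd, 50),
    }))
toy = pd.concat(frames, ignore_index=True)

# K-Means is distance based, so features are standardized first.
X = StandardScaler().fit_transform(toy)
best_k, scores = choose_k(X)

# Refit with the chosen k and profile each segment by its feature means.
toy["segment"] = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)
profile = toy.groupby("segment").mean()

print(best_k)
print(profile)
```

The per-segment `profile` table is the natural input to the segment analysis: differences in mean limit and spend between segments are what the marketing recommendations would be built on.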

Part 4: Predictive Modeling

Churn Prediction: Build a predictive model to forecast customer churn based on the features available in the dataset. Experiment with at least two different algorithms and compare their performance.
Model Evaluation: Assess the models using appropriate performance metrics. Discuss the strengths and weaknesses of each model.
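
For Part 4, a comparison of two algorithms might be structured as in the sketch below. It uses `make_classification` to generate a synthetic, imbalanced stand-in for the churn data. The class ratio, the choice of logistic regression and random forest, and the metrics are illustrative assumptions rather than requirements. On the real dataset, `X` and `y` would come from your preprocessed features and the churn label column.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the churn data; the class imbalance is arbitrary.
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.84, 0.16], random_state=0,
)

# Stratify so train and test keep the same churn rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "logistic_regression": make_pipeline(
        StandardScaler(),
        LogisticRegression(max_iter=1000, class_weight="balanced"),
    ),
    "random_forest": RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=0
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]  # probability of the churn class
    pred = model.predict(X_test)
    results[name] = {
        "roc_auc": roc_auc_score(y_test, proba),
        "f1": f1_score(y_test, pred),
        "recall": recall_score(y_test, pred),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Plain accuracy is reported less often for churn because the classes are usually imbalanced. A model that predicts "no churn" for everyone can score high accuracy while catching no churners, which is why ROC AUC, F1, and recall are used here.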

Submission Guidelines

Create a new GitHub repository specifically for this assignment. 
Make sure the repository is public to allow evaluation.
Include a README.md file that provides an overview of your project, how to run your code, and a summary of your findings and insights.
Organize your repository with clear directories for the dataset, scripts/notebooks, and any additional resources used or created.
Commit your changes with clear and descriptive messages. Demonstrate effective use of version control throughout the project.
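
One possible repository layout can be scaffolded with a few lines of Python, as sketched below. The directory names (`data`, `notebooks`, `scripts`, `reports`) and the README headings are hypothetical suggestions, not names the assignment mandates. The example writes into a temporary directory so it is safe to run anywhere.

```python
import tempfile
from pathlib import Path


def scaffold(root: Path) -> list[str]:
    """Create a suggested project skeleton under root and list its entries."""
    for name in ("data", "notebooks", "scripts", "reports"):
        (root / name).mkdir(parents=True, exist_ok=True)
    readme = root / "README.md"
    if not readme.exists():
        readme.write_text(
            "# Credit Card Customer Analysis\n\n"
            "## Overview\n\n## How to run\n\n## Findings\n",
            encoding="utf-8",
        )
    return sorted(p.name for p in root.iterdir())


# Demonstrate in a throwaway directory; point scaffold() at your repo instead.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold(Path(tmp))

print(created)
```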

Evaluation Criteria

Problem Solving: Your approach to preprocessing, analyzing, and modeling the data.
Code Quality: Readability, structure, and documentation of your code.
Git Usage: Frequency and clarity of commits, including branching and merging practices.
Insights and Recommendations: Depth of the insights drawn from the analysis and the practicality of your recommendations.
Model Performance: Accuracy and robustness of your predictive models.
