- Project Overview
- Data Summary
- Exploratory Data Analysis
- Unsupervised Machine Learning Algorithms
- Results and Insights
- Conclusion
Customer segmentation is the practice of grouping customers by shared characteristics. This analysis uses unsupervised machine learning to group customers based on behavioral and demographic data.
The insights help businesses:
- Target specific customer groups effectively.
- Enhance marketing strategies tailored to customer preferences.
- Boost customer satisfaction and loyalty.
The dataset comprises 2,000 customer records from an FMCG store, captured through loyalty card transactions. It includes 8 key features providing demographic and behavioral insights.
Feature | Description | Values |
---|---|---|
ID | Unique identifier for each customer. | Alphanumeric |
Sex | Gender of the customer. | 0: Male, 1: Female |
Marital Status | Marital status of the customer. | 0: Single, 1: Non-Single |
Age | Age of the customer. | Integer (Years) |
Education | Highest level of education attained. | 0: Other, 1: High School, 2: University, 3: Graduate School |
Income | Self-reported annual income (USD). | Integer (e.g., 25000, 50000, 75000) |
Occupation | Customer’s job category. | 0: Unemployed, 1: Skilled, 2: Management/Self-Employed |
Settlement Size | Type of city the customer resides in. | 0: Small, 1: Mid-Sized, 2: Big City |
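
The integer codes in the table are easier to read in plots and summaries once they are mapped back to their labels. The snippet below is a minimal sketch of that step with pandas. The mappings come from the table above, while the helper name `decode_categories` and the three sample rows are made up for illustration and are not taken from the real dataset.

```python
import pandas as pd

# Code-to-label mappings copied from the feature table above.
LABELS = {
    "Sex": {0: "Male", 1: "Female"},
    "Marital Status": {0: "Single", 1: "Non-Single"},
    "Education": {0: "Other", 1: "High School", 2: "University", 3: "Graduate School"},
    "Occupation": {0: "Unemployed", 1: "Skilled", 2: "Management/Self-Employed"},
    "Settlement Size": {0: "Small", 1: "Mid-Sized", 2: "Big City"},
}


def decode_categories(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with integer codes replaced by readable labels."""
    out = df.copy()
    for column, mapping in LABELS.items():
        if column in out.columns:
            out[column] = out[column].map(mapping)
    return out


# Three invented rows for illustration only (not real customer records).
sample = pd.DataFrame({
    "ID": ["C001", "C002", "C003"],
    "Sex": [0, 1, 1],
    "Marital Status": [0, 1, 0],
    "Age": [34, 52, 27],
    "Education": [1, 3, 2],
    "Income": [42000, 118000, 56000],
    "Occupation": [1, 2, 0],
    "Settlement Size": [0, 2, 1],
})

decoded = decode_categories(sample)
print(decoded)
```

Keeping the original integer columns in `sample` untouched is deliberate: the clustering algorithms need numeric input, so the decoded copy is meant for reporting only.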
1️⃣ Age Distribution:
- Visualized the age spread of customers across different regions.
2️⃣ Income Patterns:
- Analyzed income disparities based on education and occupation.
3️⃣ Gender Segmentation:
- Examined purchasing patterns for male and female customers.
4️⃣ Regional Insights:
- Compared settlement sizes to identify trends in small, mid-sized, and big cities.
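
The comparisons above can be sketched with a few pandas group-by operations. This is an illustrative example rather than the exact analysis code: the column names follow the feature table, and the eight rows are invented placeholder values.

```python
import pandas as pd

# Invented placeholder rows using the column names from the feature table.
customers = pd.DataFrame({
    "Age": [25, 32, 47, 51, 38, 29, 60, 44],
    "Income": [30000, 45000, 80000, 95000, 60000, 35000, 120000, 70000],
    "Education": [1, 1, 2, 3, 2, 1, 3, 2],
    "Occupation": [0, 1, 1, 2, 1, 0, 2, 1],
    "Settlement Size": [0, 0, 1, 2, 1, 0, 2, 1],
    "Sex": [0, 1, 0, 1, 1, 0, 0, 1],
})

# Age spread per settlement size (count, mean, quartiles, ...).
age_by_settlement = customers.groupby("Settlement Size")["Age"].describe()

# Median income per education level and per occupation.
income_by_education = customers.groupby("Education")["Income"].median()
income_by_occupation = customers.groupby("Occupation")["Income"].median()

# Share of each sex within every settlement size (each row sums to 1).
sex_share = pd.crosstab(
    customers["Settlement Size"], customers["Sex"], normalize="index"
)

print(age_by_settlement)
print(income_by_education)
print(income_by_occupation)
print(sex_share)
```

The median is used instead of the mean because income distributions are often skewed, though either summary would work for a first look.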
🔹 Hierarchical Clustering:
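- Builds a tree of nested clusters (a dendrogram) by successively merging the most similar customers, which reveals natural groupings.
- Shows how closely related the resulting customer segments are and helps in choosing the number of clusters.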
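
A minimal sketch of hierarchical clustering with SciPy is shown below. It assumes Ward linkage on standardized features, which is a common choice but not necessarily the one used in this project, and it runs on synthetic (Age, Income) values rather than the real dataset.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)

# Synthetic (Age, Income) values for two well-separated groups.
young_low = np.column_stack([rng.normal(28, 3, 20), rng.normal(30000, 4000, 20)])
older_high = np.column_stack([rng.normal(52, 3, 20), rng.normal(110000, 8000, 20)])
X = np.vstack([young_low, older_high])

# Standardize so that Income does not dominate the distance computation.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# Build the merge tree; each row of Z records one merge step.
Z = linkage(X_scaled, method="ward")

# Cut the tree so that at most two flat clusters remain.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

With matplotlib installed, `scipy.cluster.hierarchy.dendrogram(Z)` draws the tree, and the height of the longest vertical gaps is a common visual cue for where to cut it.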
🔹 K-Means Clustering:
- Partition-based algorithm for segmenting customers into ‘K’ distinct clusters.
- Ideal for identifying prominent customer groups with similar spending habits.
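
The sketch below shows one way to run K-Means with scikit-learn and to compare the inertia (within-cluster sum of squares) across several values of K, often called the elbow method. The data are synthetic (Age, Income) values, and the choice of K = 3 is an assumption made for this example, not a result from the real dataset.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic (Age, Income) values around three invented group centres.
centres = [(25, 30_000), (40, 70_000), (58, 120_000)]
X = np.vstack([
    np.column_stack([rng.normal(age, 2, 30), rng.normal(income, 3_000, 30)])
    for age, income in centres
])

# K-Means relies on Euclidean distance, so put the features on one scale.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Elbow method: record the inertia for K = 1..5.
inertias = {}
for k in range(1, 6):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    inertias[k] = model.inertia_

# Fit the final model with the K assumed for this example.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_scaled)
labels = kmeans.labels_

# Convert the centroids back to the original units for interpretation.
centroids = scaler.inverse_transform(kmeans.cluster_centers_)
print(inertias)
print(centroids)
```

Plotting `inertias` against K and looking for the point where the curve flattens gives a rough guide to a reasonable number of clusters.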
🔹 PCA (Principal Component Analysis):
- Reduces the number of dimensions while retaining as much of the variance as possible.
- Simplifies data visualization and can improve clustering by removing correlated or noisy dimensions.
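
As a rough illustration, the snippet below standardizes a feature matrix and projects it onto two principal components with scikit-learn. The seven-column matrix is randomly generated to stand in for the numeric customer features, and keeping two components is an assumption made so the result can be drawn as a scatter plot.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)

# Synthetic stand-in for the 7 numeric features: two hidden factors
# mixed into seven correlated columns, plus a little noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 7))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 7))

# PCA is sensitive to scale, so standardize each column first.
X_scaled = StandardScaler().fit_transform(X)

# Keep the two directions that capture the most variance.
pca = PCA(n_components=2)
components = pca.fit_transform(X_scaled)

# Fraction of the total variance captured by each kept component.
explained = pca.explained_variance_ratio_
print(components.shape)
print(explained, explained.sum())
```

The `components` array can then be passed to a clustering algorithm or plotted directly, with `explained.sum()` indicating how much information the two-dimensional view preserves.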
1. Key Clusters Identified:
- High-income customers in big cities with graduate-level education.
- Budget-conscious customers from small cities.
- Mid-income customers in mid-sized cities.
2. Demographic Breakdown:
- The majority of customers are aged between 25 and 45.
- Women were more highly represented in the high-income segment.
3. Regional Insights:
- Big cities account for 60% of high-value purchases, while small cities show a preference for budget-friendly options.
4. Education and Spending Patterns:
- Customers with a graduate-school education showed higher annual spending, while high-school graduates leaned toward cost-efficient products.
The segmentation revealed actionable insights for tailoring marketing strategies, such as:
- Launching premium product campaigns for high-income clusters.
- Promoting value-based offerings in small city segments.
- Designing personalized loyalty programs for mid-income customers.
These findings can help businesses pursue customer-centric growth strategies and support sustainable profitability.
- 📩 Email: kshitijachilbule5@gmail.com
- 👩‍💻 GitHub: https://github.com/itskshitija
- 📶 LinkedIn: https://www.linkedin.com/in/kshitija-chilbule-b98515309/
- 🌐 Portfolio: https://itskshitija.github.io/My-Portfolio/