thecodedcoder/customer-segmentation-campaign-analysis

Customer Segmentation & Campaign Analysis

An RFM (Recency, Frequency, Monetary) segmentation study on 3,745 customers from the UCI Online Retail II dataset, layered with marketing channel performance analysis to identify where budget should go.

Key Findings

| KPI | Value |
|---|---|
| Customers | 3,745 |
| Total Revenue | $882K |
| Avg Order Value | $70 |
| Total Invoices | 12,580 |
| Segments | 8 |
| Channels | 5 |
| Period | Dec 2010 -- Dec 2011 |

1. Champions Are the Business

Champions make up 20.8% of customers but generate 61.3% of all revenue ($540K). Losing just 10% of Champions (about $54K) would cost more than the entire Dormant segment, the largest by headcount, generates ($31K).

2. 922 Dormant Customers Represent a Win-Back Opportunity

Dormant is the largest segment by headcount (922 customers, 24.6%) but contributes only $31K in revenue. These customers have bought before but haven't returned -- a targeted re-engagement campaign that recovers 15% of the segment's historical revenue would bring back an estimated $4,600.

3. Email Is the Highest-Revenue Channel

| Channel | Customers | Revenue | Avg Spend/Customer | Cost/Customer | Avg ROI |
|---|---|---|---|---|---|
| Email | 1,199 | $287,133 | $240 | $2.50 | 9,479% |
| Paid Social | 739 | $186,923 | $253 | $18.00 | 1,305% |
| Referral | 619 | $178,385 | $288 | $7.50 | 3,742% |
| Organic Search | 827 | $136,062 | $165 | $5.00 | 3,191% |
| Direct | 361 | $93,486 | $259 | $1.00 | 25,797% |

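Email dominates Champion acquisition. Paid Social disproportionately brings in Dormant customers and has the highest cost per customer ($18.00), which together explain why its ROI is the lowest of the five channels despite the second-highest revenue. Channel assignments are simulated in marketing_analysis.py, since the source dataset has no acquisition-channel column.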

4. Full Segment Breakdown

| Segment | Customers | Avg Recency | Avg Orders | Avg Spend | Total Revenue |
|---|---|---|---|---|---|
| Champions | 779 | 16 days | 8.9 | $694 | $540,322 |
| Needs Attention | 432 | 169 days | 2.0 | $312 | $134,864 |
| Loyal Customers | 706 | 33 days | 3.0 | $135 | $94,928 |
| At Risk | 169 | 126 days | 4.2 | $302 | $51,022 |
| Dormant | 922 | 193 days | 1.1 | $34 | $30,893 |
| Potential Loyalists | 572 | 38 days | 1.4 | $48 | $27,545 |
| Lost | 132 | 272 days | 1.0 | $15 | $1,921 |
| New Customers | 33 | 28 days | 1.0 | $15 | $495 |
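The headline percentages in findings 1 and 2 follow directly from this breakdown. As a quick check, the sketch below recomputes each segment's share of customers and revenue from the table figures; the helper name `shares` is illustrative only.

```python
# Segment figures copied from the breakdown table above:
# (customers, total revenue in $).
segments = {
    "Champions": (779, 540_322),
    "Needs Attention": (432, 134_864),
    "Loyal Customers": (706, 94_928),
    "At Risk": (169, 51_022),
    "Dormant": (922, 30_893),
    "Potential Loyalists": (572, 27_545),
    "Lost": (132, 1_921),
    "New Customers": (33, 495),
}

total_customers = sum(c for c, _ in segments.values())
total_revenue = sum(r for _, r in segments.values())


def shares(name):
    """Return (customer share %, revenue share %) for one segment, to 1 dp."""
    customers, revenue = segments[name]
    return (
        round(100 * customers / total_customers, 1),
        round(100 * revenue / total_revenue, 1),
    )


for name in segments:
    cust_pct, rev_pct = shares(name)
    print(f"{name:<20} {cust_pct:>5.1f}% of customers  {rev_pct:>5.1f}% of revenue")
```

The segment rows sum to 3,745 customers and $881,990, which matches the KPI strip after rounding to $882K.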

RFM Methodology

Each customer receives three independent scores from 1 to 5:

| Dimension | What It Measures | Score Logic |
|---|---|---|
| Recency | Days since last purchase | Score 5 = bought recently, Score 1 = long time ago |
| Frequency | Number of unique orders placed | Score 5 = orders frequently, Score 1 = ordered once |
| Monetary | Total money spent | Score 5 = highest spenders, Score 1 = lowest spenders |

The three scores are summed into a combined RFM score (3--15), which determines which of the 8 behavioural segments a customer belongs to.
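One common way to produce 1-to-5 scores like these is quintile binning with pandas. The sketch below is an assumption about how such scoring could be done, not a copy of marketing_analysis.py: the column names (`recency_days`, `frequency`, `monetary`), the function name `score_rfm`, and the tie-breaking rule are all hypothetical, and the mapping from combined score to the 8 segments is left out because the thresholds are not stated here.

```python
import pandas as pd


def score_rfm(rfm: pd.DataFrame) -> pd.DataFrame:
    """Add 1-5 quintile scores and their sum to a table of raw RFM values.

    Expects columns `recency_days`, `frequency`, `monetary` (hypothetical names).
    """
    scored = rfm.copy()
    labels = [1, 2, 3, 4, 5]
    # Rank first so that ties (e.g. many one-order customers) do not create
    # duplicate bin edges, which would make pd.qcut raise an error.
    # Recency is ranked in descending order: fewer days -> higher score.
    scored["r_score"] = pd.qcut(
        scored["recency_days"].rank(method="first", ascending=False),
        5,
        labels=labels,
    ).astype(int)
    scored["f_score"] = pd.qcut(
        scored["frequency"].rank(method="first"), 5, labels=labels
    ).astype(int)
    scored["m_score"] = pd.qcut(
        scored["monetary"].rank(method="first"), 5, labels=labels
    ).astype(int)
    scored["rfm_score"] = scored[["r_score", "f_score", "m_score"]].sum(axis=1)
    return scored


# Made-up demo values for ten customers, for illustration only.
demo = pd.DataFrame(
    {
        "recency_days": [5, 12, 20, 33, 47, 90, 126, 169, 193, 272],
        "frequency": [9, 7, 5, 4, 3, 3, 2, 2, 1, 1],
        "monetary": [900, 650, 400, 310, 220, 180, 120, 60, 34, 15],
    },
    index=pd.Index(range(1, 11), name="customer_id"),
)
scored = score_rfm(demo)
print(scored)
```

Ranking before binning forces equal-sized quintiles, so customers with identical raw values can land in different score bands. Binning on fixed thresholds instead would keep, for example, every one-order customer at a Frequency score of 1, at the cost of uneven group sizes.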

Strategic Recommendations

  1. Protect Champions at All Costs -- Build a dedicated loyalty programme with early product access and personalised email communication for this group. Losing 10% of Champions costs more revenue (about $54K) than losing 50% of Dormant customers (about $15K).
  2. Launch a Win-Back Campaign for Dormant Customers -- A targeted re-engagement email sequence with personalised discounts and past-purchase recommendations could convert 15% back into active buyers. At the segment's historical spend, that would recover ~$4,600 (15% of $30,893).
  3. Reduce Paid Social Spend, Invest in Referral -- Paid Social costs $18/customer, the highest of any channel, and skews toward Dormant customers. Referral costs $7.50 and delivers the highest average spend per customer ($288). Reallocating 30% of the Paid Social budget to Referral would move that spend from the lowest-ROI channel (1,305%) to one returning 3,742%.
  4. Nurture Potential Loyalists Early -- 572 customers have bought recently but infrequently. A triggered email sequence after their second purchase with a loyalty incentive could move them into the Loyal segment within 90 days.
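The figures behind recommendations 1 to 3 can be checked with back-of-envelope arithmetic on the segment and channel tables. The sketch below rests on simplifying assumptions that are not stated in the analysis itself: churned or reactivated customers carry their segment's average revenue, the Paid Social budget equals customers times cost per customer, and Referral would keep returning the same revenue per dollar at higher spend. The last assumption is optimistic, so the reallocation result is an upper bound rather than a forecast.

```python
# Figures copied from the segment and channel tables in this README.
champions_revenue = 540_322
dormant_revenue = 30_893
paid_social = {"customers": 739, "revenue": 186_923, "cost_per_customer": 18.00}
referral = {"customers": 619, "revenue": 178_385, "cost_per_customer": 7.50}

# Recommendation 1: revenue lost if a share of each segment churns,
# assuming churned customers carry the segment's average revenue.
loss_10pct_champions = 0.10 * champions_revenue
loss_50pct_dormant = 0.50 * dormant_revenue

# Recommendation 2: value of winning back 15% of Dormant revenue.
win_back = 0.15 * dormant_revenue


# Recommendation 3: move 30% of Paid Social spend to Referral,
# assuming each channel's revenue per dollar stays constant (optimistic).
def revenue_per_dollar(channel):
    """Revenue generated per dollar of acquisition spend for one channel."""
    spend = channel["customers"] * channel["cost_per_customer"]
    return channel["revenue"] / spend


moved_budget = 0.30 * paid_social["customers"] * paid_social["cost_per_customer"]
revenue_given_up = moved_budget * revenue_per_dollar(paid_social)
revenue_gained = moved_budget * revenue_per_dollar(referral)

print(f"Lose 10% of Champions: ${loss_10pct_champions:,.0f}")
print(f"Lose 50% of Dormant:   ${loss_50pct_dormant:,.0f}")
print(f"Win back 15% Dormant:  ${win_back:,.0f}")
print(f"Budget moved:          ${moved_budget:,.2f}")
print(f"Revenue given up:      ${revenue_given_up:,.0f}")
print(f"Revenue gained:        ${revenue_gained:,.0f}")
```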

Project Structure

.
├── marketing_analysis.py                # RFM segmentation + channel performance analysis
├── retail_sample.csv                    # Sampled dataset (from UCI Online Retail II)
├── marketing_dashboard.html             # Interactive HTML dashboard with Chart.js
├── marketing_analytics_portfolio.pdf    # 9-page PDF report with charts and findings
└── README.md

File Descriptions

  • marketing_analysis.py -- End-to-end pipeline: data cleaning, RFM score calculation, customer segmentation into 8 groups, simulated campaign channel assignment, ROI computation, and 5 Matplotlib visualisations (donut chart, revenue bars, channel performance, RFM scatter plot, channel mix).
  • retail_sample.csv -- Sampled subset of the UCI Online Retail II dataset with columns: Invoice, StockCode, Description, Quantity, InvoiceDate, Price, Customer ID, Country.
  • marketing_dashboard.html -- Self-contained interactive dashboard featuring: KPI strip, segment donut chart, revenue by segment bar chart, full segment breakdown table, channel revenue vs ROI, channel mix by segment, channel performance table, and 4 strategic recommendations.
  • marketing_analytics_portfolio.pdf -- 9-page PDF report with executive summary, RFM methodology, segmentation results, customer behaviour map, channel performance analysis, and strategic recommendations.
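For readers who want to see how the columns in retail_sample.csv map onto raw RFM values, here is a minimal sketch using pandas. It is not the code in marketing_analysis.py: the function name `build_rfm`, the cleaning rule (dropping rows without a Customer ID and rows with non-positive Quantity or Price), and the choice of reference date are assumptions made for illustration, and the demo rows are made up.

```python
import pandas as pd


def build_rfm(transactions: pd.DataFrame, snapshot_date=None) -> pd.DataFrame:
    """Aggregate line-item transactions into raw RFM values per customer."""
    tx = transactions.dropna(subset=["Customer ID"]).copy()
    tx["InvoiceDate"] = pd.to_datetime(tx["InvoiceDate"])
    # Assumed cleaning rule: keep only positive quantities and prices,
    # which drops returns and cancellations.
    tx = tx[(tx["Quantity"] > 0) & (tx["Price"] > 0)]
    tx["LineTotal"] = tx["Quantity"] * tx["Price"]
    if snapshot_date is None:
        # Assumed reference date: one day after the last kept transaction.
        snapshot_date = tx["InvoiceDate"].max() + pd.Timedelta(days=1)
    rfm = tx.groupby("Customer ID").agg(
        last_purchase=("InvoiceDate", "max"),
        frequency=("Invoice", "nunique"),  # unique orders, not line items
        monetary=("LineTotal", "sum"),
    )
    rfm["recency_days"] = (snapshot_date - rfm["last_purchase"]).dt.days
    return rfm[["recency_days", "frequency", "monetary"]]


# Made-up rows using the column names listed for retail_sample.csv.
sample = pd.DataFrame(
    {
        "Invoice": ["536365", "536365", "540001", "536370", "C536379"],
        "Quantity": [6, 2, 4, 10, -1],
        "InvoiceDate": [
            "2010-12-01 08:26",
            "2010-12-01 08:26",
            "2011-03-15 10:00",
            "2011-11-30 14:05",
            "2011-12-01 09:41",
        ],
        "Price": [2.55, 3.39, 5.00, 1.25, 27.50],
        "Customer ID": [17850.0, 17850.0, 17850.0, 12583.0, 12583.0],
    }
)
rfm = build_rfm(sample)
print(rfm)
```

Counting unique Invoice values rather than rows matters because each invoice spans several line items; counting rows would inflate Frequency for customers who buy many products per order.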

Dataset

  • Source: UCI Online Retail II
  • Citation: Chen, D. (2015). Online Retail II [Dataset]. UCI Machine Learning Repository.
  • Licence: CC BY 4.0
  • Period: December 2010 -- December 2011
  • Records: ~500K transactions (sampled for this analysis)

Tools & Technologies

  • Python 3 -- Core analysis language
  • Pandas & NumPy -- Data manipulation, RFM computation
  • Matplotlib & Seaborn -- Static chart generation
  • ReportLab -- PDF report generation
  • Chart.js -- Interactive browser-based visualisations
  • HTML/CSS -- Dashboard layout and styling

How to Run

# Install dependencies
pip install pandas numpy matplotlib seaborn

# Run the analysis
python marketing_analysis.py

# View the interactive dashboard
open marketing_dashboard.html    # macOS
xdg-open marketing_dashboard.html  # Linux

Author

Gbolahan Akande -- Data Analyst


Portfolio case study | Marketing Analytics | RFM Segmentation
