Python | Jupyter Notebook | Classification
Objective: Explore customer transaction data from recent online and in-store sales and draw insights about customer purchasing behavior.
The following questions should be addressed:
- Do customers in some regions spend more per purchase than customers in others?
- Are in-store customers older than online customers?
- Which age group spends the most on purchases?
- Which customers are more likely to purchase online?
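As a rough illustration of how the first three questions map onto pandas operations, here is a minimal sketch. The column names (`in_store`, `age`, `items`, `amount`, `region`), the sample rows, and the age-band edges are all assumptions made for the example; they are not taken from the project's notebook or data.

```python
import pandas as pd

# Tiny made-up sample; column names are assumptions, not the project's schema.
df = pd.DataFrame({
    "in_store": [1, 0, 1, 0, 0, 1],          # 1 = in-store, 0 = online
    "age":      [52, 31, 47, 28, 60, 35],
    "items":    [2, 4, 1, 6, 3, 5],
    "amount":   [300.0, 1200.0, 450.0, 900.0, 150.0, 1500.0],
    "region":   ["North", "West", "North", "East", "South", "West"],
})

# Q1: average spend per purchase in each region.
avg_amount_by_region = df.groupby("region")["amount"].mean()

# Q2: average age of in-store (1) vs. online (0) customers.
avg_age_by_channel = df.groupby("in_store")["age"].mean()

# Q3: total spend per age band (band edges are arbitrary for this sketch).
age_band = pd.cut(df["age"], bins=[17, 40, 62, 85],
                  labels=["18-40", "41-62", "63-85"])
total_by_age_band = df.groupby(age_band, observed=True)["amount"].sum()

print(avg_amount_by_region)
print(avg_age_by_channel)
print(total_by_age_band)
```

The fourth question (who is more likely to purchase online) is a prediction problem rather than an aggregation, which is why the project treats it as classification.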
Python Version: 3.7
Packages: pandas, numpy, sklearn, matplotlib, seaborn, pandas_profiling
Supervised learning approach: Classification
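The README does not say which classifier was used, so the following is only a sketch of what a classification setup for "online vs. in-store" could look like. It assumes scikit-learn's `DecisionTreeClassifier`, hypothetical column names, and synthetic data with a made-up labelling rule; none of this reproduces the project's actual model or results.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)
n = 400

# Synthetic stand-in data; the real dataset is not reproduced here.
df = pd.DataFrame({
    "age": rng.randint(18, 86, size=n),
    "items": rng.randint(1, 9, size=n),
    "amount": rng.uniform(5, 3000, size=n).round(2),
    "region": rng.choice(["North", "South", "East", "West"], size=n),
})

# Made-up labelling rule so the example has something to learn:
# South is always online (0), North always in-store (1),
# and the other regions are split by age.
df["in_store"] = np.where(
    df["region"] == "South", 0,
    np.where(df["region"] == "North", 1, (df["age"] > 45).astype(int)),
)

# One-hot encode the categorical region column; numeric columns pass through.
X = pd.get_dummies(df[["age", "items", "amount", "region"]], columns=["region"])
y = df["in_store"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

clf = DecisionTreeClassifier(max_depth=4, random_state=0)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print("held-out accuracy:", accuracy)
```

A shallow tree is used here because its splits are easy to read back as rules (for example, a split on region followed by a split on age), which suits an exploratory question like this one.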
Data Composition: The dataset comprises 80,000 rows of transactional data, worth $66M in total sales, with the following 5 attributes:
- Type of Purchase (Online or In-Store)
- Amount of Purchase ($5 - $3000)
- Number of Items Purchased (1 - 8 items)
- Customer Age (18 - 85 years old)
- Region of Purchase (North, South, East, West)
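The attribute ranges above can double as sanity checks before any analysis. The sketch below counts rows that fall outside each stated range. The function name `validate_transactions`, the column names, and the 0/1 encoding of purchase type are hypothetical; the project's notebook may encode these fields differently.

```python
import pandas as pd

def validate_transactions(df):
    """Return a dict mapping each rule to the number of rows that violate it."""
    checks = {
        "in_store is 0 or 1": ~df["in_store"].isin([0, 1]),
        "amount within 5-3000": ~df["amount"].between(5, 3000),
        "items within 1-8": ~df["items"].between(1, 8),
        "age within 18-85": ~df["age"].between(18, 85),
        "region is N/S/E/W": ~df["region"].isin(["North", "South", "East", "West"]),
    }
    return {name: int(mask.sum()) for name, mask in checks.items()}

# Made-up rows, including one deliberate violation per ranged column.
sample = pd.DataFrame({
    "in_store": [1, 0, 1],
    "amount": [250.0, 3200.0, 42.5],          # 3200 is above the stated maximum
    "items": [2, 9, 1],                       # 9 is above the stated maximum
    "age": [34, 61, 17],                      # 17 is below the stated minimum
    "region": ["North", "West", "Central"],   # "Central" is not a listed region
})

violations = validate_transactions(sample)
print(violations)
```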
Conclusion: (A comprehensive conclusion is attached as a PowerPoint.)
- Customers in the Western Region spend the most per purchase and account for $33M (or 50%) of total sales.
- The Southern Region had nearly 20,000 transactions under $500 and only 34 transactions over $500.
- The average amount per transaction did not vary noticeably with the number of items purchased.
- Customers under 62 years old account for 93% of total sales.
- Customers under 40 years old spend the most per purchase.
Northern Region:
- Average age is 43.
- Shops 100% in-store.
- Spends $745 on average per transaction.
Southern Region:
- Oldest region: average age is 56.
- Shops 100% online.
- Spends less than any other region, averaging $252 per transaction.
Eastern Region:
- Average age is 45.
- Shops 61% online & 39% in-store.
- Spends $918 on average per transaction.
Western Region:
- Youngest region: average age is 38.
- Shops 50% online & 50% in-store.
- Spends more than any other region, averaging $1,284 per transaction.
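Per-region figures like the ones above (average age, online share, average spend, share of total sales) can be produced with a single grouped aggregation. This is a minimal sketch on made-up rows with assumed column names, so the numbers it prints are illustrative and are not the project's results.

```python
import pandas as pd

# Made-up rows; column names are assumptions, not the project's schema.
df = pd.DataFrame({
    "region":   ["North", "North", "South", "South", "East", "East", "West", "West"],
    "in_store": [1, 1, 0, 0, 0, 1, 1, 0],    # 1 = in-store, 0 = online
    "age":      [40, 46, 55, 57, 44, 46, 36, 40],
    "amount":   [700.0, 800.0, 200.0, 300.0, 900.0, 1000.0, 1200.0, 1300.0],
})

# One row per region: average age, percent of purchases made online,
# average spend per transaction, and total sales.
profile = df.groupby("region").agg(
    avg_age=("age", "mean"),
    pct_online=("in_store", lambda s: 100 * (1 - s.mean())),
    avg_amount=("amount", "mean"),
    total_sales=("amount", "sum"),
)

# Each region's share of overall sales, in percent.
profile["pct_of_sales"] = 100 * profile["total_sales"] / profile["total_sales"].sum()

print(profile)
```

Because `in_store` is coded 0/1 in this sketch, its mean is the in-store share, so the online share is simply one minus that mean.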
Business Recommendations:
- Focus marketing efforts on the Northern and Southern Regions to increase sales.
- Run promotional sales for higher-priced products in the Southern Region.
- The Southern Region shops 100% online, spends the least, and is the oldest. If there is a brick-and-mortar store in this region, the recommendation is to close it to reduce expenses. Alternatively, if there is no store, opening one could bring in more profits.
Future Data Mining Considerations:
- Research the Southern Region to understand the local consumer base and existing business assets, and determine whether future assets should be established.
- Capture the date of the transaction to determine seasonality/predictability of purchases.
- Capture transaction time to determine optimal hours of operation for a store and aid with proper staffing of the store.
- Gather additional customer data to track customer retention.
- With customer retention data, project and attain sales goals more accurately.
- Collect product information to understand what items are being purchased in-store vs. online.
- Collect gender information to better understand customer demographics.