This project was completed as part of the PG Level Advanced Certification Programme in Computational Data Science at the Centre for Continuing Education, Indian Institute of Science, in collaboration with Talent Sprint.
Special thanks to Prof. Shashi Jain and mentor Mr. Sachin Sharma.
Problem Statement: Extract association rules and find groups of frequently purchased items from a large-scale grocery orders dataset.
Module: Business Analytics
Project Type: Team
A comprehensive analysis of Instacart's customer purchase patterns using market basket analysis techniques. The project analyzes over 3 million grocery orders from 200,000+ Instacart users to uncover shopping patterns, product associations, and temporal trends.
- Over 3 million grocery orders
- 200,000+ Instacart users
- Data spread across multiple files:
  - orders.csv
  - products.csv
  - aisles.csv
  - departments.csv
  - order_products_train.csv
- Merged multiple data sources into a unified dataset
- Handled missing values and data transformations
- Created purchase frequency matrices
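The merge and matrix-building steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the small inline DataFrames stand in for the CSV files, and the column names (`order_id`, `product_id`, `product_name`, `department_id`, `department`, `reordered`) are assumptions about the dataset layout.

```python
import pandas as pd

# Tiny inline stand-ins for the CSV files (assumed column names);
# real data would be loaded with pd.read_csv instead.
order_products = pd.DataFrame({
    "order_id":   [1, 1, 2, 2, 3],
    "product_id": [10, 11, 10, 12, 11],
    "reordered":  [1, 0, 1, 0, 1],
})
products = pd.DataFrame({
    "product_id":    [10, 11, 12],
    "product_name":  ["Banana", "Organic Strawberries", "Whole Milk"],
    "department_id": [4, 4, 16],
})
departments = pd.DataFrame({
    "department_id": [4, 16],
    "department":    ["produce", "dairy eggs"],
})

# Left joins keep every purchased line item even if a lookup row is missing.
merged = (
    order_products
    .merge(products, on="product_id", how="left")
    .merge(departments, on="department_id", how="left")
)

# One row per order, one column per product, True where the product was bought.
basket = pd.crosstab(merged["order_id"], merged["product_name"]) > 0
print(basket)
```

A boolean order-by-product matrix like `basket` is the usual input shape for frequent-itemset mining; on the full dataset it is very wide, so restricting it to the most frequent products first may be necessary.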
- Product frequency analysis
- Department-wise purchase patterns
- Temporal analysis (day of week, hour of day)
- Reorder behavior analysis
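The analyses listed above mostly reduce to group-by aggregations. The sketch below shows one possible way to compute them with Pandas; the toy data and the column names `order_dow`, `order_hour_of_day`, and `reordered` are assumptions for illustration, not taken from the project code.

```python
import pandas as pd

# Toy data with assumed column names.
orders = pd.DataFrame({
    "order_id":          [1, 2, 3, 4],
    "order_dow":         [0, 0, 1, 6],      # day of week as an integer code
    "order_hour_of_day": [10, 14, 10, 20],
})
order_products = pd.DataFrame({
    "order_id":     [1, 1, 2, 3, 4],
    "product_name": ["Banana", "Whole Milk", "Banana", "Banana", "Whole Milk"],
    "reordered":    [1, 0, 1, 0, 1],
})

# Temporal analysis: number of distinct orders per day of week and per hour.
orders_by_day = orders.groupby("order_dow")["order_id"].nunique()
orders_by_hour = orders.groupby("order_hour_of_day")["order_id"].nunique()

# Product frequency: how often each product appears across all orders.
product_counts = order_products["product_name"].value_counts()

# Reorder behavior: share of purchases of each product that were reorders.
reorder_rate = order_products.groupby("product_name")["reordered"].mean()

print(orders_by_day, orders_by_hour, product_counts, reorder_rate, sep="\n\n")
```

Each result is a Pandas Series, so it can be passed directly to Seaborn or Matplotlib bar plots.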
- Implemented Apriori algorithm
- Generated association rules
- Analyzed product co-occurrence patterns
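To illustrate what the Apriori step does, here is a minimal pure-Python sketch of the level-wise search. It is not the MLxtend implementation used in the project; the function name `apriori_frequent_itemsets` and the toy transactions are hypothetical, and the code favors readability over speed.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {frozenset(items): support} for every itemset meeting min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        # Fraction of baskets that contain every item in the itemset.
        return sum(itemset <= b for b in baskets) / n

    # Level 1 candidates: every single item seen in any basket.
    current = {frozenset([item]) for b in baskets for item in b}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        level = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)

        # Join frequent k-itemsets into (k+1)-itemsets, then prune any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        current = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                current.add(union)
    return frequent


transactions = [
    ["Banana", "Strawberries", "Milk"],
    ["Banana", "Strawberries"],
    ["Banana", "Milk"],
    ["Strawberries"],
    ["Banana"],
]
frequent = apriori_frequent_itemsets(transactions, min_support=0.4)
for itemset, s in sorted(frequent.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), s)
```

The pruning step is what keeps the search tractable: an itemset is only counted if all of its smaller subsets were already found to be frequent.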
- Most popular product: Bananas
- Peak ordering hours: 10 AM - 4 PM
- Higher order frequencies on weekends
- Clear patterns in reorder behavior
- Generated frequent itemsets with a minimum support of 0.01 (itemsets appearing in at least 1% of orders)
- Identified strong product associations using the lift metric
- Discovered valuable product grouping patterns
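For reference, the metrics behind these results can be computed by hand. For a rule A → B, support is the share of orders containing both A and B, confidence is support(A and B) / support(A), and lift is confidence / support(B); a lift above 1 means the items co-occur more often than independence would predict. The sketch below uses a hypothetical helper `rule_metrics` and made-up toy transactions, not project data.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Compute support, confidence and lift for the rule antecedent -> consequent."""
    baskets = [set(t) for t in transactions]
    n = len(baskets)
    a, c = set(antecedent), set(consequent)

    support_a = sum(a <= b for b in baskets) / n          # P(A)
    support_c = sum(c <= b for b in baskets) / n          # P(B)
    support_ac = sum((a | c) <= b for b in baskets) / n   # P(A and B)

    if support_a == 0 or support_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")

    confidence = support_ac / support_a
    lift = confidence / support_c
    return {"support": support_ac, "confidence": confidence, "lift": lift}


# Toy transactions for illustration only.
transactions = [
    ["Banana", "Strawberries"],
    ["Banana", "Strawberries"],
    ["Banana"],
    ["Milk"],
]
metrics = rule_metrics(transactions, ["Strawberries"], ["Banana"])
print(metrics)
```

In this toy example every basket with strawberries also contains bananas, so confidence is 1.0, and because bananas appear in only three of four baskets the lift is about 1.33.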
- Python
- Pandas
- NumPy
- Seaborn
- Matplotlib
- MLxtend (for the Apriori algorithm)
- SciPy
- Apriori Algorithm
- Inventory optimization
- Store layout recommendations
- Targeted marketing strategies
- Product recommendation systems
- Staffing optimization
- Reorder prediction
- Instacart for providing the dataset
Note: This project is for educational purposes and uses a public dataset from Instacart. Please download the dataset yourself.