This project was an assignment for the Big Data class at the University of Siena. See the final report, Big Data Project.pptx,
for more details.
- Kaggle competition: a sample dataset of over 3 million grocery orders from more than 200,000 users. The orders include 32 million basket items and 50,000 unique products
- objective: predict which previously purchased products will be in a user's next order
- classification problem: for each (user, product) pair, we need to predict whether the product will be reordered or not
- the datasets used can be downloaded here
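
The classification framing can be illustrated with a minimal sketch. It assumes a toy in-memory layout invented for this example (it is not the format of the competition files), and `build_labeled_pairs` is a hypothetical helper, not code from this project. Each previously purchased product becomes one (user, product) row, labeled 1 if it appears in the user's next order and 0 otherwise.

```python
# Toy data, invented for illustration only (not taken from the competition files).
# Each user maps to a list of past orders; each order is a set of product names.
prior_orders = {
    "u1": [{"milk", "eggs"}, {"milk", "bread"}],
    "u2": [{"apples"}, {"apples", "tea"}],
}

# The next order of each user, used as ground truth for the labels.
next_order = {
    "u1": {"milk", "butter"},
    "u2": {"tea", "apples"},
}


def build_labeled_pairs(prior_orders, next_order):
    """Return sorted (user, product, label) rows.

    One row is produced per product the user bought before.
    label is 1 if the product is in the user's next order, else 0.
    """
    rows = []
    for user, orders in prior_orders.items():
        # Candidates are limited to products the user has already purchased.
        previously_bought = set().union(*orders)
        for product in previously_bought:
            label = int(product in next_order.get(user, set()))
            rows.append((user, product, label))
    return sorted(rows)


rows = build_labeled_pairs(prior_orders, next_order)
for row in rows:
    print(row)
```

In this toy example "butter" never becomes a row for `u1`, because it was not purchased before and so cannot be a reorder. A real pipeline would attach features to each row and train a binary classifier on the labels.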