This document describes how I built, evaluated, improved, and deployed a recommendation system for an e-commerce setting. My aim was a robust system that improves the shopping experience. It produces three types of recommendations: "Recommended for You," "Similar Items," and "Frequently Bought Together." These are generated with collaborative filtering, content-based filtering, and frequent itemset mining, respectively.
Development proceeded through data cleaning, exploration, algorithm implementation, evaluation, and deployment. The stages were:
- Understanding the Data: I delved into the provided e-commerce dataset, identifying discrepancies, missing values, and potential issues.
- Data Preprocessing: I cleaned the dataset by handling missing values, canceled orders, negative quantities, and unit prices of zero.
- Exploratory Data Analysis (EDA): Visualizations uncovered patterns, customer behaviors, and sales trends, guiding subsequent stages.
- Recommendation Generation: Collaborative filtering, content-based filtering, and frequent itemset mining techniques were employed to generate personalized suggestions.
- Evaluation and Metrics: The recommendation system's performance was evaluated using precision, recall, and Mean Average Precision (MAP).
- Deployment: The recommendation system was deployed as interactive Flask apps, allowing users to interact with recommendations.
- Documentation: This comprehensive document showcases the development process, results, analysis, and insights.
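To make the evaluation stage concrete, here is a minimal sketch of precision, recall, and Mean Average Precision at a cutoff k. The function names and the toy data are hypothetical, and the normalization used for average precision (dividing by the smaller of k and the number of relevant items) is one common convention, not necessarily the one used in this project.

```python
def precision_at_k(recommended, relevant, k):
    """Share of the top-k recommended items that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def recall_at_k(recommended, relevant, k):
    """Share of the relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


def average_precision(recommended, relevant, k):
    """Sum of precision@rank at each rank holding a relevant item, normalized."""
    if not relevant or k <= 0:
        return 0.0
    hits = 0
    score = 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank
    return score / min(len(relevant), k)


def mean_average_precision(all_recommended, all_relevant, k):
    """Average of average_precision over all users with recommendations."""
    users = list(all_recommended)
    if not users:
        return 0.0
    total = sum(
        average_precision(all_recommended[u], all_relevant.get(u, set()), k)
        for u in users
    )
    return total / len(users)


# Toy example (invented data): ranked recommendations vs. items actually bought.
recommended = ["A", "B", "C", "D"]
relevant = {"A", "C", "E"}
p = precision_at_k(recommended, relevant, k=4)
r = recall_at_k(recommended, relevant, k=4)
ap = average_precision(recommended, relevant, k=4)
```

Precision rewards lists with few irrelevant items, recall rewards lists that cover what the user actually bought, and MAP additionally rewards placing relevant items near the top.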
My data cleaning and exploration phase involved:
- Addressing Missing Values: I removed rows with missing values to ensure data quality.
- Handling Canceled Orders: Canceled orders and negative quantities were excluded to maintain data integrity.
- Eliminating Zero Unit Prices: Entries with a unit price of zero were removed for data consistency.
- Managing Duplicate Entries: Duplicate entries were identified and removed to ensure accurate analyses.
- Data Type Corrections: Data types were adjusted for accuracy and consistency.
- Ensuring Unique Descriptions: A dictionary mapping 'StockCode' to the most common 'Description' was created for consistency.
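The cleaning rules above can be sketched without any libraries. Only `StockCode` and `Description` are column names taken from this document; `InvoiceNo`, `Quantity`, `UnitPrice`, and `CustomerID`, the convention that canceled invoices start with "C", and the toy rows are assumptions made for illustration. Data type corrections are omitted.

```python
from collections import Counter, defaultdict

# Invented sample rows; the column names other than StockCode/Description are assumed.
rows = [
    {"InvoiceNo": "1001", "StockCode": "A1", "Description": "RED MUG", "Quantity": 2, "UnitPrice": 3.5, "CustomerID": "u1"},
    {"InvoiceNo": "1001", "StockCode": "A1", "Description": "RED MUG", "Quantity": 2, "UnitPrice": 3.5, "CustomerID": "u1"},
    {"InvoiceNo": "1002", "StockCode": "A1", "Description": "RED MUG", "Quantity": 1, "UnitPrice": 3.5, "CustomerID": "u2"},
    {"InvoiceNo": "1003", "StockCode": "A1", "Description": "red mug damaged", "Quantity": 1, "UnitPrice": 3.5, "CustomerID": "u3"},
    {"InvoiceNo": "C1004", "StockCode": "B2", "Description": "BLUE PLATE", "Quantity": -1, "UnitPrice": 4.0, "CustomerID": "u1"},
    {"InvoiceNo": "1005", "StockCode": "B2", "Description": "BLUE PLATE", "Quantity": 3, "UnitPrice": 0.0, "CustomerID": "u2"},
    {"InvoiceNo": "1006", "StockCode": "B2", "Description": "BLUE PLATE", "Quantity": 1, "UnitPrice": 4.0, "CustomerID": None},
    {"InvoiceNo": "1007", "StockCode": "B2", "Description": "BLUE PLATE", "Quantity": 1, "UnitPrice": 4.0, "CustomerID": "u3"},
]


def clean(rows):
    """Drop rows with missing values, cancellations, bad quantities/prices, duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        # Missing values
        if any(value is None or value == "" for value in row.values()):
            continue
        # Canceled orders (assumed "C" prefix) and negative quantities
        if str(row["InvoiceNo"]).startswith("C") or row["Quantity"] < 0:
            continue
        # Unit price of zero
        if row["UnitPrice"] == 0:
            continue
        # Exact duplicates
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


def most_common_descriptions(rows):
    """Map each StockCode to its most frequent Description."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["StockCode"]][row["Description"]] += 1
    return {code: counter.most_common(1)[0][0] for code, counter in counts.items()}


cleaned = clean(rows)
descriptions = most_common_descriptions(cleaned)
# Overwrite every description with the canonical one for its StockCode.
cleaned = [{**row, "Description": descriptions[row["StockCode"]]} for row in cleaned]
```

Computing the description mapping after the other filters keeps rows that are about to be dropped from influencing which description counts as the most common.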
My recommendation system generates three types of recommendations:
Recommended for You (collaborative filtering):
- User-Item Interaction Matrix: A matrix records the interactions between users and items.
- Similarity Calculation: Similarity between users is computed from their interactions.
- Neighborhood Selection: The most similar users form the target user's "neighborhood."
- Item Ranking: Candidate items are ranked by the neighbors' interactions with them.
- Final Recommendations: The top-ranked items are recommended, avoiding redundant suggestions.
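The five collaborative filtering steps above can be sketched as follows. This is an illustration under assumptions, not the project's implementation: cosine similarity is one possible user similarity measure, "avoiding redundancy" is interpreted here as skipping items the user has already interacted with, and the function names, parameters, and toy data are hypothetical.

```python
from math import sqrt

# Toy user-item interactions (invented): user -> {item: purchase count}.
interactions = {
    "u1": {"A": 2, "B": 1},
    "u2": {"A": 1, "B": 1, "C": 3},
    "u3": {"B": 2, "D": 1},
    "u4": {"E": 4},
}


def cosine(a, b):
    """Cosine similarity between two sparse interaction vectors."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_for_user(user, interactions, n_neighbors=2, top_n=3):
    target = interactions[user]
    # Similarity calculation against every other user
    sims = [
        (cosine(target, vec), other)
        for other, vec in interactions.items()
        if other != user
    ]
    # Neighborhood selection: the most similar users with non-zero similarity
    neighbors = sorted((s for s in sims if s[0] > 0), reverse=True)[:n_neighbors]
    # Item ranking: similarity-weighted sum of the neighbors' interactions
    scores = {}
    for sim, other in neighbors:
        for item, count in interactions[other].items():
            if item in target:
                continue  # skip items the user already has
            scores[item] = scores.get(item, 0.0) + sim * count
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]


recommendations = recommend_for_user("u1", interactions)
```

In this toy data, user `u4` shares no items with `u1`, so it never enters the neighborhood and item `E` is not suggested.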
Similar Items (content-based filtering):
- Item-Item Similarity Matrix: A matrix stores the pairwise similarity between items.
- Item Similarity Calculation: A similarity score is computed for each pair of items.
- Item Ranking: For a selected item, the other items are ranked by their similarity scores.
- Recommendation Generation: The top-ranked similar items are recommended.
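As an illustration of the item similarity steps above, the sketch below builds an item-item similarity matrix from product descriptions. Using bag-of-words vectors over descriptions with cosine similarity is an assumption made for this example; the document does not state which item features or which metric the project uses. All names and sample descriptions are hypothetical.

```python
from collections import Counter
from math import sqrt

# Invented item descriptions keyed by stock code.
descriptions = {
    "A1": "red ceramic mug",
    "A2": "blue ceramic mug",
    "B1": "red wool scarf",
    "C1": "steel water bottle",
}


def vectorize(text):
    """Bag-of-words vector: token -> count."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_similarity_matrix(descriptions):
    """Nested dict: item -> {other item: similarity}."""
    vectors = {code: vectorize(text) for code, text in descriptions.items()}
    return {
        a: {b: cosine(vectors[a], vectors[b]) for b in vectors if b != a}
        for a in vectors
    }


def similar_items(code, matrix, top_n=2):
    """Rank the other items by similarity and keep the top ones with a non-zero score."""
    scores = matrix[code]
    ranked = sorted(scores, key=lambda other: (-scores[other], other))
    return [other for other in ranked if scores[other] > 0][:top_n]


matrix = build_similarity_matrix(descriptions)
similar_to_a1 = similar_items("A1", matrix)
```

Precomputing the matrix makes each lookup cheap, at the cost of memory that grows with the square of the number of items.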
Frequently Bought Together (frequent itemset mining):
- Purchase History Data: Purchase histories are grouped into transactions that contain multiple items.
- Frequent Itemset Mining: Itemsets that appear in at least a minimum share of transactions are identified.
- Association Rule Generation: Association rules derived from the frequent itemsets express relationships between items.
- Confidence and Support: Each rule is evaluated by its support and confidence.
- Recommendation Generation: High-confidence rules determine which items are recommended.
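The frequent itemset steps above can be illustrated with a deliberately simplified sketch that only considers item pairs; a full miner such as Apriori would also handle larger itemsets, and the document does not say which algorithm the project uses. Here support is the share of transactions containing both items, and confidence is the share of transactions containing the first item that also contain the second. The thresholds, function names, and transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Invented transactions: each set holds the items bought together in one order.
transactions = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
    {"bread", "butter", "milk"},
]


def mine_pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Return rules as (antecedent, consequent, support, confidence) tuples."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # the pair is not frequent enough
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules


def frequently_bought_together(item, rules, top_n=3):
    """Consequents of the rules triggered by `item`, highest confidence first."""
    matches = [rule for rule in rules if rule[0] == item]
    matches.sort(key=lambda rule: (-rule[3], rule[1]))
    return [rule[1] for rule in matches[:top_n]]


rules = mine_pair_rules(transactions)
with_jam = frequently_bought_together("jam", rules)
```

Rules are directional: in this toy data jam implies butter with confidence 1.0, while butter implies jam with confidence 0.5 and is filtered out.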
I developed interactive Flask apps to showcase my recommendation system's functionality:
- Recommended for You App: Provides personalized recommendations based on user preferences and interactions.
- Similar Items App: Allows users to explore items similar to their selected product.
- Frequently Bought Together App: Suggests items frequently purchased alongside a selected item.
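A minimal sketch of how one of these apps could expose recommendations through Flask. The route, the JSON response shape, and the precomputed `SIMILAR_ITEMS` lookup are all hypothetical; the actual apps are interactive and may render HTML pages instead of returning JSON.

```python
from flask import Flask, abort, jsonify

app = Flask(__name__)

# Hypothetical precomputed lookup: stock code -> ranked list of similar stock codes.
SIMILAR_ITEMS = {
    "A1": ["A2", "B1"],
    "A2": ["A1"],
}


@app.route("/similar/<stock_code>")
def similar(stock_code):
    """Return the similar items for a stock code, or 404 if it is unknown."""
    if stock_code not in SIMILAR_ITEMS:
        abort(404)
    return jsonify(stock_code=stock_code, similar_items=SIMILAR_ITEMS[stock_code])
```

Serving precomputed results keeps each request fast, because the similarity computation runs offline rather than inside the request handler.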
Building these recommendation systems has deepened my understanding of data science and machine learning. I'm excited to contribute, learn more, and drive innovation and success.