This repository contains the deliverables for the Data Science Intern assignment, focusing on analyzing eCommerce transaction data. The assignment is divided into three key tasks: Exploratory Data Analysis (EDA), Lookalike Model development, and Customer Segmentation using clustering techniques. Each task aims to derive insights and provide actionable recommendations for business improvement.
The eCommerce Transactions dataset consists of three files:
- Customers.csv: Contains customer profiles, including ID, name, region, and signup date.
- Products.csv: Lists product details such as ID, name, category, and price.
- Transactions.csv: Contains transaction data, including transaction ID, customer ID, product ID, date, quantity, and total value.
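As a minimal sketch of how the three files fit together, the snippet below loads them with pandas and joins them into one table with one row per transaction. The column names (`CustomerID`, `ProductID`, `SignupDate`, `TransactionDate`, `TotalValue`, and so on) are assumptions based on the field descriptions above, and the rows are made-up stand-ins so the example runs without the real files.

```python
import io

import pandas as pd

# Made-up stand-ins for the three files; column names are assumed, not confirmed.
customers_csv = io.StringIO(
    "CustomerID,CustomerName,Region,SignupDate\n"
    "C1,Customer One,Asia,2023-01-15\n"
    "C2,Customer Two,Europe,2023-03-02\n"
)
products_csv = io.StringIO(
    "ProductID,ProductName,Category,Price\n"
    "P1,Notebook,Books,5.0\n"
    "P2,Headphones,Electronics,40.0\n"
)
transactions_csv = io.StringIO(
    "TransactionID,CustomerID,ProductID,TransactionDate,Quantity,TotalValue\n"
    "T1,C1,P1,2024-01-10,2,10.0\n"
    "T2,C1,P2,2024-02-05,1,40.0\n"
    "T3,C2,P2,2024-02-20,2,80.0\n"
)

# With the real data, pass file paths such as "Customers.csv" instead.
customers = pd.read_csv(customers_csv, parse_dates=["SignupDate"])
products = pd.read_csv(products_csv)
transactions = pd.read_csv(transactions_csv, parse_dates=["TransactionDate"])

# One row per transaction, enriched with customer and product attributes.
merged = (
    transactions
    .merge(customers, on="CustomerID", how="left")
    .merge(products, on="ProductID", how="left")
)
print(merged[["TransactionID", "Region", "Category", "TotalValue"]])
```

A left join keeps every transaction even if a customer or product ID has no match, which makes missing references visible as empty values instead of silently dropping rows.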
The assignment covers three tasks:
- Perform EDA to uncover insights about customers, products, and sales trends.
- Develop a Lookalike Model to identify similar customers based on profile and transaction data.
- Use clustering techniques for customer segmentation and derive actionable insights.
Exploratory Data Analysis (EDA):
- Objective: Analyze the dataset to uncover key business insights, such as customer distribution, sales trends, and top-performing products.
- Deliverables:
  - Anvita_Magarde_EDA.ipynb: Jupyter Notebook containing the EDA process.
  - Anvita_Magarde_EDA.pdf: PDF report summarizing insights (maximum 500 words).
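The kinds of insights named in the EDA objective can be sketched with a few pandas aggregations. This is an illustration only, not the notebook's actual code: the rows are made up, and the column names (`Region`, `ProductName`, `TransactionDate`, `TotalValue`) are assumptions.

```python
import pandas as pd

# Made-up transaction rows already joined with customer and product data;
# column names are assumed.
sales = pd.DataFrame({
    "Region": ["Asia", "Asia", "Europe", "Europe"],
    "ProductName": ["Notebook", "Headphones", "Headphones", "Notebook"],
    "TransactionDate": pd.to_datetime(
        ["2024-01-10", "2024-02-05", "2024-02-20", "2024-02-25"]
    ),
    "TotalValue": [10.0, 40.0, 80.0, 5.0],
})

# Revenue by region.
revenue_by_region = sales.groupby("Region")["TotalValue"].sum()

# Top-performing products, ranked by revenue.
top_products = (
    sales.groupby("ProductName")["TotalValue"]
    .sum()
    .sort_values(ascending=False)
)

# Sales trend: revenue per calendar month.
monthly_revenue = sales.groupby(
    sales["TransactionDate"].dt.to_period("M")
)["TotalValue"].sum()

print(revenue_by_region, top_products, monthly_revenue, sep="\n\n")
```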
Lookalike Model:
- Objective: Build a model to recommend the top 3 most similar customers for a given customer based on their profile and transaction history.
- Deliverables:
  - Anvita_Magarde_Lookalike.ipynb: Notebook explaining the model development.
  - Anvita_Magarde_Lookalike.csv: CSV file containing lookalike recommendations and similarity scores.
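One common way to build such a model is cosine similarity over standardized per-customer features. The sketch below assumes that approach; this README does not state which similarity measure the notebook uses. The feature set (total spend, order count, average basket value), the customer IDs, and the helper `top_lookalikes` are all hypothetical.

```python
import numpy as np

# Hypothetical per-customer features:
# [total spend, number of orders, average basket value].
customer_ids = ["C1", "C2", "C3", "C4", "C5"]
features = np.array([
    [500.0, 10, 50.0],
    [520.0, 11, 47.0],
    [100.0, 2, 50.0],
    [900.0, 30, 30.0],
    [480.0, 9, 53.0],
])

# Standardize each column so no single feature dominates the similarity.
scaled = (features - features.mean(axis=0)) / features.std(axis=0)

# Cosine similarity is the dot product of L2-normalized rows.
unit = scaled / np.linalg.norm(scaled, axis=1, keepdims=True)
similarity = unit @ unit.T


def top_lookalikes(customer_id, k=3):
    """Return the k most similar customers as (id, score) pairs."""
    i = customer_ids.index(customer_id)
    scores = similarity[i].copy()
    scores[i] = -np.inf  # exclude the customer itself
    best = np.argsort(scores)[::-1][:k]
    return [(customer_ids[j], round(float(scores[j]), 3)) for j in best]


print(top_lookalikes("C1"))
```

Standardizing first matters here because raw spend values are far larger than order counts and would otherwise dominate the similarity.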
Customer Segmentation (Clustering):
- Objective: Use clustering algorithms to group customers into segments based on profile and transaction data. Evaluate clusters using metrics such as the Davies-Bouldin Index.
- Deliverables:
  - Anvita_Magarde_Clustering.ipynb: Notebook containing clustering analysis and visualization.
  - Anvita_Magarde_Clustering.pdf: PDF report summarizing clustering results.
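The evaluation step can be illustrated with scikit-learn's `davies_bouldin_score`, where lower values indicate more compact, better-separated clusters. The sketch below uses K-Means as an assumed algorithm (the README does not name one) and synthetic two-feature data standing in for real customer features.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for customer features [total spend, order count]:
# three well-separated groups of 30 customers each.
rng = np.random.default_rng(0)
centers = np.array([[100.0, 2.0], [500.0, 10.0], [900.0, 30.0]])
X = np.vstack(
    [c + rng.normal(scale=[20.0, 1.0], size=(30, 2)) for c in centers]
)

# Scale features so both contribute equally to the distance computation.
X_scaled = StandardScaler().fit_transform(X)

# Fit K-Means for several cluster counts and record the Davies-Bouldin Index.
db_scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_scaled)
    db_scores[k] = davies_bouldin_score(X_scaled, labels)

# The lowest index marks the best-separated clustering among those tried.
best_k = min(db_scores, key=db_scores.get)
print(db_scores)
print("best k:", best_k)
```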
To run the project:
- Clone the repository:
    git clone https://github.com/yourusername/eCommerce-Transactions-Analysis.git
    cd eCommerce-Transactions-Analysis
- Install dependencies:
    pip install -r requirements.txt
- Launch Jupyter:
    jupyter notebook
- Run each notebook in the following order:
    Anvita_Magarde_EDA.ipynb
    Anvita_Magarde_Lookalike.ipynb
    Anvita_Magarde_Clustering.ipynb
- Review the generated outputs in the Outputs folder.
Requirements:
- Python 3.8+
- Libraries:
  - pandas
  - numpy
  - matplotlib
  - seaborn
  - scikit-learn
  - jupyter
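Before running the notebooks, it can help to confirm that the environment meets these requirements. The following is a small sketch using only the standard library; the helper name `check_environment` is hypothetical and not part of this repository.

```python
import sys
from importlib import metadata

# Distribution names as listed in the requirements above.
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn", "scikit-learn", "jupyter"]


def check_environment(packages=REQUIRED):
    """Map each package name to its installed version, or None if missing."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


python_ok = sys.version_info >= (3, 8)
report = check_environment()

print("Python 3.8+:", python_ok)
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```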