This project analyzes the Brazilian E-Commerce Public Dataset by Olist, a comprehensive dataset that includes sales, customer reviews, logistics, and geolocation data. It serves as the foundation for addressing key business questions and building a performance dashboard.
- What are the trends in sales metrics (orders, revenue, units sold, Average Transaction Value (ATV), Average Unit Revenue (AUR)) over the past months?
- Which product categories have the highest sales in terms of orders, revenue, and units sold?
- Which product categories have the lowest sales in terms of orders, revenue, and units sold?
- What is the most frequently used payment method?
- What is the average payment value for each payment method?
- How accurate are the estimated delivery dates compared to the actual delivery dates?
- What are the minimum, maximum, and average shipping costs per product category (both in total and average per order)?
- What is the overall customer satisfaction level based on review scores?
- How are customers and sellers distributed geographically?
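As a rough illustration of the sales metrics named in the first question, the sketch below aggregates orders, revenue, and units sold per month, then derives ATV as revenue divided by orders and AUR as revenue divided by units sold. The records, field layout, and the `monthly_sales_metrics` helper are hypothetical and are not taken from the notebook, which may compute these metrics differently.

```python
from collections import defaultdict

# Hypothetical order items: (order_id, purchase_month, unit_price, quantity).
# These are made-up values, not rows from the Olist dataset.
order_items = [
    ("o1", "2018-01", 100.0, 1),
    ("o1", "2018-01", 50.0, 2),
    ("o2", "2018-01", 30.0, 1),
    ("o3", "2018-02", 80.0, 4),
]


def monthly_sales_metrics(items):
    """Aggregate orders, revenue, units, ATV, and AUR per month."""
    orders = defaultdict(set)     # distinct order ids per month
    revenue = defaultdict(float)  # summed price * quantity per month
    units = defaultdict(int)      # summed quantity per month
    for order_id, month, price, quantity in items:
        orders[month].add(order_id)
        revenue[month] += price * quantity
        units[month] += quantity

    metrics = {}
    for month in sorted(orders):
        n_orders = len(orders[month])
        metrics[month] = {
            "orders": n_orders,
            "revenue": revenue[month],
            "units": units[month],
            "atv": revenue[month] / n_orders,      # revenue per order
            "aur": revenue[month] / units[month],  # revenue per unit
        }
    return metrics


metrics = monthly_sales_metrics(order_items)
for month, row in metrics.items():
    print(month, row)
```

Counting distinct order ids rather than rows matters here because a single order can contain several items.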
The notebook includes the following sections:
- Business Questions: Outlines the objectives of the analysis.
- Import Packages & Libraries: Prepares the environment for analysis.
- Data Wrangling: Processes the data through gathering, assessing, and cleaning.
- Exploratory Data Analysis (EDA): Identifies trends, patterns, and anomalies.
- Visualization & Explanatory Data Analysis (ExDA): Provides insights through visualizations.
- Geospatial Analysis: Examines geographic distributions.
- Saving Dataset for Dashboard: Prepares cleaned data for visualization tools.
- Conclusion: Summarizes findings and actionable insights.
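To give a sense of the kind of step these sections might contain, here is a minimal sketch for the delivery-accuracy question: each order is labeled by comparing its actual delivery date with the estimated one. The status labels, the `delivery_status` helper, and the sample dates are assumptions made for illustration; the notebook's own categories may differ.

```python
from collections import Counter
from datetime import date


def delivery_status(estimated, delivered):
    """Label an order by comparing actual and estimated delivery dates."""
    if delivered is None:
        return "not delivered"
    if delivered > estimated:
        return "late"
    if delivered < estimated:
        return "early"
    return "on time"


# Hypothetical (estimated, actual) delivery dates, not taken from the dataset.
deliveries = [
    (date(2018, 3, 10), date(2018, 3, 8)),
    (date(2018, 3, 10), date(2018, 3, 10)),
    (date(2018, 3, 10), date(2018, 3, 14)),
    (date(2018, 3, 12), None),
    (date(2018, 3, 15), date(2018, 3, 11)),
]

# Count how many orders fall into each status.
status_counts = Counter(delivery_status(est, act) for est, act in deliveries)
print(status_counts)
```

Counts of this shape are what a chart of delivery statuses would be drawn from.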
The dashboard includes the following visualizations:
- Line Charts:
  - Monthly Sales Trend (Revenue & ATV)
  - Monthly Sales Trend (Orders & AUR)
- Horizontal Bar Charts:
  - Top product categories by orders, revenue, and units sold.
  - Bottom product categories by orders, revenue, and units sold.
- Pie Chart: Most frequently used payment methods.
- Bar Chart: Average payment values by payment method.
- Pie Chart: Distribution of delivery statuses.
- Boxplots: Freight costs per product category.
- Bar Chart: Review score frequency distribution.
- Map:
  - Customers Location Distribution
  - Sellers Location Distribution
Note: The geolocation tab is temporarily unavailable due to memory constraints on Streamlit Cloud. Efforts are underway to optimize and re-enable this feature.
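One common way to shrink map data, shown here only as a hedged sketch and not as the approach this project is taking, is to round coordinates to a coarse grid and plot one weighted point per cell instead of every raw location. The `bin_points` helper, the grid precision, and the sample coordinates are all hypothetical.

```python
from collections import Counter


def bin_points(points, decimals=1):
    """Round (lat, lng) pairs to a grid and count the points in each cell."""
    return Counter(
        (round(lat, decimals), round(lng, decimals)) for lat, lng in points
    )


# Hypothetical coordinates for illustration only.
points = [
    (-23.5489, -46.6388),
    (-23.5312, -46.6102),
    (-22.9068, -43.1729),
]

cells = bin_points(points)
for (lat, lng), count in cells.items():
    print(lat, lng, count)
```

With one decimal place each cell spans roughly 11 km of latitude, so the number of plotted markers depends on how many cells are occupied rather than on how many rows the geolocation table has.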
Watch Live & Local Dashboard Preview
To run the dashboard locally:
- Clone the Repository:
  git clone https://github.com/arguto1993/olist-ecommerce-performance.git
  cd olist-ecommerce-performance
- Install Dependencies:
  Ensure you have Python installed. Use the following command to install the required packages:
  pip install -r requirements.txt
- Run the App:
  Launch the Streamlit app with:
  streamlit run dashboard.py
- Access the Dashboard:
  Open the URL provided in the terminal (default: http://localhost:8501) to view the dashboard.
This project was developed as part of my submission for the Dicoding Indonesia Data Science Bootcamp Batch 4 (2024).
If you find this project helpful or have ideas for improvements, I’d love to hear from you! Thank you :)