The project uses the public Olist e-commerce dataset (2016-2018), wich contains information about:
- Customers
- Orders
- Order items
- Order payments
- Products
- Sellers
- Product categories
Original dataset (Kaggle): "Brazilian E-Commerce Public Dataset by Olist".
The database is implemented in MySQL.
Main tables:
customersordersorder_itemsorder_paymentsproductsproduct_category_namesellers
The ER diagram is included in this repository:
olist_project.png
-
Database_Creation.sqlCreates the database ('olist_project') and sets the default schema. -
Tables_Creation.sqlContains all 'CREATE TABLE' staments and foreign keys -
Inconsistencies_Correction.sqlScripts used to fit data issues (Inconcistencies, types, constraints). -
SQL_queries_olist.sqlCollection of queries organized by difficulty. -
olist-project.pngER diagram of the database.
-
Create the database: SOURCE: Database_Creation.sql
-
Create all tables: SOURCE: Tables_Creation.sql
-
Apply data cleaning scripts: SOURCE: Inconsistencies_Correction.sql
-
Explore the analysus queries: SOURCE: SQL_queries_olist.sql
You can run these scripts from:
- MySQL client (mysql), or:
- any SQL IDE (e.g. DBeaver) connected to your local MySQL server.
- Add views for the most common analysis.
- Create stored procedures for reusables metrics (e.g. monthly revenue, cohor analysis).
- Build a Power BI/ Tableau dashboard on top of this schema.