- macOS 12.3.1
- MySQL Workbench 8.0.31 Community
- PyCharm 2023.1 (Community Edition)
- Python 3.11 Interpreter
- Tableau Public 2023.1.0
- Execute the provided finance_liquor_sales.sql in MySQL Workbench to create the necessary schema.
- Execute sql_query.sql to retrieve all columns for the records dated between 2016 and 2019.
- Export the results to a CSV file, liquor_sales.csv.
Calculating the percentage of missing values per column shows that most of the missing data is in the store_location column (12.16%) and the category_name column (8.11%). However, since neither of these columns is needed for the analysis, it is not necessary to drop the rows with missing values.
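The per-column check described above can be sketched as follows. This is a minimal illustration that assumes pandas is used; the tiny sample frame and the bottles_sold column name are hypothetical stand-ins for liquor_sales.csv, so the percentages it prints are not the project's figures.

```python
import pandas as pd


def missing_percentage(df: pd.DataFrame) -> pd.Series:
    """Return the percentage of missing values per column, largest first."""
    return (df.isna().mean() * 100).round(2).sort_values(ascending=False)


# Hypothetical stand-in data; in the project this would likely be
# pd.read_csv("liquor_sales.csv").
sample = pd.DataFrame({
    "zip_code": ["50314", "50314", "52402", "52240"],
    "store_location": ["POINT (-93.6 41.6)", None, None, "POINT (-91.5 41.6)"],
    "bottles_sold": [12, 6, 24, 3],  # assumed column name
})

missing = missing_percentage(sample)
print(missing)
```

Because the columns with gaps are not used downstream, the rows can be kept as they are; a call such as `df.dropna(subset=[...])` would only be needed if one of the plotted columns had missing values.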
After aggregating the data in the CSV file, I plotted the zip_code column on the x-axis and the sum of bottles sold on the y-axis. I then assigned a color to each unique item_description, drawn from matplotlib's gnuplot2 colormap.
For details, see liquor_sale.py.
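The aggregation and coloring steps might look roughly like the sketch below. It is not the contents of liquor_sale.py: the sample rows, the bottles_sold column name, and the choice of a scatter plot are all assumptions made for illustration. Only the zip_code and item_description columns and the gnuplot2 colormap come from the description above.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical rows standing in for liquor_sales.csv.
sales = pd.DataFrame({
    "zip_code": ["50314", "50314", "52402", "52240", "52402"],
    "item_description": ["Black Velvet", "Hawkeye Vodka", "Black Velvet",
                         "Titos Handmade Vodka", "Black Velvet"],
    "bottles_sold": [12, 6, 24, 3, 6],  # assumed column name
})

# Sum bottles sold per zip code and item.
totals = sales.groupby(
    ["zip_code", "item_description"], as_index=False
)["bottles_sold"].sum()

# One color per unique item, sampled evenly from gnuplot2.
# The range stops at 0.9 because the top of gnuplot2 is white,
# which would be invisible on a white background.
items = sorted(totals["item_description"].unique())
cmap = plt.get_cmap("gnuplot2")
colors = {
    item: cmap(value)
    for item, value in zip(items, np.linspace(0.0, 0.9, len(items)))
}

fig, ax = plt.subplots()
for item, group in totals.groupby("item_description"):
    ax.scatter(group["zip_code"], group["bottles_sold"],
               color=colors[item], label=item)
ax.set_xlabel("zip_code")
ax.set_ylabel("Sum of bottles sold")
ax.legend(title="item_description")
```

Zip codes are kept as strings here so matplotlib treats them as categories rather than spacing them numerically along the axis. Calling `fig.savefig(...)` with a file name of your choice writes the chart to disk.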
The CSV file generated in the previous step was then used to create visualizations in Tableau.