Chad Hillman
Mini Project 2
January 2025.
This project focuses on extracting, transforming, and loading (ETL) crowdfunding data to clean and analyze the data. The goal is to create a well-structured database from the provided excel spreadsheets based on Module1 homework. We will be using jupyter notebook, QuickDBD to create the schema from our csv outputs and then to create the 4 tables usign SQL via PgAdmin4.
ETL_Mini_Project_C_Hillman.ipynb
crowdfunding_db_schema.png
crowdfunding_db_schema.sql
contacts.xlsx
crowdfunding.xlsx.
campaign.csv
category.csv
contacts.csv
subcategory.csv
Crowdfunding_ETL/
│
├── data/
│ ├── contacts.xlsx
│ ├── crowdfunding.xlsx
│ └── ...
│
├── notebooks/
│ └── ETL_Mini_Project_C_Hillman.ipynb
│
├── output/
│ ├── campaign.csv
│ ├── category.csv
│ ├── contacts.csv
│ └── subcategory.csv
│
└── schema/
├── crowdfunding_db_schema.png
└── crowdfunding_db_schema.sql
- Load in the
crowdfunding.xlsx
file and create thecategory
andsubcategory
DataFrames. - Amend the crowdfunding data to create the
campaign
DataFrame. - Remove unwanted columns and merge the the
category
&subcategory
DataFrames with the newly createdcampaign
DataFrame. - Rename the date columns and update data types to
datetime
. - Change data types for
goal
andpledged
frominteger
tofloat
. - Load in the
contacts.xlsx
file to create thecontact
DataFrame. - Print out the dictionary of contacts and create columns, split the
name
column intofirst
andlast
, and reorder as instructed using option 1. - Save each DataFrame as a CSV and store them in the
resource
folder. - Using the DataFrame names and column headers, create the schema for the SQL database.
- Use Quick Database Diagrams to save a copy of the schema and the
crowdfunding_db_schema.sql
file. - Create the
crowdfunding_db
database and the 4 tables using the schema. - Load each CSV into the SQL database to complete the database creation.
This project is licensed under the terms of the GNU General Public License v3.0. For more details, see the LICENSE file.