Sparkify is a fictional music application that stores songs and users' activity logs in separate JSON files. As the application grew, it became extremely difficult for the company to manage and benefit from these files. The suggested solution is to move the data to the cloud. This project uses Amazon Web Services (AWS).
The ETL pipeline performs the following steps:
- Load credentials
- Read data from the source S3 bucket
- Transform the data by creating five separate tables
- Load the tables into a new S3 bucket
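As an illustration of the credentials step, here is a minimal sketch that parses AWS keys from INI-style text and exports them as environment variables. The section name (AWS), the key names, and the sample values are assumptions for illustration only; the actual configuration used by etl.py may differ.

```python
import configparser
import os

# Hypothetical config contents; the real file name and layout are not
# specified in this README.
SAMPLE_CONFIG = """
[AWS]
AWS_ACCESS_KEY_ID = EXAMPLEKEYID
AWS_SECRET_ACCESS_KEY = examplesecret
"""


def load_credentials(config_text):
    """Parse AWS keys from INI-style text and export them as env variables."""
    # interpolation=None keeps any '%' characters in a key from being
    # treated as configparser interpolation syntax.
    config = configparser.ConfigParser(interpolation=None)
    config.read_string(config_text)

    # Section and option names below are assumptions, not taken from etl.py.
    key_id = config["AWS"]["AWS_ACCESS_KEY_ID"]
    secret = config["AWS"]["AWS_SECRET_ACCESS_KEY"]

    # AWS client libraries commonly read these environment variables.
    os.environ["AWS_ACCESS_KEY_ID"] = key_id
    os.environ["AWS_SECRET_ACCESS_KEY"] = secret
    return key_id, secret


if __name__ == "__main__":
    key_id, _ = load_credentials(SAMPLE_CONFIG)
    print("Loaded access key id:", key_id)
```

In practice the text would come from a config file kept out of version control rather than from a string in the script.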
Songs table files are partitioned by year and then artist. Time table files are partitioned by year and month. Songplays table files are partitioned by year and month.
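To make the partitioning concrete, the sketch below builds the directory path a partitioned file would land in, assuming a Hive-style key=value layout (the layout produced by writers such as Spark's partitionBy). This README does not state which writer is used, and the root path, column names (year, artist_id, month), and values here are hypothetical.

```python
def partition_path(root, table, partitions):
    """Build a Hive-style partition directory path.

    partitions is an ordered list of (column, value) pairs; the order
    determines the nesting of the directories.
    """
    segments = [root.rstrip("/"), table]
    segments += [f"{column}={value}" for column, value in partitions]
    return "/".join(segments)


if __name__ == "__main__":
    # Songs: partitioned by year, then artist (column names are assumed).
    print(partition_path("output", "songs", [("year", 2018), ("artist_id", "AR123")]))
    # Time and songplays: partitioned by year, then month.
    print(partition_path("output", "time", [("year", 2018), ("month", 11)]))
```

Under this layout, a query that filters on year (and month or artist) only needs to read the matching directories instead of scanning every file.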
To run the code, follow these steps in order:
- Open a terminal (or Bash on Windows)
- Run python etl.py, then wait until processing is complete