The aim of this project is to build an automated pipeline that ingests the Boston Crime Data (https://data.boston.gov/dataset/crime-incident-reports-august-2015-to-date-source-new-system), applies the appropriate transformations, and loads the result into a SQL Server database.
• Write a function to extract Boston Crime Data Files.
• Transform the data while maintaining the control number.
• Load the data into SQL Server.
• Maintain a log file containing timestamps for every stage of the ETL process.
• Paste the link to the data from the website.
• Enter the server name, database name, and any other details needed to create the working folder and connect to SQL Server.
• Run all the cells to perform the ETL.
• Resolved data quality issues by minimizing redundancy, disparity, and errors.
• Acquired valuable insights and experience in automating data ingestion and data quality validation.
• Include data from various sources.
• Carry out analysis of the transformed data in Power BI (https://github.com/saran820/Boston-Crime-Data-Analysis-Report).
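
The extract, transform, load, and logging steps listed above can be sketched roughly as follows. This is a minimal illustration and not the project's actual notebook code. The sample rows, the column names, the table name crime_incidents, and the function names are all hypothetical, and INCIDENT_NUMBER is only assumed to be the control number. The standard-library sqlite3 module stands in for SQL Server so that the sketch runs without a database server; a real run would open a SQL Server connection instead.

```python
import csv
import io
import logging
import sqlite3

# Every log entry carries a timestamp, mirroring the timestamped ETL log.
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("etl")

# Hypothetical sample rows; the column names are assumptions for illustration.
SAMPLE_CSV = """INCIDENT_NUMBER,OFFENSE_DESCRIPTION,DISTRICT
I001, larceny ,B2
I001, larceny ,B2
I002,Vandalism,
"""


def extract(text):
    """Parse CSV text into a list of row dictionaries."""
    rows = list(csv.DictReader(io.StringIO(text)))
    log.info("extract: %d rows read", len(rows))
    return rows


def transform(rows):
    """Trim whitespace, upper-case descriptions, map empty districts to None,
    and drop exact duplicates while keeping the assumed control number."""
    seen = set()
    cleaned = []
    for row in rows:
        record = (
            row["INCIDENT_NUMBER"].strip(),
            row["OFFENSE_DESCRIPTION"].strip().upper(),
            row["DISTRICT"].strip() or None,
        )
        if record in seen:
            continue  # redundant row, skip it
        seen.add(record)
        cleaned.append(record)
    log.info("transform: %d rows kept, %d dropped", len(cleaned), len(rows) - len(cleaned))
    return cleaned


def load(records, conn):
    """Insert the cleaned records into a (hypothetical) crime_incidents table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS crime_incidents ("
        "incident_number TEXT, offense_description TEXT, district TEXT)"
    )
    conn.executemany("INSERT INTO crime_incidents VALUES (?, ?, ?)", records)
    conn.commit()
    log.info("load: %d rows inserted", len(records))


def run_pipeline(text, conn):
    """Run extract, transform, and load in order; return the table row count."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT COUNT(*) FROM crime_incidents").fetchone()[0]


if __name__ == "__main__":
    # sqlite3 in-memory database used as a stand-in for SQL Server.
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(SAMPLE_CSV, connection))
```

Keeping each stage in its own function means each one can be logged and tested separately, and only load would need to change when switching from the stand-in database to SQL Server.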