
Airline-Data-Ingestion-Project


Project Overview

This project is a daily airline data ingestion pipeline on AWS. When a daily flight file lands in S3, a CloudTrail S3 notification and an EventBridge pattern rule trigger a Step Functions state machine, which crawls the data with a Glue Crawler, enriches it with a Glue Visual ETL job that joins it against the airport dimension, loads the result into Redshift, and sends a notification through SNS.


Architectural Diagram

AirlineProject


Key Steps

1. Create an S3 bucket

  • We will create an S3 bucket "airline-data-input" to store the airport dimension file and the daily input files (a boto3 sketch follows below). S3
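A minimal sketch of how this bucket could be created with boto3; the bucket name matches the one above, but the region is an assumption:

```python
import boto3

# Assumption: us-east-1 region; adjust to your own account/region.
s3 = boto3.client("s3", region_name="us-east-1")

# us-east-1 does not accept a LocationConstraint, so it is omitted here.
s3.create_bucket(Bucket="airline-data-input")
```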

2. Create a Schema in Redshift

  • We will create an "airlines" schema in Redshift with both tables:

    • airport_dim
    • daily_flights_fact
  • Copy the airports data from S3 to the Redshift airport_dim table (a DDL/COPY sketch follows below).

Redshift
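A hedged sketch of the schema DDL and the COPY step using the Redshift Data API from Python. The cluster identifier, database, user, IAM role ARN, S3 key, and column definitions are illustrative assumptions, not taken from the project:

```python
import boto3

rsd = boto3.client("redshift-data")

# Assumptions: cluster/database/user names, the IAM role, the S3 key, and the
# simplified column lists are placeholders; the real definitions may differ.
statements = [
    "CREATE SCHEMA IF NOT EXISTS airlines;",
    """CREATE TABLE IF NOT EXISTS airlines.airport_dim (
        airport_id BIGINT,
        city       VARCHAR(100),
        state      VARCHAR(100),
        name       VARCHAR(200)
    );""",
    """CREATE TABLE IF NOT EXISTS airlines.daily_flights_fact (
        carrier     VARCHAR(10),
        dep_airport VARCHAR(200),
        arr_airport VARCHAR(200),
        dep_city    VARCHAR(100),
        arr_city    VARCHAR(100),
        dep_delay   BIGINT,
        arr_delay   BIGINT
    );""",
    """COPY airlines.airport_dim
       FROM 's3://airline-data-input/dims/airports.csv'
       IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-s3-read'
       CSV IGNOREHEADER 1;""",
]

rsd.batch_execute_statement(
    ClusterIdentifier="airline-redshift-cluster",  # assumption
    Database="dev",                                # assumption
    DbUser="awsuser",                              # assumption
    Sqls=statements,
)
```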

3. Create Glue Crawlers for Redshift Tables

  • We will create Glue crawlers for the Redshift tables (see the sketch below). RedShift_Crawlers
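A sketch of creating the Redshift crawler through a Glue JDBC connection; the connection name, IAM role, and catalog database name are assumptions:

```python
import boto3

glue = boto3.client("glue")

# Assumptions: the Glue connection "redshift-connection", the IAM role, and
# the catalog database name are illustrative placeholders.
glue.create_crawler(
    Name="redshift-airlines-crawler",
    Role="arn:aws:iam::123456789012:role/glue-service-role",
    DatabaseName="airline_catalog_db",
    Targets={
        "JdbcTargets": [
            {
                "ConnectionName": "redshift-connection",
                "Path": "dev/airlines/%",  # database/schema/table pattern
            }
        ]
    },
)
glue.start_crawler(Name="redshift-airlines-crawler")
```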

4. Create Glue Crawlers for S3 Input Data

  • First we will create a dummy Hive-style folder in our S3 bucket and upload a dummy data file so the crawler can infer the schema (see the sketch after this list).
  • S3 Data Input File S3-crawler
  • S3 Glue Crawler S3-flihgt-data
  • Glue Tables glue-tables
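A similar sketch for the S3 crawler; the prefix layout, role, and catalog database name are assumptions:

```python
import boto3

glue = boto3.client("glue")

# Assumptions: role, catalog database, and the Hive-style "daily-flights/"
# prefix are illustrative; match them to your own bucket layout.
glue.create_crawler(
    Name="s3-flight-data-crawler",
    Role="arn:aws:iam::123456789012:role/glue-service-role",
    DatabaseName="airline_catalog_db",
    Targets={
        "S3Targets": [
            {"Path": "s3://airline-data-input/daily-flights/"}
        ]
    },
)
```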

5. Create Glue ETL Pipeline

  • We will create a Glue pipeline "flight-data-ingestion-pipeline" in which we join the daily flight data with airport_dim and create a denormalized table for further analysis (a PySpark sketch follows below). glue-pipeline
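The pipeline is built in Glue Visual ETL; the generated script looks roughly like the PySpark sketch below. The catalog table names, join keys, column names, and the Redshift connection are assumptions:

```python
import sys
from awsglue.transforms import Join
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glueContext = GlueContext(sc)
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# Read the daily flight data and the airport dimension from the Glue catalog.
# Assumption: these catalog table names come from the two crawlers above.
daily = glueContext.create_dynamic_frame.from_catalog(
    database="airline_catalog_db", table_name="daily_flights"
)
airports = glueContext.create_dynamic_frame.from_catalog(
    database="airline_catalog_db", table_name="airlines_airport_dim"
)

# Join on the departure airport id to denormalize the daily data.
# Assumption: column names are illustrative.
joined = Join.apply(daily, airports, "origin_airport_id", "airport_id")

# Write the result to the Redshift fact table through the Glue connection.
glueContext.write_dynamic_frame.from_jdbc_conf(
    frame=joined,
    catalog_connection="redshift-connection",
    connection_options={"dbtable": "airlines.daily_flights_fact", "database": "dev"},
    redshift_tmp_dir="s3://airline-data-input/temp/",
)

job.commit()
```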

6. Create an SNS Topic

  • We will create an SNS topic "glue-job-notification" and subscribe your email address to it to receive notifications (see the sketch below). SNS
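A sketch of creating the topic and the email subscription with boto3; the email address is a placeholder:

```python
import boto3

sns = boto3.client("sns")

# Create the topic and subscribe an email endpoint.
# Assumption: the address is a placeholder; the subscription must be
# confirmed from the inbox before notifications are delivered.
topic = sns.create_topic(Name="glue-job-notification")
sns.subscribe(
    TopicArn=topic["TopicArn"],
    Protocol="email",
    Endpoint="you@example.com",
)
```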

7. Create State Machine using Step Functions

  • We will create a State Machine using the Step Functions service and build the complete workflow in it (a sketch of the definition follows below). StateMachine
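A hedged sketch of what the state machine could look like, expressed in Python with boto3. The README does not list the exact states, so the crawler-then-job-then-notify sequence, names, and ARNs below are assumptions based on the architecture:

```python
import json
import boto3

sfn = boto3.client("stepfunctions")

# Assumption: a simple three-step workflow. The startCrawler call returns
# immediately; a real workflow would typically poll the crawler state or
# add a Wait state before running the job.
definition = {
    "StartAt": "StartCrawler",
    "States": {
        "StartCrawler": {
            "Type": "Task",
            "Resource": "arn:aws:states:::aws-sdk:glue:startCrawler",
            "Parameters": {"Name": "s3-flight-data-crawler"},
            "Next": "RunGlueJob",
        },
        "RunGlueJob": {
            "Type": "Task",
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {"JobName": "flight-data-ingestion-pipeline"},
            "Next": "NotifySuccess",
        },
        "NotifySuccess": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sns:publish",
            "Parameters": {
                "TopicArn": "arn:aws:sns:us-east-1:123456789012:glue-job-notification",
                "Message": "Flight data ingestion succeeded.",
            },
            "End": True,
        },
    },
}

sfn.create_state_machine(
    name="airline-data-ingestion-state-machine",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/stepfunctions-exec-role",
)
```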

8. Create an EventBridge Rule

  • We will create an EventBridge rule "Airline-S3-StepFunc-EventRule" which will trigger the State Machine on object creation in the airline data S3 bucket (see the sketch below). EventRule
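A sketch of the rule and its target. The event pattern below assumes S3 EventBridge notifications are enabled on the bucket; with the CloudTrail-based approach in the architecture, the pattern would match "AWS API Call via CloudTrail" with eventName "PutObject" instead. ARNs are placeholders:

```python
import json
import boto3

events = boto3.client("events")

# Assumption: S3 emits "Object Created" events for this bucket via EventBridge.
pattern = {
    "source": ["aws.s3"],
    "detail-type": ["Object Created"],
    "detail": {"bucket": {"name": ["airline-data-input"]}},
}

events.put_rule(
    Name="Airline-S3-StepFunc-EventRule",
    EventPattern=json.dumps(pattern),
    State="ENABLED",
)

# Point the rule at the state machine; the role must allow states:StartExecution.
events.put_targets(
    Rule="Airline-S3-StepFunc-EventRule",
    Targets=[
        {
            "Id": "StartAirlineStateMachine",
            "Arn": "arn:aws:states:us-east-1:123456789012:stateMachine:airline-data-ingestion-state-machine",
            "RoleArn": "arn:aws:iam::123456789012:role/eventbridge-sfn-invoke-role",
        }
    ],
)
```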

9. Upload the Input Flight Data CSV File

  • We will upload the input flight data CSV file to the input S3 bucket, which will trigger the State Machine via the EventBridge rule (a sketch follows after this list).

  • S3 file fileUpload

  • Glue Job Run glueJobRun

  • State Machine Run StateMachineSuccess
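A minimal upload sketch; the local file name and the "daily-flights/" key prefix are assumptions that should match the layout the S3 crawler was built on:

```python
import boto3

s3 = boto3.client("s3")

# Assumption: file name and key prefix are illustrative.
s3.upload_file(
    Filename="flights_2024-01-01.csv",
    Bucket="airline-data-input",
    Key="daily-flights/date=2024-01-01/flights.csv",
)
```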

10. Output

  • We can see the success notification in the subscribed mailbox. mailNoti

  • We can see the output data stored in the Redshift fact table (a quick verification query is sketched below). factData
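A small verification query via the Redshift Data API; cluster, database, and user names are the same placeholders used earlier:

```python
import time
import boto3

rsd = boto3.client("redshift-data")

# Assumptions: cluster, database, and user names are placeholders.
resp = rsd.execute_statement(
    ClusterIdentifier="airline-redshift-cluster",
    Database="dev",
    DbUser="awsuser",
    Sql="SELECT COUNT(*) FROM airlines.daily_flights_fact;",
)

# Wait for the statement to finish, then print the row count.
while rsd.describe_statement(Id=resp["Id"])["Status"] not in ("FINISHED", "FAILED", "ABORTED"):
    time.sleep(1)

result = rsd.get_statement_result(Id=resp["Id"])
print(result["Records"][0][0]["longValue"])
```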