This project builds a batch data pipeline on AWS using CloudFormation, Aurora MySQL, Step Functions, Lambda, S3, Athena, and QuickSight. The primary goal was to learn the fundamentals of setting up and managing a data pipeline on AWS. To read more about this project, check out my blog post, "Building My First AWS Batch Data Pipeline: A Hands-On Journey with CloudFormation, Step Functions, and More."
- Provisioned an S3 data lake bucket, an Aurora MySQL instance, and a Step Functions state machine using CloudFormation.
- Deployed these resources as a single stack to understand infrastructure as code (IaC) and AWS resource management.
- Created a batch processing pipeline in which a Step Functions state machine orchestrates a Lambda function.
- Exported data from Aurora MySQL to S3, demonstrating the ETL process.
- Defined Athena tables that match the schema of the data exported to S3.
- Connected Athena tables to QuickSight for data visualization.
- Learned the importance of detailed architecture planning and permissions management.
- Gained practical experience with various AWS services and IaC.
- Next, develop deeper expertise in pipeline design and event-driven architectures.
- Next, strengthen understanding of IAM roles and permissions for smoother deployments.
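
To make the batch pipeline bullets above more concrete, here is a minimal Python sketch of the two pieces involved: an Amazon States Language definition with a single Lambda task, and the helper logic such a Lambda might use to turn query results into a CSV object for S3. It is not the project's actual code. The state name `ExportTable`, the retry settings, the `raw/<table>/dt=<date>/` key layout, and the function names are all assumptions chosen for illustration.

```python
import csv
import io
import json
from datetime import date


def build_state_machine(lambda_arn: str) -> dict:
    """Return an Amazon States Language definition with one Lambda task.

    The state name and retry policy are illustrative assumptions.
    """
    return {
        "Comment": "Batch export from Aurora MySQL to S3 (illustrative)",
        "StartAt": "ExportTable",
        "States": {
            "ExportTable": {
                "Type": "Task",
                "Resource": lambda_arn,
                # Retry the task a couple of times before failing the run.
                "Retry": [
                    {
                        "ErrorEquals": ["States.TaskFailed"],
                        "IntervalSeconds": 30,
                        "MaxAttempts": 2,
                        "BackoffRate": 2.0,
                    }
                ],
                "End": True,
            }
        },
    }


def rows_to_csv(columns, rows) -> str:
    """Serialize a header and row tuples to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()


def object_key(table: str, run_date: date) -> str:
    """Build a date-partitioned S3 key (hypothetical layout)."""
    return f"raw/{table}/dt={run_date.isoformat()}/{table}.csv"


if __name__ == "__main__":
    # Placeholder ARN; a real stack would supply the deployed function's ARN.
    arn = "arn:aws:lambda:us-east-1:123456789012:function:export-table"
    print(json.dumps(build_state_machine(arn), indent=2))
    print(object_key("orders", date(2024, 1, 2)))
    print(rows_to_csv(["id", "name"], [(1, "alice"), (2, "bob")]))
```

In a deployed Lambda, the rows would come from a query against Aurora MySQL and the CSV text would be uploaded to the data lake bucket, for example with the boto3 S3 client's `put_object`. Those parts are left out here so the sketch runs without AWS credentials.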
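
The Athena step listed above can be sketched in the same spirit. The helper below generates a `CREATE EXTERNAL TABLE` statement for comma-delimited files with a header row. The file format, database name, table name, columns, and bucket path are assumptions for illustration, since the actual schema is not shown here.

```python
def athena_ddl(database: str, table: str, columns, location: str) -> str:
    """Build an Athena DDL statement for comma-delimited text files in S3.

    `columns` is a list of (name, athena_type) pairs; `location` is an
    S3 prefix ending in a slash. All inputs here are hypothetical.
    """
    cols = ",\n  ".join(f"`{name}` {dtype}" for name, dtype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"  {cols}\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{location}'\n"
        # Skip the CSV header line written by the export step.
        "TBLPROPERTIES ('skip.header.line.count'='1')"
    )


if __name__ == "__main__":
    print(
        athena_ddl(
            "demo_db",
            "orders",
            [("id", "int"), ("name", "string")],
            "s3://example-data-lake/raw/orders/",
        )
    )
```

The resulting statement can be run from the Athena query editor. Once the table exists, it can be selected as a dataset in QuickSight through the Athena data source.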