Designed a cloud-based ETL pipeline on AWS that analyzes birthday data, automating ingestion, feature engineering, and storage with S3, Lambda, and RDS; insights are visualized via a Streamlit dashboard.
- Raw data is ingested and stored in Amazon S3
- AWS Lambda functions automate ingestion and trigger ETL logic
- Data is cleaned, transformed, and feature-engineered using Python in a Lambda function
- Processed data is loaded into Amazon RDS
- An EC2-hosted Streamlit dashboard queries RDS for analytics and visualization
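The transform step above can be sketched as follows. This is a minimal illustration, not the repository's actual handler: the event shape, feature names, and the stubbed S3/RDS calls are all assumptions.

```python
import csv
import io
from datetime import date

def engineer_features(row):
    """Derive example features from a raw birthday record.
    Feature names here are illustrative, not the repo's actual schema."""
    bday = date.fromisoformat(row["birthday"])
    return {
        **row,
        "birth_month": bday.month,
        "birth_weekday": bday.strftime("%A"),
        "birth_quarter": (bday.month - 1) // 3 + 1,
    }

def handler(event, context):
    """Hypothetical Lambda entry point, triggered by an S3 PUT event.
    The S3 read and RDS write are stubbed out as comments."""
    # obj = s3.get_object(Bucket=..., Key=event["Records"][0]["s3"]["object"]["key"])
    # raw = obj["Body"].read().decode()
    raw = "name,birthday\nAda,1815-12-10\n"  # placeholder CSV for illustration
    rows = [engineer_features(r) for r in csv.DictReader(io.StringIO(raw))]
    # ...insert `rows` into RDS here (e.g. via pymysql / psycopg2)...
    return {"processed": len(rows)}
```

In the deployed pipeline, the placeholder CSV would come from the S3 object that triggered the invocation, and the processed rows would be written to the RDS table the dashboard reads from.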
The demo shows:
- Data ingestion via AWS services
- ETL pipeline execution
- Data storage and schema in RDS
- Streamlit dashboard consuming processed data
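The dashboard's read path is essentially "query RDS, render a chart". A minimal sketch of that pattern, using an in-memory SQLite database in place of RDS so it runs standalone; the table and column names are assumptions:

```python
import sqlite3

def birthdays_per_month(conn):
    """Aggregate processed records by birth month -- the kind of
    query the Streamlit dashboard would run against RDS."""
    cur = conn.execute(
        "SELECT birth_month, COUNT(*) FROM birthdays "
        "GROUP BY birth_month ORDER BY birth_month"
    )
    return dict(cur.fetchall())

# Stand-in for the RDS instance; schema is illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE birthdays (name TEXT, birth_month INTEGER)")
conn.executemany(
    "INSERT INTO birthdays VALUES (?, ?)",
    [("Ada", 12), ("Grace", 12), ("Alan", 6)],
)

counts = birthdays_per_month(conn)
# In the real app, `counts` would feed a Streamlit call such as st.bar_chart
```

In the deployed app, `sqlite3.connect` would be replaced by a connection to the RDS endpoint (e.g. via `pymysql` or `psycopg2`), with credentials kept out of the repository.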
Due to cloud cost and security considerations, AWS resources are not kept live. This repository focuses on pipeline logic, architecture, and system design rather than deployment.