I'm a Data Engineer & Senior Data Analyst with 6+ years of experience designing scalable ETL/ELT pipelines, building modern data lakehouse architectures, and implementing cloud-based data platforms.
My expertise spans GCP, AWS, Azure Fabric, and large-scale analytics engineering.
I am passionate about solving data quality challenges, automating data workflows, and building reliable, high-performing data systems.
Programming: Python • SQL • Unix/Shell
Cloud Platforms: Google Cloud (BigQuery, Pub/Sub, Dataproc, Dataflow) • AWS (S3, EMR, Glue, Athena, Redshift) • Azure Fabric • Snowflake • SSIS • Alteryx
Databases: MySQL • PostgreSQL • SQL Server • Teradata • MongoDB
Reporting / Visualization: Power BI • Tableau • SSRS • MS Excel • Matplotlib • Plotly • Seaborn
Frameworks & Tools: Apache Spark • dbt • Airflow • Hadoop • Flask • Soda • Great Expectations
CI/CD & DevOps: GitHub • GitHub Actions • Docker • Kubernetes • Terraform
- DataEngineering-portfolio: End-to-end cloud ETL/ELT architectures, pipeline designs, & reusable components
- Airflow Retail Pipeline (BigQuery + dbt + Soda): Retail analytics pipeline with automated data quality
- dashboard-portfolio: Power BI, Excel & Tableau dashboards
- Impact of Covid-19 on Digital Learning: Storytelling visualization project using Python, Matplotlib, and Seaborn
- Clustering Profiles using Data Quality Metrics: ML-based profiling for assessing data trust
- Improving Data Quality Metrics: Framework covering completeness, validity, and uniqueness
- Case Study on Data Governance: Governance policies, lineage models, and stewardship
- CI-CD-portfolio: GitHub Actions, Docker, and Kubernetes automation for pipelines
- Big Mart Sales Prediction: Regression models with feature engineering
- Google Cloud Professional Data Engineer
- Microsoft PL-300: Power BI Data Analyst
- Microsoft DP-600: Fabric Analytics Engineer
Let's collaborate on Data Engineering, Cloud, Data Quality, and Modern Analytics projects!