Code, quizzes, and notes from the DeepLearning.AI Data Engineering Professional Certificate specialization, showcasing practical projects, skills developed, and capstone work in data engineering.

DeepLearning.AI Data Engineering Specialization 🌟

Welcome to my repository for DeepLearning.AI's Data Engineering Professional Certificate! This repo contains code, quizzes, and personal notes from the specialization, documenting my journey in mastering data engineering concepts and tools.

📚 Overview

The Data Engineering Specialization is a comprehensive program designed to equip learners with the skills needed to design, build, and manage data pipelines and architectures. This repository documents my hands-on experience with the course material.

📑 Table of Contents

Courses

Course 1: Introduction to Data Engineering

  • Key Topics:
    • Data engineering lifecycle and undercurrents
    • Designing data architectures on AWS
    • Implementing batch and streaming pipelines
  • Content:
    • Notes on requirements gathering and stakeholder collaboration
    • Code samples for batch and streaming pipelines
    • Architecture diagrams and design considerations
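The batch-pipeline pattern introduced in this course boils down to an extract-transform-load loop. A minimal sketch in plain Python (all function and field names here are illustrative, not from the course labs, which target AWS services):

```python
# Minimal batch ETL sketch. The course labs use managed AWS services;
# this only illustrates the extract -> transform -> load shape.

def extract():
    # Stand-in for reading raw records from a source system.
    return [{"user": "a", "amount": "10"}, {"user": "b", "amount": "25"}]

def transform(records):
    # Cast string fields to proper types and add a derived column.
    return [
        {"user": r["user"],
         "amount": int(r["amount"]),
         "is_large": int(r["amount"]) > 20}
        for r in records
    ]

def load(rows, sink):
    # Stand-in for writing to a warehouse table.
    sink.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
```

A streaming pipeline follows the same three stages, but processes records one event at a time instead of in bulk.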

Course 2: Data Ingestion and DataOps

  • Key Topics:
    • Working with source systems (relational and NoSQL databases)
    • Data ingestion techniques (batch and streaming)
    • DataOps practices (CI/CD, Infrastructure as Code, data quality)
  • Content:
    • Scripts for data ingestion from APIs and message queues
    • Terraform configurations for AWS resources
    • Airflow DAGs for orchestrating data pipelines
    • Data quality tests using Great Expectations
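To make the data-quality bullet concrete: the column-level checks that Great Expectations automates can be sketched in plain Python. The function names below only mimic the style of the real Great Expectations API and are not part of it:

```python
# Plain-Python sketch of two common data-quality expectations.
# Great Expectations provides these as declarative, reusable suites;
# this sketch only shows the underlying assertions.

def expect_values_not_null(rows, column):
    # Every row must have a non-null value in the column.
    return all(r.get(column) is not None for r in rows)

def expect_values_between(rows, column, low, high):
    # Every value in the column must fall within [low, high].
    return all(low <= r[column] <= high for r in rows)

batch = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]
checks = {
    "id_not_null": expect_values_not_null(batch, "id"),
    "age_in_range": expect_values_between(batch, "age", 0, 120),
}
```

In a DataOps setup, checks like these run inside the pipeline (e.g. as an Airflow task) so bad batches fail fast instead of propagating downstream.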

Course 3: Data Storage and Retrieval

  • Key Topics:
    • Storage systems (object, block, file storage)
    • Data lake and data warehouse architectures
    • Query optimization and performance tuning
  • Content:
    • Implementations of data lakehouse architectures
    • Advanced SQL queries and performance comparisons
    • Notes on storage formats and indexing strategies
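One of the tuning levers covered here, the effect of an index on retrieval, can be demonstrated with Python's built-in SQLite module (standing in for the warehouse engines used in the course; table and index names are illustrative):

```python
import sqlite3

# In-memory SQLite database to show how an index changes the query plan.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("alice", 10.0), ("bob", 25.0), ("alice", 40.0)],
)

# Without an index on `customer`, the planner scans the whole table.
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer = 'alice'"
).fetchall()

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

# With the index, the planner searches the index instead of scanning.
plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer = 'alice'"
).fetchall()
```

On large tables the difference between a full scan and an index search is the core of query performance tuning; columnar formats and partitioning (also covered in this course) attack the same problem from the storage side.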

Course 4: Data Modeling and Transformation

  • Key Topics:
    • Data modeling techniques (normalization, star schema, data vault)
    • Transformations for analytics and machine learning
    • Batch and streaming data processing
  • Content:
    • Data models and schemas for different use cases
    • PySpark code for data transformations
    • Preprocessing pipelines for machine learning datasets
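The star schema at the heart of the dimensional-modeling material reduces to joining a fact table against its dimension tables. A plain-Python sketch with made-up tables (PySpark's DataFrame joins do the same thing over distributed data):

```python
# Tiny star schema: one fact table keyed into two dimension tables.
# Table contents and key names are illustrative.

dim_product = {
    1: {"name": "widget", "category": "tools"},
    2: {"name": "gadget", "category": "toys"},
}
dim_date = {20240101: {"year": 2024, "month": 1}}

fact_sales = [
    {"product_id": 1, "date_id": 20240101, "amount": 9.5},
    {"product_id": 2, "date_id": 20240101, "amount": 4.0},
]

# Denormalize each fact row by looking up its dimension attributes.
report = [
    {**row, **dim_product[row["product_id"]], **dim_date[row["date_id"]]}
    for row in fact_sales
]
```

Keeping descriptive attributes in small dimension tables and measurements in a narrow fact table is what makes the star schema cheap to store and fast to aggregate.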

🛠 Skills Developed

  • Data Architecture Design
  • Data Ingestion Techniques
  • DataOps Practices
  • Data Storage and Retrieval
  • Data Modeling
  • Data Transformation and Orchestration

🔧 Technologies Used

  • Programming Languages: Python, SQL
  • Cloud Platforms: AWS
  • Data Processing Frameworks: Apache Spark, PySpark, Pandas
  • Orchestration Tools: Apache Airflow
  • Infrastructure as Code: Terraform
  • Data Quality Tools: Great Expectations
  • Databases and Storage: MySQL, PostgreSQL, MongoDB, Amazon S3
  • Others: REST APIs, Message Queues, Streaming Platforms

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📫 Contact

Feel free to reach out via LinkedIn or email for any questions or collaborations!
