
Design data models, build data warehouses, data lakes & lakehouse, automate data pipelines - SQL | NoSQL | AWS | Spark | Airflow


phphoebe/Udacity-Data-Engineering-with-AWS


Udacity Data Engineering with AWS Nanodegree

Design data models, build data warehouses and data lakes, automate data pipelines, and manage massive datasets.

  • Create user-friendly Relational and NoSQL data models
  • Create scalable and efficient data warehouses
  • Work efficiently with massive datasets
  • Build and interact with a cloud-based data lake
  • Automate and monitor data pipelines
  • Develop proficiency in Spark, Airflow, and AWS tools

Create Relational and NoSQL data models to fit the diverse needs of data consumers. Use ETL to build databases in PostgreSQL and Apache Cassandra.
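The ETL flow described above can be sketched in miniature. This is a minimal, illustrative example using Python's built-in `sqlite3` as a stand-in for PostgreSQL; the table and column names (`users`, `plays`) are hypothetical, not from the course projects.

```python
import sqlite3

# A minimal ETL sketch using sqlite3 as a stand-in for PostgreSQL.
# Table and column names (users, plays) are illustrative only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Model: a user dimension plus a fact-like table referencing it.
cur.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("""CREATE TABLE plays (
    play_id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(user_id),
    song TEXT)""")

# Extract (raw records), Transform (split into entities), Load (insert).
raw = [{"user": "ana", "song": "Blue"}, {"user": "ana", "song": "Red"}]
cur.execute("INSERT INTO users (user_id, name) VALUES (1, 'ana')")
for i, rec in enumerate(raw, start=1):
    cur.execute("INSERT INTO plays VALUES (?, 1, ?)", (i, rec["song"]))

cur.execute(
    "SELECT name, COUNT(*) FROM plays JOIN users USING (user_id) GROUP BY name"
)
print(cur.fetchall())  # [('ana', 2)]
```

A Cassandra model would instead be designed query-first, with one denormalized table per access pattern rather than joined entities.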

Lessons

  1. Introduction to Data Modeling
  2. Relational Data Models
  3. NoSQL Data Models

Projects

Create cloud-based data warehouses. Sharpen data warehousing skills, deepen your understanding of data infrastructure, and get an introduction to data engineering on the cloud using Amazon Web Services (AWS).

Lessons

  1. Introduction to Data Warehouses
  2. ELT and Data Warehouse Technology in the Cloud
  3. AWS Data Technologies
  4. Implementing Data Warehouses on AWS

Project

Build a data lake on AWS and a data catalog following the principles of data lakehouse architecture. Learn about the big data ecosystem and the power of Apache Spark for data wrangling and transformation. Work with AWS data tools and services to extract, load, process, query, and transform semi-structured data in data lakes.
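At course scale, the wrangling described above is done with Apache Spark; the pure-stdlib sketch below only shows the shape of one common transformation, flattening nested semi-structured JSON into rows (what Spark's `explode()` does for array columns). The record layout and field names (`user`, `events`) are illustrative assumptions.

```python
import json

# Flattening semi-structured records, as Spark would do at scale.
# The nested "events" field and sample data are illustrative only.
raw = """[
  {"user": "ana", "events": [{"type": "click", "ts": 1}, {"type": "view", "ts": 2}]},
  {"user": "bo",  "events": [{"type": "click", "ts": 3}]}
]"""

records = json.loads(raw)
# "Explode" each nested event into its own flat row.
flat = [{"user": r["user"], **e} for r in records for e in r["events"]]
print(flat[0])   # {'user': 'ana', 'type': 'click', 'ts': 1}
print(len(flat))  # 3
```

In a lakehouse, rows like these would be written back to S3 in a columnar format and registered in a data catalog (e.g. AWS Glue) for querying.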

Lessons

  1. Big Data Ecosystem, Data Lakes, & Spark
  2. Spark Essentials
  3. Using Spark & Data Lakes in the AWS Cloud
  4. Ingesting & Organizing Data in Lakehouse Architecture on AWS

Project

Dive into the concept of data pipelines.

  • Focus on applying data pipeline concepts through Apache Airflow, covering data validation and DAGs.
  • Venture into AWS-specific concepts such as copying S3 data, connections and hooks, and Redshift Serverless.
  • Explore data quality through data lineage, data pipeline schedules, and data partitioning.
  • Put data pipelines into production by extending Airflow with plugins, implementing task boundaries, and refactoring DAGs.
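An Airflow DAG is, at its core, a set of tasks with upstream dependencies that the scheduler resolves into a valid execution order. As a minimal stdlib sketch of that idea (not Airflow's actual API), Python's `graphlib` can compute the same ordering; the task names below are illustrative assumptions.

```python
from graphlib import TopologicalSorter

# A DAG of pipeline tasks, in the spirit of an Airflow DAG definition.
# Each task name maps to the set of tasks it depends on (its upstreams).
dag = {
    "stage_to_s3": set(),
    "copy_to_redshift": {"stage_to_s3"},
    "run_quality_checks": {"copy_to_redshift"},
}

# Airflow's scheduler resolves this kind of ordering before running tasks.
order = list(TopologicalSorter(dag).static_order())
print(order)  # ['stage_to_s3', 'copy_to_redshift', 'run_quality_checks']
```

In Airflow itself, the same dependencies would be declared with operators and the `>>` bitshift syntax inside a `DAG` context.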

Lessons

  1. Data Pipelines
  2. Airflow & AWS
  3. Data Quality
  4. Production Data Pipelines

Project


See the Program Syllabus; more information about this program can be found by visiting Udacity Data Engineering ND.
