giulic3/data-engineering-nanodegree

data-engineering-nanodegree

This is a collection of the projects completed while following the syllabus of the Data Engineering Nanodegree offered by Udacity (https://www.udacity.com/course/data-engineer-nanodegree--nd027).

Course overview and projects

The course is divided into 4 blocks of lessons. Each block consists of a theoretical introduction to various topics, a series of demos for hands-on practice with the concepts covered, and one or two projects:

1. Data Modeling

  • Introduction to Data Modeling
  • Relational Data Models
  • [Proj1]: Data Modeling with Postgres
  • NoSQL Data Models
  • [Proj2]: Data Modeling with Apache Cassandra

2. Cloud Data Warehouses

  • Introduction to Data Warehouses
  • Introduction to Cloud Computing and AWS
  • Implementing Data Warehouses on AWS
  • [Proj3]: Data Warehouse

3. Data Lakes with Spark

  • The Power of Spark
  • Data Wrangling with Spark
  • Debugging and Optimization
  • Introduction to Data Lakes
  • [Proj4]: Data Lake

4. Data Pipelines with Airflow

  • Data Pipelines
  • Data Quality
  • Production Data Pipelines
  • [Proj5]: Data Pipelines

5. Bonus: [CapstoneProject] - ####