In this track, you’ll discover how to build an effective data architecture, streamline data processing, and maintain large-scale data systems. In addition to working with Python, you’ll also grow your language skills as you work with Shell, SQL, and Scala to create data engineering pipelines, automate common file system tasks, and build a high-performance database.
Through hands-on exercises, you’ll add cloud and big data tools such as AWS Boto, PySpark, Spark SQL, and MongoDB to your data engineering toolkit, helping you create and query databases, wrangle data, and configure schedules to run your pipelines. By the end of this track, you’ll have mastered the critical database, scripting, and process skills you need to advance your career.
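To make the ingest–wrangle–load–query pattern mentioned above concrete, here is a minimal sketch in plain Python. It is not taken from any course in the track: it uses only the standard library's `sqlite3` so it runs anywhere, whereas the courses below teach the same pattern with pandas, PySpark, and production databases. The sample records are invented for illustration.

```python
import sqlite3

# Raw "ingested" records, as they might arrive from a CSV file:
# note the stray whitespace and string-typed amounts.
raw_rows = [
    ("alice", "2024-01-03", " 19.99"),
    ("bob", "2024-01-03", "5.00 "),
    ("alice", "2024-01-04", "12.50"),
]

# Wrangle: strip whitespace and cast each amount to a float.
clean_rows = [
    (name, day, float(amount.strip())) for name, day, amount in raw_rows
]

# Load into an in-memory database and query it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, day TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", clean_rows)

# Query: total spend per customer.
total_per_customer = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(total_per_customer)
conn.close()
```

In a real pipeline each stage would be a separate, scheduled task (the Airflow course covers exactly that), but the shape of the work — extract, clean, load, query — stays the same.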
- Data Engineering for Everyone
- Introduction to Data Engineering
- Streamlined Data Ingestion with pandas
- Writing Efficient Python Code
- Writing Functions in Python
- Introduction to Shell
- Data Processing in Shell
- Introduction to Bash Scripting
- Unit Testing for Data Science in Python
- Object-Oriented Programming in Python
- Introduction to Airflow in Python
- Introduction to PySpark
- Building Data Engineering Pipelines in Python
- Introduction to AWS Boto in Python
- Introduction to Relational Databases in SQL
- Database Design
- Introduction to Scala
- Big Data Fundamentals with PySpark
- Cleaning Data with PySpark
- Introduction to Spark SQL in Python
- Cleaning Data in SQL Server Databases
- Transactions and Error Handling in SQL Server
- Building and Optimizing Triggers in SQL Server
- Improving Query Performance in SQL Server
- Introduction to MongoDB in Python