Module 5 - Data Science and Machine Learning

This module introduces the fundamental concepts of data science and machine learning using Spark and the Spark Machine Learning library. By the end of the course, students should understand the fundamentals of machine learning and be able to apply Spark to machine learning and data science tasks, such as predicting trends and patterns in massive data sets.

| Lesson | Title | Lab | Objectives |
| --- | --- | --- | --- |
| 1 | Data Science and Machine Learning Overview | | The basics of data science. Machine learning basics. Machine learning feature representation and modeling. |
| 2 | Spark MLlib and Data Types | | The fundamentals of Spark. Major components of Spark programming. How Spark Machine learning library works. Spark data types. |
| 3 | Spark ML Overview and Spark on Azure | Lab | Recognize the Spark ML API. Demonstrate how a Spark Cluster is configured on top of HDInsight Cluster. Explain some features available in Azure HDInsight Spark Clusters. |
| 4 | Spark MLlib Basic Statistics | Lab | How to use basic statistics functions provided by MLlib. The input data types for these functions. How data types affect the functionality of the statistical methods. |
| 5 | Clustering | Lab | Understand what a clustering algorithm does. Understand supervised and unsupervised learning. Recognize the K-Means algorithm. Run K-Means on Spark MLlib. |
| 6 | Regression | | Formally define the regression model. Define how to model using simple linear regression. Understand how to model using linear regression. Understand overfitting and underfitting the model. Understand what a regularization term accomplishes. |
| 7 | Regression and Classification | Lab | Explain what regularizers accomplish. Understand cross-validation procedures. Understand nested cross-validation procedures. Define a classification problem. Represent classification errors. Explain loss functions. Understand logistic regression. Utilize Spark MLlib to implement logistic regression. |
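
As a small preview of the Lesson 6 objective on regularization, the sketch below fits a simple linear regression with and without an L2 penalty on the slope. It is plain Python rather than Spark MLlib, and it is not taken from the course materials: the function name `fit_ridge_1d`, the parameter `lam`, and the toy data are all assumptions made for illustration.

```python
def fit_ridge_1d(xs, ys, lam=0.0):
    """Fit y ~ intercept + slope * x by least squares.

    The penalty lam * slope**2 is added to the squared error; the
    intercept is left unpenalized. With lam=0 this reduces to ordinary
    simple linear regression. (Hypothetical helper, for illustration.)
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Centered sums used by the closed-form solution.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    # The penalty enlarges the denominator, shrinking the slope toward 0.
    slope = sxy / (sxx + lam)
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Toy data lying exactly on y = 1 + 2x (made up for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

plain = fit_ridge_1d(xs, ys, lam=0.0)    # recovers the true line
shrunk = fit_ridge_1d(xs, ys, lam=10.0)  # slope pulled toward zero

print("no penalty:  intercept=%.2f slope=%.2f" % plain)
print("lam = 10.0:  intercept=%.2f slope=%.2f" % shrunk)
```

Increasing `lam` trades fit on the training data for a smaller slope, which is the mechanism regularizers use to reduce overfitting.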
