This repository contains the examples and exercises of the course Distributed Architectures for Big Data Processing and Analytics

arcangeloC-137/distributed_architectures_for_big_data

Distributed Architectures for Big Data Processing and Analytics

This repository contains the examples and exercises from the course Distributed Architectures for Big Data Processing and Analytics, part of the Data Science and Engineering program at Politecnico di Torino. The course mainly covers Hadoop MapReduce techniques and introduces the Apache Spark framework.

The exercises and examples cover the following topics:

  • Introduction to Apache Spark;
  • RDD-based programs;
  • Spark SQL and DataFrames;
  • Data mining and Machine learning algorithms with Spark MLlib;
  • GraphX/GraphFrames;
  • Streaming data analytics.
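
As a rough illustration of the MapReduce model the course builds on, the sketch below counts words in plain Python. It is not taken from the course material: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical, and the code runs in a single process rather than on a Hadoop or Spark cluster.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group the emitted values by key, mimicking the step between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Chain the three phases into a single word-count job."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ["big data on spark", "spark and hadoop", "big clusters"]
    print(word_count(sample))
```

In a real framework the map and reduce functions run in parallel on different nodes and the shuffle moves data across the network. Here the three phases are ordinary function calls, so only the data flow is shown.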

Academic year 2021/22
