This repository contains all project materials for the University of Pennsylvania course CIS 545: Big Data Analytics.
Project 1 - Data Wrangling.ipynb: use Pandas for data manipulation and analysis.
Project 2 - Database Manipulation.ipynb: use pandasql and Spark to work with graph data and traverse relationships.
Project 3 - Spark SQL.ipynb: use Spark with an EMR cluster to manipulate LinkedIn and stock data.
Project 4 - Machine Leaning.ipynb: use scikit-learn for machine learning and Spark ML for scalable machine learning.
Project 5 - Deep Learning.ipynb: use MXNet to build neural networks for image recognition.