Solutions to the assignments for INF 553: Foundations and Applications of Data Mining (Summer 2020)
Assignment1 -
-> Get familiar with Spark operations (e.g., transformations and actions) and MapReduce
-> Uses the Yelp review and business datasets
Assignment2 -
-> Implement the SON algorithm using the Apache Spark framework
-> Develop a program to find frequent itemsets in two datasets: a simulated dataset and a real-world dataset generated from the Yelp dataset
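
For orientation, here is a minimal sketch of the two-pass SON idea in plain Python. It is not the Spark solution from this repository: the function names (`local_frequent`, `son`), the brute-force local counting (a stand-in for A-Priori), and the round-robin chunking are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def local_frequent(baskets, threshold):
    """Brute-force frequent itemsets inside one chunk (hypothetical stand-in for A-Priori)."""
    frequent = set()
    size = 1
    while True:
        counts = Counter()
        for basket in baskets:
            for combo in combinations(sorted(set(basket)), size):
                counts[combo] += 1
        found = {combo for combo, n in counts.items() if n >= threshold}
        if not found:
            # No frequent itemsets of this size, so none of any larger size either.
            return frequent
        frequent |= found
        size += 1


def son(baskets, support, num_chunks=2):
    """Return all itemsets that appear in at least `support` baskets."""
    total = len(baskets)
    chunks = [baskets[i::num_chunks] for i in range(num_chunks)]

    # Pass 1: an itemset is a candidate if it is frequent in at least one chunk,
    # using a support threshold scaled to the chunk size (integer ceiling).
    candidates = set()
    for chunk in chunks:
        scaled = (support * len(chunk) + total - 1) // total
        candidates |= local_frequent(chunk, scaled)

    # Pass 2: count every candidate over the full dataset and drop false positives.
    counts = Counter()
    for basket in baskets:
        items = set(basket)
        for candidate in candidates:
            if items.issuperset(candidate):
                counts[candidate] += 1
    return sorted(c for c, n in counts.items() if n >= support)
```

In a Spark version, pass 1 and pass 2 would typically each run per partition (for example with `mapPartitions`) followed by a reduce step; the sketch keeps both passes in a single process so the logic is easy to follow.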
Assignment3 -
-> Implement MinHash and Locality-Sensitive Hashing (LSH)
-> Implement a content-based recommendation system and a collaborative filtering recommendation system using the Yelp review dataset
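
The MinHash and LSH step can be sketched in plain Python as follows. This is an illustrative sketch, not the code in this repository: the hash family `(a * x + b) % prime`, the prime `10007`, the band and row counts, and all function names are assumptions, and items are assumed to be already mapped to integer ids.

```python
import random
from collections import defaultdict
from itertools import combinations


def make_hash_params(num_hashes, prime, seed=0):
    """Draw (a, b) pairs for the hypothetical hash family h(x) = (a*x + b) % prime."""
    rng = random.Random(seed)
    return [(rng.randrange(1, prime), rng.randrange(0, prime)) for _ in range(num_hashes)]


def minhash_signature(item_ids, hash_params, prime):
    """One signature entry per hash function: the minimum hash over a non-empty set."""
    return [min((a * x + b) % prime for x in item_ids) for a, b in hash_params]


def lsh_candidate_pairs(signatures, bands, rows):
    """Pairs whose signatures agree on every row of at least one band."""
    candidates = set()
    for band in range(bands):
        buckets = defaultdict(list)
        for key, signature in signatures.items():
            band_slice = tuple(signature[band * rows:(band + 1) * rows])
            buckets[band_slice].append(key)
        for members in buckets.values():
            candidates.update(combinations(sorted(members), 2))
    return candidates


def jaccard(a, b):
    """Exact Jaccard similarity, used to verify candidate pairs."""
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    prime = 10007  # assumed to be larger than any item id
    sets = {"b1": {1, 2, 3, 4}, "b2": {1, 2, 3, 4}, "b3": {100, 200}}
    params = make_hash_params(num_hashes=6, prime=prime)
    signatures = {key: minhash_signature(ids, params, prime) for key, ids in sets.items()}
    for left, right in sorted(lsh_candidate_pairs(signatures, bands=3, rows=2)):
        print(left, right, jaccard(sets[left], sets[right]))
```

LSH only proposes candidates, so each pair is checked against the exact Jaccard similarity afterwards. More bands with fewer rows per band produce more candidates, including more false positives.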
Assignment4 -
-> Explore the Spark GraphFrames library and implement a custom Girvan-Newman algorithm using the Spark framework to detect communities in graphs
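
As a rough illustration of the core of Girvan-Newman, the sketch below computes edge betweenness with one BFS per root node and then removes the highest-scoring edge(s). It is plain Python rather than the Spark implementation in this repository, and the function names and the adjacency-dict input format are assumptions. Node labels are assumed to be mutually comparable so that each undirected edge can be stored as a sorted tuple.

```python
import math
from collections import defaultdict, deque


def edge_betweenness(adj):
    """adj maps each node to the set of its neighbours (undirected graph)."""
    betweenness = defaultdict(float)
    for root in adj:
        # BFS from root, counting the number of shortest paths to every node.
        dist = {root: 0}
        paths = defaultdict(float)
        paths[root] = 1.0
        parents = defaultdict(list)
        order = []
        queue = deque([root])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
                if dist[nbr] == dist[node] + 1:
                    paths[nbr] += paths[node]
                    parents[nbr].append(node)
        # Walk back from the deepest nodes, splitting each node's credit among
        # its parents in proportion to their shortest-path counts.
        credit = {node: 1.0 for node in order}
        for node in reversed(order):
            for parent in parents[node]:
                share = credit[node] * paths[parent] / paths[node]
                credit[parent] += share
                betweenness[tuple(sorted((parent, node)))] += share
    # Every pair of endpoints was counted once from each side.
    return {edge: value / 2 for edge, value in betweenness.items()}


def girvan_newman_step(adj):
    """Return a copy of the graph with the highest-betweenness edge(s) removed."""
    pruned = {node: set(nbrs) for node, nbrs in adj.items()}
    scores = edge_betweenness(adj)
    if not scores:
        return pruned
    top = max(scores.values())
    for (u, v), score in scores.items():
        if math.isclose(score, top):
            pruned[u].discard(v)
            pruned[v].discard(u)
    return pruned


def connected_components(adj):
    """Communities are the connected components of the pruned graph."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adj[node] - component)
        seen |= component
        components.append(sorted(component))
    return sorted(components)
```

A full Girvan-Newman run repeats this step, recomputing betweenness after every removal, and usually keeps the split with the highest modularity. The modularity calculation is left out of this sketch.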