This repository has been archived by the owner on Jul 16, 2022. It is now read-only.

BloomFilter-MapReduce

Project developed for the Cloud Computing course of the Master's degree in Artificial Intelligence and Data Engineering at the University of Pisa.

This project consists of the design and implementation of a Bloom filter for IMDb datasets using MapReduce, with implementations in both the Hadoop and Spark frameworks.
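To make the idea concrete, here is a minimal sketch in plain Python of how a Bloom filter can be built in a MapReduce style: each partition builds a partial bit array (the map side), and the partial arrays are combined with a bitwise OR (the reduce side). It is not taken from this repository. The function names, the SHA-256 double-hashing scheme, and the sample identifiers are all assumptions chosen for illustration; the actual Hadoop and Spark code may use different hash functions and a different data layout.

```python
import hashlib
import math
from functools import reduce


def bloom_parameters(n, p):
    """Standard sizing: bit count m and hash count k for n items at false-positive rate p."""
    m = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    k = max(1, round(m / n * math.log(2)))
    return m, k


def positions(item, m, k):
    """Derive k bit positions from one SHA-256 digest via double hashing (an assumption)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never zero
    return [(h1 + i * h2) % m for i in range(k)]


def build_partial(items, m, k):
    """Map side: set the bits for every item in one partition (bit array held in an int)."""
    bits = 0
    for item in items:
        for pos in positions(item, m, k):
            bits |= 1 << pos
    return bits


def merge(a, b):
    """Reduce side: two partial filters of the same size combine with a bitwise OR."""
    return a | b


def might_contain(bits, item, m, k):
    """False means definitely absent; True means possibly present."""
    return all((bits >> pos) & 1 for pos in positions(item, m, k))


# Hypothetical identifiers split across two partitions.
partitions = [["tt0000001", "tt0000002"], ["tt0000003"]]
m, k = bloom_parameters(n=3, p=0.01)
bloom = reduce(merge, (build_partial(part, m, k) for part in partitions))
print(might_contain(bloom, "tt0000002", m, k))
```

Because the merge step is an associative and commutative OR, partial filters can be combined in any order, which is what makes the structure a natural fit for a reduce phase.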

Repository

The repository is organized as follows:

  • dataset/ contains the IMDb dataset stored in film_ratings.txt
  • docs/ contains the report and the assignment
  • hadoop/ contains the Hadoop implementation and tests
  • results/ contains testing results and analysis
  • spark/ contains the Spark implementation and tests

Contributors