A Recommender System based on Semantic Space Clustering.
The MovieLens 1M dataset saved in the data directory contains only the movie names, so to gather the plots, genres and reviews you must first extract them from IMDb by running:
python datahandler/imdb_extractor.py
Once the information has been fetched, you can run the Recommender System:
python3 main.py
- Gensim: Library that implements the Paragraph Vector algorithm.
- IMDbPy: Used for searching on IMDb.
- ImdbPie: Used for retrieving reviews from IMDb.
- NumPy: Numerical computing.
- Scikit-Learn: Used to normalize the DensityPeakCluster decision graph, which allows the density and distance thresholds to be chosen automatically.
- Matplotlib: Used only if you want to plot the clusters for debugging.
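
To illustrate why normalizing the decision graph helps with threshold selection, here is a minimal sketch of the idea, not the project's actual code. It computes the two decision-graph quantities of density peak clustering (local density `rho` and distance to the nearest denser point `delta`), rescales both to [0, 1], and keeps the points where both exceed a fixed threshold. The function names, the `cutoff` value, the 0.5 threshold and the sample points are all assumptions made for illustration, and a pure-Python min-max scaling stands in for the Scikit-Learn normalization that the project uses.

```python
import math


def decision_graph(points, cutoff):
    """Return (rho, delta) lists, one value per point."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points closer than the cutoff distance
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        # Distance to the nearest point of higher density; density ties are
        # broken by index so that exactly one point is the global peak.
        higher = [dist[i][j] for j in range(n)
                  if (rho[j], -j) > (rho[i], -i)]
        # The global peak has no denser point, so it gets its largest distance.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


def min_max(values):
    """Rescale values to [0, 1] (hypothetical stand-in for a scaler)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pick_centers(points, cutoff, threshold=0.5):
    """Indices whose normalized rho and delta both exceed the threshold."""
    rho, delta = decision_graph(points, cutoff)
    rho_n, delta_n = min_max(rho), min_max(delta)
    return [i for i in range(len(points))
            if rho_n[i] > threshold and delta_n[i] > threshold]


# Two small, well-separated groups of 2-D points (made-up data).
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (4.9, 5.0), (5.0, 4.9),
]
print(pick_centers(points, cutoff=0.12))
```

Because both axes are rescaled to the same range, a single threshold can be applied regardless of the scale of the original distances. The middle point of each group is the only one with both high density and a large distance to a denser point, so those two indices are selected as centers.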