This project is focused on building a movie recommendation system using the MovieLens dataset. The system leverages several machine learning techniques to provide personalized movie recommendations based on user preferences and past behaviors.

Medkallel/Movie-Recommendation-System

 
 

🎬 Movie Recommendation System with MovieLens Dataset



Technologies Used

Python, Pandas, NumPy, Scikit-Learn, LightFM, Matplotlib, Jupyter Notebook


Description

This project is focused on building a movie recommendation system using the MovieLens dataset. The system leverages several machine learning techniques to provide personalized movie recommendations based on user preferences and past behaviors.

Objectives

The main objective of this project is to develop and evaluate different recommendation algorithms, including collaborative filtering, matrix factorization, and hybrid approaches, using the MovieLens dataset. The specific steps include:

  1. Data Preprocessing: Filtering and preparing the dataset for analysis.
  2. Exploratory Data Analysis (EDA): Understanding the dataset and its underlying patterns.
  3. Modeling: Implementing various models like Pearson correlation, SVD, and LightFM for recommendations.
  4. Evaluation: Assessing the performance of the models to identify the most effective approach.
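As an illustration of the Pearson correlation approach named in the modeling step, here is a minimal sketch of user-based collaborative filtering in plain Python. It is not the code from the project's notebooks: the toy `ratings` dictionary, the function names `pearson` and `recommend`, and the choice to ignore neighbors with non-positive similarity are all assumptions made for the example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two users over the movies both have rated."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0  # not enough overlap to measure a correlation
    xs = [a[m] for m in common]
    ys = [b[m] for m in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant rating vector has no defined correlation
    return cov / sqrt(var_x * var_y)


def recommend(target, ratings, top_n=3):
    """Rank movies the target has not rated by similarity-weighted neighbor ratings."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = pearson(ratings[target], other_ratings)
        if sim <= 0:
            continue  # assumption: only positively correlated users contribute
        for movie, rating in other_ratings.items():
            if movie not in ratings[target]:
                scores[movie] = scores.get(movie, 0.0) + sim * rating
                weights[movie] = weights.get(movie, 0.0) + sim
    ranked = sorted(scores, key=lambda m: scores[m] / weights[m], reverse=True)
    return ranked[:top_n]


# Hypothetical toy data: user -> {movie title: rating}
ratings = {
    "alice": {"Heat": 5.0, "Alien": 4.0, "Up": 1.0},
    "bob": {"Heat": 4.5, "Alien": 4.0, "Up": 1.5, "Se7en": 5.0},
    "carol": {"Heat": 1.0, "Alien": 2.0, "Up": 5.0, "Cars": 4.5},
}
print(recommend("alice", ratings))
```

In this toy data, bob's tastes track alice's closely while carol's are the opposite, so only bob's unseen movie is recommended to alice.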

Presentation

A presentation is available in the repository as a PDF file (Movie_Recommendation_System_Presentation.pdf) and as a Canva/PowerPoint presentation through the following link: Presentation Link.


Notebooks Overview

  1. Dataframe_Filter.ipynb:

    • This notebook is essential for preparing the dataset. It filters the raw data and generates a CSV file that is necessary for the subsequent models.
    • Important: You must run this notebook first to create the CSV file that will be used by the Pearson, LightFM, and SVD models.
  2. Exploratory_Data_Analysis.ipynb:

    • Provides a comprehensive analysis of the dataset, including visualizations and insights into user ratings, movie genres, and other key aspects.
  3. NLP_Vectorizing.ipynb:

    • Applies Natural Language Processing (NLP) techniques to vectorize textual data (e.g., movie descriptions) for use in hybrid recommendation models.
  4. Pearson_Correlation.ipynb:

    • Implements a collaborative filtering model using Pearson correlation to recommend movies based on user similarity.
  5. SVD.ipynb:

    • Uses Singular Value Decomposition (SVD), a matrix factorization technique, to predict user ratings for movies.
  6. New_Model_LightFM.ipynb:

    • Develops a hybrid model using the LightFM library, combining both content-based and collaborative filtering approaches for recommendations.
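To make the matrix factorization idea behind the SVD notebook concrete, the following sketch predicts ratings with a truncated SVD computed by NumPy. It is an illustration under stated assumptions, not the notebook's implementation, which may rely on a different library or a learned factorization: the function name `predict_ratings`, the toy matrix `R`, the use of 0 to mark missing ratings, and the user-mean imputation step are all hypothetical choices.

```python
import numpy as np


def predict_ratings(R, k):
    """Predict a full rating matrix from R using a rank-k truncated SVD.

    Assumes 0 marks a missing rating and every user has at least one rating.
    """
    observed = R > 0
    # Mean of each user's observed ratings (zeros do not contribute to the sum).
    user_means = R.sum(axis=1) / observed.sum(axis=1)
    # Fill missing entries with the user's mean, then center each row.
    filled = np.where(observed, R, user_means[:, None])
    centered = filled - user_means[:, None]
    # Keep only the k strongest latent factors.
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    approx = (U[:, :k] * s[:k]) @ Vt[:k, :]
    # Add the user means back to return to the original rating scale.
    return approx + user_means[:, None]


# Hypothetical toy user x movie matrix.
R = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 0.0, 5.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [0.0, 1.0, 2.0, 4.0],
])

predicted = predict_ratings(R, k=2)
print(np.round(predicted, 2))
```

A smaller `k` forces the model to generalize across users, while `k` equal to the matrix rank simply reproduces the mean-filled input.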

Important

The project was developed and tested on Python 3.11.6.

To run this project locally, follow these steps:

  1. Clone the repository:
git clone https://github.com/jcrigoni/grand_ml_project
cd grand_ml_project
  2. Install requirements:
pip install -r requirements.txt

Important

LightFM needs OpenMP for multithreading, which can be difficult to set up on Windows or macOS. In that case, it is better to use the Docker version of LightFM.


Usage

  1. Run the Dataframe_Filter.ipynb notebook to create the necessary CSV file. The notebook uses movie.csv and rating.csv from https://www.kaggle.com/datasets/grouplens/movielens-20m-dataset .
  2. After running the first notebook, you can proceed to run the other notebooks to explore the data, build models, and generate recommendations.
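For readers who want a feel for what the filtering step might involve, here is a small pandas sketch. The actual rules in Dataframe_Filter.ipynb are not described here, so everything below is an assumption: the function name `filter_ratings`, the minimum-count thresholds, and the output file name in the comment are hypothetical. The column names (`userId`, `movieId`, `rating`, `title`, `genres`) follow the MovieLens 20M CSV files, and small in-memory frames stand in for them so the example runs without downloading the dataset.

```python
import pandas as pd


def filter_ratings(ratings, movies, min_user_ratings, min_movie_ratings):
    """Keep ratings from sufficiently active users on sufficiently rated movies."""
    # Counts are computed once on the unfiltered data (a single pass).
    user_counts = ratings.groupby("userId")["rating"].transform("count")
    movie_counts = ratings.groupby("movieId")["rating"].transform("count")
    mask = (user_counts >= min_user_ratings) & (movie_counts >= min_movie_ratings)
    kept = ratings[mask]
    # Attach titles and genres so later steps can read a single table.
    return kept.merge(movies, on="movieId", how="inner")


# With the real dataset these would be pd.read_csv("rating.csv") and
# pd.read_csv("movie.csv"); toy frames are used here instead.
ratings = pd.DataFrame({
    "userId": [1, 1, 1, 2, 2, 3],
    "movieId": [10, 20, 30, 10, 20, 10],
    "rating": [4.0, 3.5, 5.0, 2.0, 4.5, 3.0],
})
movies = pd.DataFrame({
    "movieId": [10, 20, 30],
    "title": ["Heat (1995)", "Alien (1979)", "Up (2009)"],
    "genres": ["Action|Crime", "Horror|Sci-Fi", "Animation"],
})

filtered = filter_ratings(ratings, movies, min_user_ratings=2, min_movie_ratings=2)
# filtered.to_csv("filtered_ratings.csv", index=False)  # hypothetical file name
print(filtered)
```

Filtering out rarely rated movies and inactive users is a common way to shrink a sparse rating matrix before fitting similarity or factorization models.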

TIP: Some notebooks may take a while to run depending on the dataset size and complexity of the model. Please be patient!


Project structure

📦 grand_ml_project/
├── 📁Data/
│   ├── 🐍Dataframe_Filter.ipynb
│   ├── 🐍Exploratory_Data_Analysis.ipynb
│   └── 🐍NLP_Vectorizing.ipynb
├── 📁Models/
│   ├── 🐍New_Model_LightFM.ipynb
│   ├── 🐍Pearson_Correlation.ipynb
│   ├── 🐍SVD.ipynb
│   ├── 🖼️banner.png
│   └── 📁Exported_Models/
│       └── 🗃️lightfm_recommendation_model.pkl
├── 📄requirements.txt
├── 📄README.md
├── 📄Project-Documentation_Movie_Recommendation_System_Kallel_Rigoni_Rodner.pdf
├── 📄Movie_Recommendation_System_Presentation.pdf
└── 📄.gitignore

Collaborators

This project was developed by a collaborative team. Each member played a crucial role in the research, development, and analysis:

  • Mohamed Kallel
  • Jean Christophe Rigoni
  • Simon Pierre Rodner

📫 Contact me

LinkedIn


License

This project is under the CC BY-NC 4.0 License. For more information, refer to the license file.
