An AI content-based recommender system that recommends movies similar to the one the user picks. It performs sentiment analysis on that movie and fetches a group of at least 10 of the most similar movies.
The details of each movie (title, genre, cast, crew, runtime, release_date, rating, poster, etc.) are fetched using the TMDB API, https://www.themoviedb.org/documentation/api. Using the movie's IMDB id in the API, the movie posters are fetched and displayed along with each movie title.
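The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code in `interface.py`: the function names are hypothetical, the `/3/movie/{id}` endpoint and the `image.tmdb.org` poster base URL reflect TMDB's v3 API as commonly documented, and the standard library's `urllib` is used here to keep the sketch dependency-free even though the project lists `requests`.

```python
import json
import urllib.parse
import urllib.request

# Assumed TMDB v3 base URLs; check the TMDB documentation before relying on them.
TMDB_API_BASE = "https://api.themoviedb.org/3"
TMDB_IMAGE_BASE = "https://image.tmdb.org/t/p/w500"  # w500 is one of the poster widths


def movie_details_url(movie_id, api_key):
    """Build the movie-details request URL for a given movie id."""
    query = urllib.parse.urlencode({"api_key": api_key})
    return f"{TMDB_API_BASE}/movie/{movie_id}?{query}"


def poster_url(details):
    """Return the full poster URL from a details response, or None if absent."""
    path = details.get("poster_path")  # e.g. "/abc123.jpg"
    return f"{TMDB_IMAGE_BASE}{path}" if path else None


def fetch_movie_details(movie_id, api_key, timeout=10):
    """Fetch and decode the JSON details for one movie (needs network access)."""
    url = movie_details_url(movie_id, api_key)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Hypothetical fragment of a details response, used here instead of a live call.
sample_details = {"title": "Example Movie", "poster_path": "/example.jpg"}
print(poster_url(sample_details))
```

With a valid key, `fetch_movie_details(movie_id, "YOUR_API_KEY")` would return a dictionary whose fields (title, runtime, poster path, and so on) can be displayed next to the poster.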
How does it decide which item is most similar to the item the user likes? This is where similarity scores come in.
A similarity score is a numerical value between zero and one that indicates how similar two items are. It is obtained by measuring the similarity between the text details of the two items, which can be done with cosine similarity.
Cosine similarity is a metric that measures how similar documents are irrespective of their size. Mathematically, it is the cosine of the angle between two vectors projected in a multi-dimensional space. It is advantageous because even if two similar documents are far apart by Euclidean distance (due to document size), they may still be oriented close together. The smaller the angle, the higher the cosine similarity.
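To make the idea concrete, here is a small standard-library sketch. It assumes a simple bag-of-words representation and uses made-up movie data; the notebook in this repository may build its vectors differently (for example with scikit-learn), so treat the names `bag_of_words`, `cosine_similarity`, and `recommend` as illustrative only.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Turn a text into a word-count vector (a Counter keyed by lowercase word)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two word-count vectors."""
    words = set(a) | set(b)
    return math.sqrt(sum((a[w] - b[w]) ** 2 for w in words))


def recommend(title, movies, top_n=10):
    """Return up to top_n (other_title, score) pairs, most similar first."""
    target = bag_of_words(movies[title])
    scores = [
        (other, cosine_similarity(target, bag_of_words(text)))
        for other, text in movies.items()
        if other != title
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical toy data: title -> combined text details (genre, cast, ...).
movies = {
    "Space Quest": "action scifi space alien pilot",
    "Star Patrol": "scifi space alien robot",
    "Love in Paris": "romance drama paris",
}

# The same words repeated five times: a much longer document, same direction.
short_doc = bag_of_words("space alien pilot")
long_doc = bag_of_words("space alien pilot " * 5)
print(cosine_similarity(short_doc, long_doc))   # approximately 1
print(euclidean_distance(short_doc, long_doc))  # clearly greater than 0
print(recommend("Space Quest", movies, top_n=2))
```

The short and long documents point in the same direction, so their cosine similarity is (up to floating-point rounding) 1 even though their Euclidean distance is large. Because word counts are never negative, the scores stay between zero and one.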
Create an account at https://www.themoviedb.org/, click the API link in the left-hand sidebar of your account settings, and fill in all the details to apply for an API key. If you are asked for a website URL, enter "NA" if you don't have one. You will see the API key in your API sidebar once your request is approved.
- Clone or download this repository to your local machine.
- Install `numpy>=1.9.2`, `nltk==3.5`, `scikit-learn>=0.18`, `pandas>=0.19`, `requests==2.23.0`, `streamlit==1.2.0`.
- Run `Movies-AI-Content-Based-Recommender.ipynb` first to generate the 2 pkl files.
- Get your API key from https://www.themoviedb.org/. (Refer to the section above on how to get the API key.)
- Replace `YOUR_API_KEY` in `interface.py`.
- Open your terminal/command prompt in the project directory and type `streamlit run interface.py`.
- The app will open in the default browser. If it does not, go to your browser and enter the local URL that Streamlit prints in the terminal (http://localhost:8501 by default) in the address bar.
- TMDB 5000 Movie Dataset
- Understanding the Math behind Cosine Similarity
- NLP Text Feature Extraction
- Streamlit Docs
- List of 2020-2021 movies
- KAN Org.
- University of Ain Shams, Egypt