Distributed-Collaborative-Filtering-Book-Recommendation-System

A collaborative-filtering book recommendation system that uses Dask as its distributed framework and Streamlit for the web application. Inspired by the work of https://github.com/entbappy/ML-Based-Book-Recommender-System

Collaborative Recommendation System

Collaborative filtering systems rely on user-item interactions.
Users with similar ratings are grouped into clusters, and books are recommended from within the relevant cluster.
The system uses a single signal as its input, either ratings or comments.
In essence, collaborative filtering assumes that if one user likes item A and another user likes both item A and item B, the first user may also be interested in item B.
Challenges include:
- The computational expense of managing a large user-item matrix.
- A bias toward recommending only popular items.
- A tendency to overlook new items that have few ratings.
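
The core assumption above can be illustrated with a small sketch. This is not the repository's code: the user names, the `likes` dictionary, and the `recommend_for` helper are hypothetical, and scoring by overlap count is just one simple way to rank candidates.

```python
def recommend_for(user, likes):
    """Suggest items liked by users who share at least one liked item with `user`."""
    own = likes[user]
    scores = {}
    for other, items in likes.items():
        if other == user:
            continue
        overlap = len(own & items)  # number of items both users like
        if overlap == 0:
            continue  # no shared taste, so ignore this user
        for item in items - own:  # only items `user` has not seen yet
            scores[item] = scores.get(item, 0) + overlap
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical toy data: alice likes A, bob likes A and B, carol likes C.
likes = {"alice": {"A"}, "bob": {"A", "B"}, "carol": {"C"}}
suggestions = recommend_for("alice", likes)
print(suggestions)
```

Because bob shares item A with alice, his other item B is suggested to her, while carol's item C is not.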

Data Used

We used a Kaggle dataset that contains book names, user IDs, and their ratings.
Link to data: https://www.kaggle.com/datasets/arashnic/book-recommendation-dataset?resource=download&select=Ratings.csv

Algorithm Used

To train the model, we used the k-nearest neighbors (KNN) algorithm to group similar ratings and to find suitable books to recommend.

1. Load the dataset.
2. Set the value of k.
3. Iterate over the training data points to obtain the predicted class.
4. Compute the Euclidean distance, a widely used distance metric, between the test data and each row of the training data.
5. Sort the calculated distances in ascending order.
6. Extract the top k rows from the sorted array.
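
The steps above can be sketched in plain Python. This is a minimal illustration and not the project's implementation: the `ratings` table, the book titles, and the `k_nearest` helper are hypothetical, and each book is represented by a made-up vector of ratings from four users.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length rating vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest(train, query, k):
    """Return the labels of the k training rows closest to `query`."""
    # Compute the distance from the query to every training row.
    distances = [(euclidean(row, query), label) for label, row in train.items()]
    # Sort in ascending order of distance.
    distances.sort()
    # Keep the labels of the top k rows.
    return [label for _, label in distances[:k]]


# Hypothetical toy data: one rating vector per book (0 means "not rated").
ratings = {
    "Book A": [5, 4, 0, 0],
    "Book B": [4, 5, 0, 1],
    "Book C": [0, 0, 5, 4],
    "Book D": [0, 1, 4, 5],
}

query = ratings["Book A"]
candidates = {title: row for title, row in ratings.items() if title != "Book A"}
neighbours = k_nearest(candidates, query, k=2)
print(neighbours)
```

Book B is the closest match to Book A because the same users rated both highly.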

Distributed Framework - Dask

Dask is a parallel computing library designed to seamlessly scale and handle larger-than-memory computations in a distributed environment.

1. Convert the pandas DataFrame to a Dask DataFrame.
2. Find the index of the target book in a distributed manner.
3. Compute the distances and suggestions in a distributed manner.
4. Schedule the computation and gather the results.
5. Append the recommended books to the result list.

Usage

Clone the Repository

git clone https://github.com/D3struf/Distributed-Collaborative-Filtering-Book-Recommendation-System.git

Open the Anaconda Command Prompt and create a conda environment inside the repository's directory

conda create -n books python=3.7.10 -y

conda activate books

Install the requirements

pip install -r requirements.txt

Run app.py with Streamlit

streamlit run app.py
