Netflix Final Project

Rebecca Buerger, Jack Ewert, Nicholas Litterio, Waddah Moghram, Dhruv Vyas, Yuchen Zhang

Abstract

Over the last decade, Netflix has cemented its position as a leading streaming media provider. To maintain its dominance, Netflix awarded a one-million-dollar prize in 2009 for the code that best improved rating predictions on previously collected real-life customer data. Our team was given the same task, with two months to complete it. By the end of the first month, we had successfully read and visualized the original dataset provided by Netflix and identified strategies for proceeding with the project. These strategies included K-means clustering and Pearson's r correlation. Toward the conclusion of the project, we were able to supplement about 60% of the movie titles with information from the IMDb online database. In addition, we included some time-series analysis of movie and user trends. This paper was submitted as part of a class entitled Big Data Analytics (IE:4172) on December 7, 2018.
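For readers unfamiliar with the correlation measure named in the abstract, the following is a minimal sketch of Pearson's r computed between two users over the movies they have both rated. It is an illustration only and is not taken from the project's source folders. The function name `pearson_r`, the dictionary layout (movie ID mapped to rating), and the toy ratings are all assumptions made for this example. The report describes how the team actually applied the measure.

```python
from math import sqrt


def pearson_r(ratings_a, ratings_b):
    """Pearson's r between two users over the movies both have rated.

    ratings_a, ratings_b: dicts mapping movie ID -> rating (hypothetical layout).
    Returns None when fewer than two movies are shared, or when either
    user's shared ratings have zero variance (r is undefined there).
    """
    shared = sorted(set(ratings_a) & set(ratings_b))
    if len(shared) < 2:
        return None

    xs = [ratings_a[m] for m in shared]
    ys = [ratings_b[m] for m in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sums of cross-deviations and squared deviations from each user's mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None

    return cov / sqrt(var_x * var_y)


# Toy ratings, made up for illustration (not from the Netflix dataset).
user_a = {1: 5, 2: 3, 3: 4, 4: 1}
user_b = {1: 4, 2: 2, 3: 5, 5: 3}
user_c = {1: 1, 2: 3, 4: 5}

print(pearson_r(user_a, user_b))  # positive: similar tastes on shared movies
print(pearson_r(user_a, user_c))  # negative: opposite tastes on shared movies
```

Restricting the calculation to co-rated movies is one common design choice for sparse rating data. Pairs with very few shared movies can produce extreme values of r, so a minimum-overlap threshold is often applied in practice.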

Please refer to the included report, BigDataNetFlixProjectFinalReport_Group_3_Alpaca.pdf, for more details.

Please note that some data files could not be uploaded to GitHub due to the 250 MB size limit allowed by the server. However, these files are available by request, and the results can be reproduced by running the source files on the existing data files, provided that the required Python libraries are installed properly.

Update: the GitHub repository URL mentioned in the report is no longer available. The updated URL is: https://github.com/waddahmoghram/BigDataNetflixProject2018

Folders and files

.idea
docs
plots
sample_codes
src_IMDB_supplement
src_clustering
src_hierarchical
src_naive_approach
src_plots
src_user_clusters
trial_codes
venv
venvAnaconda
venvAnaconda2
.gitignore
BigDataNetFlixProjectFinalReport_Group_3_Alpaca.pdf
README.md
