Author: Andy Peng
This repository contains an analysis of movie industry data. The analysis is documented in detail so that the work is accessible and replicable.
Microsoft tasked us with helping them understand the movie industry better. To do that, we need to retrieve and explore data relating to the movie industry. The first step is to gather as much relevant data as we can. Next, we explore each dataset and check whether any are worth joining together. Finally, we explore the relationships between genres, directors, ratings, revenues and runtimes of movies.
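As a rough illustration of checking whether two datasets are worth joining, the sketch below joins two small tables on a normalized movie title and reports how many rows matched. The rows, the column names (`title`, `domestic_gross`, `averagerating`) and the helper functions are hypothetical stand-ins, not the notebook's actual data or code, and it uses only the Python standard library so it runs without the notebook's dependencies.

```python
# Hypothetical rows standing in for two of the data sources.
# The real column names and values in the notebook may differ.
gross_rows = [
    {"title": "Movie A", "domestic_gross": 100.0},
    {"title": "Movie B", "domestic_gross": 40.0},
    {"title": "Movie C", "domestic_gross": 75.0},
]
rating_rows = [
    {"title": "movie a ", "averagerating": 7.5},
    {"title": "Movie C", "averagerating": 6.1},
    {"title": "Movie D", "averagerating": 8.0},
]


def normalize_title(title):
    """Lowercase a title and collapse whitespace so near-identical titles match."""
    return " ".join(title.lower().split())


def inner_join_on_title(left, right):
    """Return merged rows for titles that appear in both tables."""
    right_by_title = {normalize_title(row["title"]): row for row in right}
    joined = []
    for row in left:
        match = right_by_title.get(normalize_title(row["title"]))
        if match is not None:
            # Values from the left row win, so its title spelling is kept.
            joined.append({**match, **row})
    return joined


joined = inner_join_on_title(gross_rows, rating_rows)

# Share of left-hand rows that found a partner; a low share suggests
# the two datasets are not worth joining on this key.
match_rate = len(joined) / len(gross_rows)
print(joined)
print(f"match rate: {match_rate:.2f}")
```

A low match rate on the chosen key is one signal that two sources cover different movies or spell titles too differently to combine reliably.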
The following list contains the resources we used to perform our data analysis on the data relating to the movie industry.
- Box Office Mojo
- IMDB
- Rotten Tomatoes
- TheMovieDB.org
- TMDB Kaggle Movie Data
- Descriptive Analysis
- Choices made
- Key relevant findings from exploratory data analysis
In summary, the analysis suggests considering the following features when making a movie:
- Popular directors: hiring popular directors leads to higher average revenue and ratings for movies
- Genres
  - Animation, adventure, sci-fi, fantasy and action lead to higher revenues
  - Short films, documentaries, game shows, news and biographies lead to higher ratings
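One way to arrive at genre comparisons like these is to average revenue and rating per genre, counting a multi-genre movie once under each of its genres. The following is a minimal sketch under that assumption; the sample rows, field names and helper functions are hypothetical and are not taken from the notebook.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; genres are stored as a comma-separated string.
movies = [
    {"title": "Movie A", "genres": "Animation,Adventure", "revenue": 300.0, "rating": 7.0},
    {"title": "Movie B", "genres": "Documentary", "revenue": 10.0, "rating": 8.0},
    {"title": "Movie C", "genres": "Adventure,Action", "revenue": 200.0, "rating": 6.0},
]


def mean_by_genre(rows, field):
    """Average `field` per genre, counting each movie under every genre it lists."""
    values = defaultdict(list)
    for row in rows:
        for genre in row["genres"].split(","):
            values[genre.strip()].append(row[field])
    return {genre: mean(vals) for genre, vals in values.items()}


def rank_genres(rows, field):
    """Return (genre, mean) pairs sorted from highest to lowest mean."""
    means = mean_by_genre(rows, field)
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


revenue_ranking = rank_genres(movies, "revenue")
rating_ranking = rank_genres(movies, "rating")
print(revenue_ranking)
print(rating_ranking)
```

Genres with only a handful of movies can top such a ranking by chance, so it is worth checking the number of movies behind each average before drawing conclusions.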
There are many features that we haven't considered, for example:
- Time the movie is released
- Movie budget
- Actors
- Domestic vs. international
We can consider these features in future exploration, and gathering more data would further support our claims.
Please review the narrative of our analysis in our Jupyter notebook or review our presentation.
For any additional questions, please contact andypeng93@gmail.com
The repository is structured as follows:
├── README.md <- The top-level README for reviewers of this project.
├── Movie Project - Andy.ipynb.ipynb <- narrative documentation of analysis in jupyter notebook
├── Module 1 – Movie Project Presentation.pdf <- pdf version of project presentation
└── Images
    └── images <- both sourced externally and generated from code