This repository showcases my project on analyzing the TMDB movies dataset as part of Udacity's Data Analyst Nanodegree. The project focuses on investigating trends and patterns in movie data to derive insights into the film industry.

noora-a/TMDB-Movies-Project

TMDB-Movies-Project

Udacity - Data Analyst Nanodegree

Installations

This project uses the following Python libraries:

  • NumPy
  • Pandas
  • Matplotlib

Project Overview:

In this project, we analyze a dataset (the TMDB Movie Data in this case) and then communicate our findings about it.

  • Data Wrangling: Employed Python libraries to clean and organize the TMDB movie dataset.
  • Exploratory Analysis: Conducted thorough exploratory analysis to uncover underlying patterns in the data.
  • Visualization: Created compelling visualizations to present findings on movie popularity, ratings, and revenue trends.
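The wrangling step could look something like the sketch below. It is a minimal illustration, not the notebook's actual code: the `wrangle` helper, the sample rows, and the `original_title` column are hypothetical, and treating a zero budget or revenue as a missing value is an assumption about the data rather than something the dataset documents.

```python
import pandas as pd

# Tiny made-up sample standing in for the real CSV; titles and amounts are illustrative only.
raw = pd.DataFrame({
    "original_title": ["Film A", "Film A", "Film B", "Film C"],
    "budget_adj": [1.0e6, 1.0e6, 0.0, 5.0e6],
    "revenue_adj": [3.0e6, 3.0e6, 2.0e6, 0.0],
})


def wrangle(df):
    """Drop duplicate rows and rows whose budget or revenue is missing (hypothetical helper)."""
    money_cols = ["budget_adj", "revenue_adj"]
    cleaned = df.drop_duplicates().copy()
    # Assumption: a value of 0 means "not recorded", so turn it into NaN.
    cleaned[money_cols] = cleaned[money_cols].mask(cleaned[money_cols] == 0)
    # Keep only rows where both financial columns are present.
    return cleaned.dropna(subset=money_cols).reset_index(drop=True)


clean = wrangle(raw)
print(clean)
```

With the real dataset, `raw` would instead come from `pd.read_csv` on the downloaded file.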

Dataset - TMDB Movie Data

This data set contains information about 10,000 movies collected from The Movie Database (TMDb), including user ratings and revenue.

Certain columns, like ‘cast’ and ‘genres’, contain multiple values separated by pipe (|) characters. The final two columns ending with “_adj” show the budget and revenue of the associated movie in terms of 2010 dollars, accounting for inflation over time.
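One way to work with the pipe-separated columns is to split each value into a list and give every entry its own row. The sketch below uses made-up rows; the `original_title` column and the sample genres are assumptions for illustration.

```python
import pandas as pd

# Made-up rows that mimic the pipe-separated 'genres' column.
movies = pd.DataFrame({
    "original_title": ["Film A", "Film B"],
    "genres": ["Action|Adventure", "Drama"],
})

# Split on the pipe character, then expand each list element into its own row.
by_genre = (
    movies.assign(genres=movies["genres"].str.split("|"))
    .explode("genres")
    .reset_index(drop=True)
)

# Count how many movies fall under each genre.
genre_counts = by_genre["genres"].value_counts()
print(genre_counts)
```

The same pattern would apply to 'cast'. A movie with several genres is counted once per genre, so the totals can exceed the number of movies.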

Analysis of the Dataset

Brainstorm questions that the dataset can answer, then work through them one at a time. Favor questions that look at relationships between multiple variables.
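As an illustration of a question that relates several variables, the sketch below asks whether adjusted budget and adjusted revenue move together, and how average revenue differs by year. The numbers are made up, and the `release_year` column name is an assumption about the dataset's schema.

```python
import pandas as pd

# Made-up values chosen so the results are easy to check by hand.
sample = pd.DataFrame({
    "release_year": [2000, 2000, 2010, 2010],
    "budget_adj": [1.0, 2.0, 3.0, 4.0],
    "revenue_adj": [2.0, 4.0, 6.0, 8.0],
})

# Pearson correlation between the two inflation-adjusted columns.
budget_revenue_corr = sample["budget_adj"].corr(sample["revenue_adj"])

# Average adjusted revenue per release year.
mean_revenue_by_year = sample.groupby("release_year")["revenue_adj"].mean()

print(budget_revenue_corr)
print(mean_revenue_by_year)
```

A correlation near 1 on real data would only indicate association, not that a larger budget causes higher revenue.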

Conclusion

What I learned:

  • The steps involved in a typical data analysis process.
  • How to pose questions that a given dataset can answer, and then answer them.
  • How to investigate problems in a dataset and wrangle the data into a usable format.
  • How to communicate the results of an analysis.
  • How to use vectorized operations in NumPy and Pandas to speed up data analysis code.
  • How to work with Pandas Series and DataFrame objects, which make data access more convenient.
  • How to use Matplotlib and Seaborn to produce plots that show findings.
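For example, a vectorized operation computes a whole column at once instead of looping over rows. The sketch below uses made-up values, and the `profit` column is a hypothetical derived quantity rather than part of the dataset.

```python
import pandas as pd

# Made-up adjusted budget and revenue figures.
sample = pd.DataFrame({
    "budget_adj": [1.0, 2.0, 5.0],
    "revenue_adj": [3.0, 2.0, 4.0],
})

# Subtract two Series element-wise in a single vectorized step.
profit = sample["revenue_adj"] - sample["budget_adj"]

# Boolean NumPy array marking rows with a positive profit.
is_profitable = profit.to_numpy() > 0

# The mean of a boolean array is the fraction of True values.
share_profitable = is_profitable.mean()
print(share_profitable)
```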
