Analyze personal Netflix usage
This project aims to analyze Netflix usage from the first day of subscription (April 2024) until the end of the year.
- What is the most-watched movie category?
- How often did I watch movies per week?
- Which movie genre did I engage with the most?
To reach this goal, the system is developed in the following steps:
- ingest data from Netflix
- for each movie, extract the details (via the TMDB API)
- create the star schema
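The ingestion step can be sketched as below. This is a minimal illustration, not the project's actual code: it assumes the Netflix export is a CSV with `Title` and `Date` columns and ISO-formatted dates, and the sample rows, `load_streams`, and `streams_per_week` are hypothetical names introduced here. The real export may use a different date format, which is why `date_format` is a parameter.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample rows; the real export's columns and date format may differ.
SAMPLE_CSV = """Title,Date
Movie A,2024-04-01
Movie B,2024-04-03
Movie C,2024-04-10
"""


def load_streams(text, date_format="%Y-%m-%d"):
    """Parse (title, stream date) pairs from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (row["Title"], datetime.strptime(row["Date"], date_format).date())
        for row in reader
    ]


def streams_per_week(streams):
    """Count streams per (ISO year, ISO week number)."""
    counts = Counter()
    for _title, day in streams:
        iso = day.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return dict(counts)


streams = load_streams(SAMPLE_CSV)
weekly = streams_per_week(streams)
```

Grouping by ISO week gives a first, rough answer to the "how often per week" question before any warehouse is involved.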
- Netflix data: the dataset, extracted at the end of 2024, lists the movies streamed on the platform (title, date of stream)
- TMDB API: extract movie details (release date, genre, ...)
- Google Cloud Storage: store the RAW data (data lake)
- Google BigQuery: load the raw data and transform it into a star schema
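A hedged sketch of the TMDB lookup is shown below. It assumes the TMDB v3 movie search endpoint with an `api_key` query parameter, and it only builds the request URL and parses a response, so it runs without network access. The sample payload, its IDs, and the helper names `build_search_url` and `parse_first_result` are made up for illustration. Picking the first search result is also an assumption, since titles can be ambiguous.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint: TMDB v3 movie search.
TMDB_SEARCH_URL = "https://api.themoviedb.org/3/search/movie"


def build_search_url(title, api_key):
    """Build the search URL for a movie title (no request is sent here)."""
    return f"{TMDB_SEARCH_URL}?{urlencode({'query': title, 'api_key': api_key})}"


def parse_first_result(payload):
    """Keep only the fields needed downstream from the top search hit."""
    results = payload.get("results", [])
    if not results:
        return None  # title not found on TMDB
    top = results[0]
    return {
        "id_movie": top["id"],
        "title": top["title"],
        "release_date": top.get("release_date"),
        "genre_ids": top.get("genre_ids", []),
    }


# Made-up payload in the shape of a search response; values are illustrative only.
SAMPLE_RESPONSE = json.loads("""
{
  "page": 1,
  "results": [
    {"id": 101, "title": "Movie A", "release_date": "2020-05-01", "genre_ids": [10, 20]}
  ]
}
""")

url = build_search_url("Movie A", "YOUR_API_KEY")
details = parse_first_result(SAMPLE_RESPONSE)
```

In a real run the raw JSON response would be written to Google Cloud Storage unchanged, and only the parsed fields would feed the BigQuery transformation.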
- DIM_MOVIE: ID_MOVIE (PK)
- DIM_GENRE: ID_GENRE (PK)
- FCT_STREAMING: FCT_STREAMING_ID (PK)
- BRIDGE_MOVIE_GENRE: ID_MOVIE (FK), ID_GENRE (FK)
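To make the schema concrete, here is a small sketch that uses an in-memory SQLite database as a stand-in for BigQuery. Only the table names and key columns come from the list above. The other columns (`TITLE`, `GENRE_NAME`, `STREAM_DATE`), the `ID_MOVIE` foreign key on `FCT_STREAMING`, and all sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Keys follow the schema list; non-key columns are assumed for this sketch.
conn.executescript("""
CREATE TABLE DIM_MOVIE (ID_MOVIE INTEGER PRIMARY KEY, TITLE TEXT);
CREATE TABLE DIM_GENRE (ID_GENRE INTEGER PRIMARY KEY, GENRE_NAME TEXT);
CREATE TABLE FCT_STREAMING (
    FCT_STREAMING_ID INTEGER PRIMARY KEY,
    ID_MOVIE INTEGER REFERENCES DIM_MOVIE (ID_MOVIE),
    STREAM_DATE TEXT
);
CREATE TABLE BRIDGE_MOVIE_GENRE (
    ID_MOVIE INTEGER REFERENCES DIM_MOVIE (ID_MOVIE),
    ID_GENRE INTEGER REFERENCES DIM_GENRE (ID_GENRE),
    PRIMARY KEY (ID_MOVIE, ID_GENRE)
);
""")

# Made-up rows: Movie B belongs to two genres, which is why the bridge table exists.
conn.executemany("INSERT INTO DIM_MOVIE VALUES (?, ?)", [(1, "Movie A"), (2, "Movie B")])
conn.executemany("INSERT INTO DIM_GENRE VALUES (?, ?)", [(10, "Drama"), (20, "Comedy")])
conn.executemany("INSERT INTO BRIDGE_MOVIE_GENRE VALUES (?, ?)", [(1, 10), (2, 10), (2, 20)])
conn.executemany(
    "INSERT INTO FCT_STREAMING VALUES (?, ?, ?)",
    [(1, 1, "2024-04-01"), (2, 2, "2024-04-03"), (3, 2, "2024-04-10")],
)

# Streams per genre: each stream counts once for every genre of its movie.
genre_counts = conn.execute("""
SELECT g.GENRE_NAME, COUNT(*) AS STREAMS
FROM FCT_STREAMING f
JOIN BRIDGE_MOVIE_GENRE b ON b.ID_MOVIE = f.ID_MOVIE
JOIN DIM_GENRE g ON g.ID_GENRE = b.ID_GENRE
GROUP BY g.GENRE_NAME
ORDER BY STREAMS DESC, g.GENRE_NAME
""").fetchall()
```

Because a movie can have several genres, joining through the bridge table counts a single stream once per genre. That suits a "most-watched genre" ranking, but the per-genre totals should not be summed to get the total number of streams.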
The star schema produces a reporting layer that is read by this dashboard on Tableau.