Provisionally Setup

These are the containers I'm running:

React (Frontend) Container
Spring Boot (Backend) Container
MySQL (preloaded Database) Container
Elasticsearch (SearchEngine) Container
MinIO (preloaded FileStorage) Container
Traefik (reverse proxy) Container

For CI / CD I use GitHub Workflows.

Set Up MySQL Database

I created an entity-relationship diagram to simplify schema creation. I installed mysql-server on my machine, processed the dataset and imported it into the database.

Process Movies / Rating Datasets

For this I used the powerful capabilities of the Python framework Pandas which can easily process big datasets. All steps are verifiable through a jupyter notebook.

download title.basics.tsv.gz and title.ratings.tsv.gz from IMDb
process dataset using Python, Pandas, Numpy:
- replace empty values by '\N'
- remove incorrect values (consistent datatype per column)
- merge Rating-, Movie- and image/description dataframes
- set tconst as index

Instead of rerunning the jupyter notebook you can also just download the Processed Dataset.

Create Database Tables and import data

execute create table statements and load infile using init.sql file

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Provisionally Setup

Set Up MySQL Database

Process Movies / Rating Datasets

Create Database Tables and import data

Files

README.md

Latest commit

History

README.md

File metadata and controls

Provisionally Setup

Set Up MySQL Database

Process Movies / Rating Datasets

Create Database Tables and import data