An enriched data mart to analyze job market trends from 2021 to 2023 in several countries through conceptual design, physical design and data staging, OLAP queries, BI dashboard creation, and data mining.
- Install Python 3.x and Docker Engine (Docker Desktop)
- Open Docker Desktop and leave it open. This keeps Docker Engine running so you can run Docker commands
# Create a virtual environment and install Python dependencies
python -m venv venv
source venv/Scripts/activate # Windows (git bash)
source venv/bin/activate # UNIX
# Install all dependencies
pip install -r requirements.txt
- Make sure port 5432 is available
- Stop the local Postgres service if it is running on your system
- Create a file to store sensitive values, such as passwords
- Create a file named `.env` in the root of the directory
- Open the file `.env.examples`
- Copy the contents of `.env.examples` and paste them into `.env`
- Replace the values with your own values
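A plausible `.env` might look like the following. These keys are illustrative only; use the actual keys listed in `.env.examples`:

```shell
# Illustrative only -- copy the real keys from .env.examples
POSTGRES_USER=postgres
POSTGRES_PASSWORD=change-me
POSTGRES_DB=postgres
POSTGRES_PORT=5432
```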
- Pull Docker images and run the containers:
  - `docker compose up --build -d` to build the images and run the containers in the background
  - `docker ps` to verify that your containers are started
  - `docker compose down` to stop your running containers
  - `docker system prune -a` to delete all stopped images and containers
Now that the database instance and the schema are created, the database needs to be populated. Running `python db/db.py` populates all tables with data, including measurements.
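The internals of `db/db.py` are not shown here; as a hedged sketch of how a dimension table might be bulk-loaded from a CSV with psycopg2 (the table name, column names, and file path below are assumptions, not the project's actual code):

```python
# Hypothetical loader sketch -- not the actual db/db.py.
import csv

def build_insert(table: str, columns: list[str]) -> str:
    """Build a parameterized INSERT for use with cursor.executemany()."""
    placeholders = ", ".join(["%s"] * len(columns))
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"

def rows_from_csv(path: str, columns: list[str]):
    """Yield tuples of the requested columns from a CSV file."""
    with open(path, newline="", encoding="utf-8") as f:
        for record in csv.DictReader(f):
            yield tuple(record[c] for c in columns)

# Usage against the containerized Postgres (psycopg2 assumed installed):
# import psycopg2
# conn = psycopg2.connect(host="localhost", port=5432, dbname="postgres",
#                         user="postgres", password="<from your .env>")
# with conn, conn.cursor() as cur:
#     cur.executemany(build_insert("job_posting_dim", ["job_id", "job_title"]),
#                     rows_from_csv("data/jobs.csv", ["job_id", "job_title"]))
```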
Instructions to interact with the Postgres database instance in the Docker container:
- `docker exec -it postgres bash` to enter the postgres container
- `psql -U postgres -d postgres` to interact with the PostgreSQL database in the container
- Refresher on some PSQL commands to get started:
\dt # view all tables
psql -U postgres -d postgres # open the interactive terminal for the 'postgres' database as the 'postgres' user
SELECT * FROM job_posting_dim; # view all records in the job_posting_dim table
SELECT COUNT(*) FROM job_posting_dim; # to count the number of rows
Our data staging code is in `CSI4142_DataStaging_Group8.ipynb` in the `data_staging` folder.
If you want to run and test it:
- Download the notebook from the `data_staging` folder
- Download the first dataset from this link: https://www.kaggle.com/datasets/ravindrasinghrana/job-description-dataset?resource=download
- Download `CityPopulation.csv` from the `data_staging` folder
- Download `CompanyInformation.csv` from the `data_staging` folder
Please make sure you have Python, pandas, and Jupyter Notebook installed. Alternatively, you can test using our Google Colab notebook with our code: https://colab.research.google.com/drive/1rAs09BBjjFzvePcJQj585K-PU1yV-4Fb?usp=sharing
# Create a virtual environment if you have not created one already
python -m venv venv
source venv/Scripts/activate # Windows git bash
source venv/bin/activate # UNIX
# Install dependencies
pip install pandas
pip install notebook
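As an illustration of the kind of cleaning the staging notebook performs (dropping duplicates, handling missing values, normalizing text), here is a minimal pandas sketch. The column names below are assumptions for illustration, not the dataset's actual schema:

```python
# Hypothetical cleaning sketch -- column names are assumed, not the real schema.
import pandas as pd

def clean(df: pd.DataFrame) -> pd.DataFrame:
    df = df.drop_duplicates()                       # remove exact duplicate rows
    df = df.dropna(subset=["Job Title"])            # drop rows missing a required field
    df["Country"] = df["Country"].str.strip().str.title()  # normalize country names
    df["Salary Range"] = df["Salary Range"].fillna("Unknown")  # fill missing values
    return df

raw = pd.DataFrame({
    "Job Title": ["Engineer", None, "Engineer"],
    "Country": [" canada ", "usa", " canada "],
    "Salary Range": [None, "$50K-$60K", None],
})
cleaned = clean(raw)  # one row left: duplicate dropped, missing title dropped
```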
- Obtain and load the dataset
- The original dataset was obtained from Kaggle: https://www.kaggle.com/datasets/ravindrasinghrana/job-description-dataset
- Conceptual Design
- Planning and design of Fact table and Dimension tables
- Data Staging
- Identify and correct errors or missing values in the data
- Physical Design
- Insert the data into an RDBMS (Postgres) and optimize the data for OLAP queries
- Define aggregations and measurements for analysis
- Data Visualization (OLAP queries and BI dashboard)
- Generate standard OLAP operations
- Generate explorative SQL operations
- Create a BI dashboard to explore and visualize trends in the data
- Data Mining
- Leverage ML techniques to answer relevant questions regarding job market trends
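One of the simplest "aggregations and measurements" the outline above mentions is a count of postings per country per year. As an illustrative sketch only (toy data standing in for the Kaggle set, not the project's actual queries):

```python
# Toy illustration of an aggregation/measurement -- not the project's real data.
from collections import Counter

postings = [  # (country, year) pairs extracted from job postings
    ("Canada", 2021), ("Canada", 2022), ("USA", 2021),
    ("Canada", 2022), ("USA", 2023),
]

# Count postings per (country, year) cell -- the kind of measure an OLAP
# roll-up over the country and time dimensions would produce.
per_country_year = Counter(postings)
```

In the data mart itself, this corresponds to a `GROUP BY country, year` over the fact table rather than in-memory counting.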