Freight Forecasting Pipeline

Overview

This project provides an end-to-end pipeline for forecasting freight prices. It combines multiple datasets, feature engineering, and a suite of machine learning and time series models, and covers the full workflow from raw data ingestion through preprocessing, feature extraction, model training, and results reporting.

Project Structure

/data/raw/ # Raw data files (Excel, CSV)
/data/processed/ # Cleaned and feature-engineered datasets
/models/ # Model scripts (ARIMA, XGBoost, Lasso, Prophet, etc.)
/reports/models/ # Model evaluation metrics and plots
/utils/ # Utility functions (preprocessing, diagnostics)
/data_pipeline/ # Scripts to fetch and load raw data
/feature_engineering/ # Feature engineering pipeline scripts
/notebooks/ # Jupyter notebooks and demo

Installation

Prerequisites

Python 3.10 or newer
Recommended: Virtual environment (venv or conda)

Setup

Clone the repository, then set up a virtual environment and install the dependencies from the repository root:

cd freight-forecasting
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Usage

CLI Interface

Run any single stage, or the full pipeline, through the command-line script main_cli.py.

Command-line Arguments

--fetch      Fetch all raw data from source files
--prepare    Merge and align raw data into a single weekly dataset
--features   Perform feature engineering (interpolation, volatility, seasonality)
--train      Train and benchmark all predictive models with hyperparameter tuning
--report     Generate model comparison dashboards and summary tables
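As an illustration, the flags above could be wired up with argparse roughly as follows. This is a sketch under stated assumptions, not the actual contents of main_cli.py: the names STAGES, build_parser, and selected_stages are hypothetical, and treating "no flags" as "run every stage" is an assumption rather than documented behaviour.

```python
import argparse

# Stage flags in pipeline order, with the descriptions listed above.
STAGES = [
    ("fetch", "Fetch all raw data from source files"),
    ("prepare", "Merge and align raw data into a single weekly dataset"),
    ("features", "Perform feature engineering (interpolation, volatility, seasonality)"),
    ("train", "Train and benchmark all predictive models with hyperparameter tuning"),
    ("report", "Generate model comparison dashboards and summary tables"),
]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing one boolean flag per pipeline stage."""
    parser = argparse.ArgumentParser(description="Freight forecasting pipeline")
    for name, help_text in STAGES:
        parser.add_argument(f"--{name}", action="store_true", help=help_text)
    return parser


def selected_stages(argv: list[str]) -> list[str]:
    """Return the requested stages in pipeline order.

    Assumption: passing no flags at all selects every stage.
    """
    args = build_parser().parse_args(argv)
    chosen = [name for name, _ in STAGES if getattr(args, name)]
    return chosen or [name for name, _ in STAGES]


# Flags may be given in any order; stages come back in pipeline order.
print(selected_stages(["--prepare", "--fetch"]))  # prints ['fetch', 'prepare']
```

With a parser of this shape, a call such as `python main_cli.py --fetch --prepare` would run only the first two stages.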

Pipeline Stages

Data Fetching: Scripts in data_pipeline/ load raw data from Excel/CSV files, perform initial cleaning, and save intermediate processed CSVs.
Data Preparation: Datasets are loaded and merged, resampled to a consistent weekly (Monday) frequency, and aligned as time series.
Feature Engineering: Missing values are interpolated, volatility indicators are computed, and seasonal/trend components are extracted for key variables.
Model Training: Multiple models are trained and benchmarked, including:
- Auto ARIMA
- SARIMAX with exogenous variables
- Lasso (with and without lags)
- Ridge Regression
- Support Vector Regression (with hyperparameter tuning)
- XGBoost Regression (with hyperparameter tuning)
- Prophet (uni- and multivariate, tuned)
Results Reporting: Performance summaries, metrics logs, and comparison plots are generated and saved in reports/models/.
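To make the preparation and feature-engineering stages more concrete, here is a minimal pandas sketch on synthetic data. It is not the project's code: the column name price, the linear interpolation method, the use of log returns, and the 4-week rolling window are all assumptions chosen for illustration. Seasonal/trend extraction is not covered.

```python
import numpy as np
import pandas as pd

# Synthetic daily series standing in for one raw dataset (hypothetical data).
# 2024-01-01 is a Monday.
dates = pd.date_range("2024-01-01", periods=70, freq="D")
raw = pd.DataFrame({"price": np.linspace(100.0, 169.0, 70)}, index=dates)
raw.iloc[20:34, 0] = np.nan  # simulate two weeks of missing observations

# Data preparation: resample to weekly bins labelled by Monday.
# With "W-MON", each bin ends on (and includes) a Monday.
weekly = raw.resample("W-MON").mean()

# Feature engineering: fill the gap left by the missing weeks ...
weekly["price"] = weekly["price"].interpolate(method="linear")

# ... and derive a simple volatility indicator: the rolling standard
# deviation of weekly log returns over an assumed 4-week window.
weekly["log_return"] = np.log(weekly["price"]).diff()
weekly["volatility_4w"] = weekly["log_return"].rolling(window=4).std()

print(weekly.tail())
```

The first rows of volatility_4w are NaN because the rolling window needs four returns before it can produce a value; a real pipeline would have to decide whether to drop or back-fill those rows before training.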

Output

Processed Data: data/processed/processed.csv (final merged and feature-engineered dataset)

Model Performance Plots: Saved in reports/models/ as PNG files

Model Comparison Dashboard: Summary plots comparing MAE and R² scores across models
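For reference, the two metrics shown in the dashboard can be computed as in the sketch below. The helper names mae and r2 and the sample numbers are hypothetical; this README does not say how the project computes them (scikit-learn's mean_absolute_error and r2_score would give the same values).

```python
import numpy as np


def mae(y_true, y_pred) -> float:
    """Mean absolute error: average of |actual - predicted|."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def r2(y_true, y_pred) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)


# Made-up actual and predicted prices, for illustration only.
actual = [100, 110, 120, 130]
predicted = [102, 108, 123, 129]

print(f"MAE = {mae(actual, predicted):.3f}")  # MAE = 2.000
print(f"R2  = {r2(actual, predicted):.3f}")   # R2  = 0.964
```

MAE is expressed in the same units as the target, so lower is better; R² is unitless, with 1 indicating a perfect fit.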

Troubleshooting

Ensure all dependencies in requirements.txt are installed.

Confirm that raw data files exist in /data/raw/ before fetching or preprocessing.

Check that data/processed/ contains necessary intermediate files before training models.
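The file checks above can also be scripted. The following is a small sketch, assuming the directory layout described in this README; the helper name check_inputs is hypothetical, the set of raw-file extensions is a guess, and using data/processed/processed.csv as the file required before training is an assumption based on the Output section.

```python
from pathlib import Path

# Assumption: raw inputs use one of these Excel/CSV extensions.
RAW_SUFFIXES = {".csv", ".xlsx", ".xls"}


def check_inputs(root: Path) -> list[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []

    # Raw data must exist before fetching or preprocessing.
    raw_dir = root / "data" / "raw"
    raw_files = []
    if raw_dir.is_dir():
        raw_files = [p for p in raw_dir.iterdir() if p.suffix.lower() in RAW_SUFFIXES]
    if not raw_files:
        problems.append(f"no raw Excel/CSV files found in {raw_dir}")

    # The processed dataset must exist before training models.
    processed = root / "data" / "processed" / "processed.csv"
    if not processed.is_file():
        problems.append(f"missing {processed}; run the earlier stages first")

    return problems


# Check the current working directory and report anything missing.
for problem in check_inputs(Path(".")):
    print("WARNING:", problem)
```

Run from the repository root, this prints one warning per missing input and nothing if everything is in place.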

Contributing

Contributions are welcome! Feel free to open an issue or write to me to discuss your ideas.

Author

Stefan Pilegaard Pedersen, May 2025

License

This project is licensed under the GNU General Public License v3.0.
