MLGuard

MLGuard is a lightweight machine learning experiment management and data validation toolkit designed to help data scientists quickly validate datasets, detect potential data leakage, and benchmark multiple machine learning models.

It provides a simple interface to run experiments and compare models with minimal setup.

Why MLGuard?

When working with machine learning pipelines, developers often face problems such as:

Undetected data leakage
Repeated model experimentation scripts
Difficulty comparing models quickly
Poor experiment organization

MLGuard helps solve these problems by providing:

Automated data leakage detection
Automatic problem type detection
Built-in experiment manager
Simple model comparison tools

Features

Data Leakage Detection

Detects potential target leakage by analyzing correlations between features and the target variable.

Experiment Manager

Runs multiple machine learning models automatically and evaluates their performance.

Model Comparison

Compares model performance using appropriate evaluation metrics.

Automatic Problem Detection

Detects whether the task is:

Regression
Classification

Installation

Install from PyPI:

pip install mlguardlabs

Install locally for development:

pip install -e .

Quick Example

Regression Example

from sklearn.datasets import fetch_california_housing
from mlguard import ExperimentManager

data = fetch_california_housing(as_frame=True)

X = data.data
y = data.target

exp = ExperimentManager()

exp.fit(X, y)

print(exp.compare())

Example output:

Running leakage detection...
No leakage detected

Detected problem type: regression

[('rf', 0.50), ('ridge', 0.74)]

Classification Example

from sklearn.datasets import load_iris
from mlguard import ExperimentManager

data = load_iris(as_frame=True)

X = data.data
y = data.target

exp = ExperimentManager()

exp.fit(X, y)

print(exp.compare())

Data Leakage Detection Example

from mlguard import LeakageDetector

detector = LeakageDetector()

detector.fit(X, y)

print(detector.report())

This prints features that have unusually high correlation with the target.

How MLGuard Works

MLGuard runs a lightweight validation and experimentation pipeline:

Dataset
   ↓
Leakage Detection
   ↓
Problem Type Detection
   ↓
Model Experiments
   ↓
Model Evaluation
   ↓
Model Comparison

This allows users to quickly answer:

Which model works best for my dataset?

Project Structure

mlguard/
│
├── experiments
│   └── experiment_manager.py
│
├── inspection
│   └── leakage_detector.py
│
├── metrics
│   ├── regression_metrics.py
│   └── classification_metrics.py
│
├── utils
│   └── problem_type.py
│
├── examples
├── tests

Example Workflow

Typical machine learning workflow using MLGuard:

EDA
↓
Feature Engineering
↓
Feature Selection
↓
MLGuard Experiment Manager
↓
Best Model Selection
↓
Hyperparameter Tuning
↓
Final Model

MLGuard helps simplify the experimentation stage.

Roadmap

Future planned features include:

Cross-validation experiment engine
Dataset inspector
Automatic preprocessing pipelines
Feature validation tools
Experiment tracking

Contributing

Contributions are welcome!

Steps:

Fork the repository
Create a feature branch
Submit a pull request

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
examples		examples
mlguard		mlguard
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
testing.py		testing.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MLGuard

Why MLGuard?

Features

Data Leakage Detection

Experiment Manager

Model Comparison

Automatic Problem Detection

Installation

Quick Example

Regression Example

Classification Example

Data Leakage Detection Example

How MLGuard Works

Project Structure

Example Workflow

Roadmap

Contributing

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

MLGuard

Why MLGuard?

Features

Data Leakage Detection

Experiment Manager

Model Comparison

Automatic Problem Detection

Installation

Quick Example

Regression Example

Classification Example

Data Leakage Detection Example

How MLGuard Works

Project Structure

Example Workflow

Roadmap

Contributing

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages