eharshit/ml-guard

Production ML pipeline that monitors models for drift (PSI, KL divergence, and KS tests), automatically triggers retraining when performance degrades, and deploys validated models with zero downtime.


Why I'm Building This

Most ML models decay over time.
Data changes. Relationships shift. Performance drops silently.

That's something I kept noticing while going through ML tutorials. Almost all of them end at the same point: "My model hits 95% accuracy!" And then… nothing.

Nobody talks about what happens three months later when user behavior changes, inputs drift, and that same model quietly slides to 70% accuracy while still running in production.

That gap bothered me.

So I'm building a system that monitors models 24/7, catches problems early, and fixes itself without manual intervention.

What It Does

Monitors incoming data for distribution changes (see the drift-check sketch after this list)
Detects when model performance drops
Retrains automatically on fresh data
Validates new models before deployment
Deploys gradually with zero downtime
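
As a reference point, here is a minimal sketch of the kind of per-feature drift check described above, combining PSI with a two-sample KS test. The function names and thresholds (PSI > 0.2, KS p-value < 0.01) are illustrative assumptions, not this repository's actual code.

```python
# Minimal drift-check sketch: `reference` is a training-time feature sample,
# `current` is a recent production sample. Thresholds are illustrative, not tuned.
import numpy as np
from scipy.stats import ks_2samp

def psi(reference: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between two 1-D samples."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_pct = np.histogram(reference, bins=edges)[0] / len(reference)
    cur_pct = np.histogram(current, bins=edges)[0] / len(current)
    # Clip to avoid division by zero / log(0) in sparse bins
    ref_pct = np.clip(ref_pct, 1e-6, None)
    cur_pct = np.clip(cur_pct, 1e-6, None)
    return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))

def feature_drifted(reference: np.ndarray, current: np.ndarray) -> bool:
    """Flag a feature as drifted if either PSI or the KS test crosses a threshold."""
    ks_result = ks_2samp(reference, current)
    return psi(reference, current) > 0.2 or ks_result.pvalue < 0.01

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    train_sample = rng.normal(0.0, 1.0, 10_000)
    prod_sample = rng.normal(0.5, 1.2, 10_000)  # shifted distribution
    print(feature_drifted(train_sample, prod_sample))  # True
```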


Tech Stack

Category                Technology             Purpose
Training & Experiments  Scikit-learn, XGBoost  Model training
Training & Experiments  MLflow                 Experiment tracking, model versioning
Automation              Airflow                Schedules drift checks and retraining (see DAG sketch below)
Automation              Docker                 Containerization
Serving                 FastAPI, BentoML       Prediction API (see serving sketch below)
Monitoring              Grafana                Dashboards
Monitoring              Slack                  Alerts
Deployment              GitHub Actions         CI/CD

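And a minimal FastAPI serving sketch showing the shape of a prediction endpoint. The model path, feature schema, and the note about logging features for the drift job are assumptions, not the project's actual API.

```python
# Hypothetical serving sketch: a /predict endpoint backed by a pickled sklearn model.
from typing import List

import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="ml-guard prediction API (sketch)")
model = joblib.load("model.joblib")  # placeholder path

class Features(BaseModel):
    values: List[float]

@app.post("/predict")
def predict(features: Features) -> dict:
    prediction = model.predict([features.values])[0]
    # In the real pipeline, incoming features would also be written to a store that
    # the scheduled drift check reads, so production data can be compared against
    # the training distribution.
    return {"prediction": float(prediction)}
```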