Model and Data Versioning

  • Apache Marvin - A platform for model deployment and versioning that hides the complexity under the hood: data scientists only need to set up the server and write their code in an extended Jupyter notebook.
  • Catalyst - High-level utils for PyTorch DL & RL research. It was developed with a focus on reproducibility, fast experimentation, and reuse of code and ideas.
  • D6tflow - A Python library for building complex data science workflows.
  • DAGsHub - The home for data science collaboration. A platform, based on DVC, for data science project management and collaboration.
  • Data Version Control (DVC) - An open-source version control tool that works alongside Git (it is not a Git fork) to version data sets and models.
  • FGLab - Machine learning dashboard, designed to make prototyping experiments easier.
  • Flor - Easy-to-use logger and automatic version controller made for data scientists who write ML code.
  • Hangar - Version control for tensor data, with Git-like semantics on numerical data and a focus on speed and efficiency.
  • Kedro - A workflow development tool that helps you build data pipelines that are robust, scalable, deployable, reproducible and versioned. Kedro workflows can be visualized with kedro-viz.
  • MLflow - Open source platform to manage the ML lifecycle, including experimentation, reproducibility and deployment.
  • MLWatcher - A Python agent that records a large variety of time-series metrics from your running ML classification algorithm, so you can monitor it in real time.
  • ModelChimp - Framework to track and compare all the results and parameters from machine learning models - (Video)
  • ModelDB - An open-source system to version machine learning models, including their ingredients (code, data, config, and environment), and to track ML metadata across the model lifecycle.
  • Pachyderm - Open source distributed processing framework built on Kubernetes, focused mainly on dynamic building of production machine learning pipelines - (Video)
  • Polyaxon - A platform for reproducible and scalable machine learning and deep learning on Kubernetes - (Video)
  • PredictionIO - An open source machine learning server built on top of a state-of-the-art open source stack for developers and data scientists to create predictive engines for any machine learning task.
  • Quilt Data - Versioning, reproducibility and deployment of data and models.
  • Sacred - Tool to help you configure, organize, log and reproduce machine learning experiments.
  • steppy - Lightweight Python 3 library for fast and reproducible machine learning experimentation. Introduces a simple interface that enables clean machine learning pipeline design.
  • Studio.ML - Model management framework which minimizes the overhead involved with scheduling, running, monitoring and managing artifacts of your machine learning experiments.
  • TRAINS - Auto-Magical Experiment Manager & Version Control for AI.