The research awesome list can be found here.
A curated list of machine learning frameworks and tools, inspired by awesome-machine-learning.
Contributions welcome! Read the contribution guidelines first.
- Common-Voice - Multi-language, open-source database of voice samples that anyone can use to train speech-enabled applications.
- AI-Blocks - a powerful and intuitive WYSIWYG interface that allows anyone to create Machine Learning models.
- Luna Studio - Hybrid textual and visual functional programming
- fast.ai - The fastai library simplifies training fast and accurate neural nets using modern best practices
- PyTorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration
- tensorflow - An Open Source Machine Learning Framework for Everyone by Google
- neon - Intel® Nervana™ reference deep learning framework committed to best performance on all hardware
- cleverhans - An adversarial example library for constructing attacks, building defenses, and benchmarking both
- Netron - a viewer for neural network, deep learning and machine learning models. An online viewer is also available.
- List of conversion tools for DNN models - a list of libraries (GitHub projects) for converting DNN models between different frameworks
- umap - dimension reduction technique that can be used for visualisation similarly to t-SNE, but also for general non-linear dimension reduction
- DALI - A library containing both highly optimized building blocks and an execution engine for data pre-processing in deep learning applications (docs)
- gin-config - Gin provides a lightweight configuration framework for Python, by Google.
- imbalanced-learn - A Python package offering a number of re-sampling techniques. It is compatible with scikit-learn and is part of the scikit-learn-contrib projects.
- mlxtend - A library of extension and helper modules for Python's data analysis and machine learning libraries.
- numpy - The fundamental package for scientific computing with Python.
- PyOD - Outlier detection library
- RAPIDS - Open GPU Data Science. More here or in cheatsheet
- scikit-learn - machine learning in Python
- scikit-learn-laboratory (SKLL) - CLI for sklearn, working with configuration files
- scipy - open-source software for mathematics, science, and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more.
- statsmodels - statistical modeling and econometrics in Python: time-series analysis, survival analysis, discrete models, Generalized Linear Models
- SymPy - A computer algebra system written in pure Python, library for symbolic mathematics
- Vaex - Out-of-Core DataFrames for Python, visualize and explore big tabular data at a billion rows per second. Project page
- Allen NLP - An open-source NLP research library, built on PyTorch.
- PyText - A natural language modeling framework based on PyTorch by Facebook Research
- pytorch-transformers - A library of state-of-the-art pretrained models for Natural Language Processing (NLP), including BERT, GPT, GPT-2, Transformer-XL, XLNet and XLM, with multiple pre-trained model weights
- flair - A very simple framework for state-of-the-art Natural Language Processing (NLP) by Zalando Research
- gensim - Topic modeling for humans. Enables analysis of plain-text documents for semantic structure. Compatible with Word2Vec, FastText and other NLP models.
- spaCy - spaCy is a library for advanced Natural Language Processing in Python and Cython. spaCy comes with pretrained statistical models and word vectors, and supports tokenization for 50+ languages.
- TextBlob - Simple, Pythonic text processing: sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
- hyperopt - Distributed Asynchronous Hyperparameter Optimization in Python
- nevergrad - A Python toolbox for performing gradient-free optimization by Facebook Research
- pyro - Deep universal probabilistic programming with Python and PyTorch by Uber
- pgmpy - Python Library for Probabilistic Graphical Models
- surprise - A Python scikit for building and analyzing recommender systems
- warp-ctc - CTC loss function for training on data and labels that are not aligned, by Baidu Research
- DeepSpeech - A TensorFlow implementation of Baidu's DeepSpeech architecture
- speech-to-text-wavenet - End-to-end sentence-level English speech recognition based on DeepMind's WaveNet and TensorFlow
- pykaldi - A Python wrapper for Kaldi - a toolkit for speech recognition
- pytorch-kaldi - A project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
- gradio - Library to easily integrate models into existing Python (web) apps.
- glow - Compiler for Neural Network hardware accelerators by PyTorch
- jax - Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more by Google
- numba - NumPy aware dynamic Python compiler using LLVM
- csvkit - A suite of utilities for converting to and working with CSV, the king of tabular file formats.
- redash - Connect to any data source, easily visualize, dashboard and share your data.
- odo - Odo migrates between many formats. These include in-memory structures like list, pd.DataFrame and np.ndarray and also data outside of Python like CSV/JSON/HDF5 files, SQL databases, data on remote machines, and the Hadoop File System.
- doccano - Open source text annotation tool for machine learning practitioners.
- snorkel - A system for quickly generating training data with weak supervision
- scrapy - high-level library to write crawlers and spiders.
- Quilt - Quilt versions and deploys data
- matplotlib - plotting with Python
- bokeh - Interactive Web Plotting for Python
- plotly - An open-source, interactive graphing library for Python
- dash - Analytical Web Apps for Python. No JavaScript Required.
- Jupyter Dashboards - Jupyter layout extension
- vega - visualization grammar, a declarative format for creating, saving, and sharing interactive visualization designs. With Vega you can describe data visualizations in a JSON format, and generate interactive views using either HTML5 Canvas or SVG.
- schema crawler - a tool to visualize database schema
- scikit-plot - sklearn wrapper to automate frequently used machine learning visualizations.
- featuretools - an open source python framework for automated feature engineering.
- nvtop - An (h)top-like task monitor for NVIDIA GPUs. It can handle multiple GPUs and print information about them in a way familiar to htop users.
- s2i - Source-to-Image (S2I) is a toolkit and workflow for building reproducible container images from source code. S2I produces ready-to-run images by injecting source code into a container image and letting the container prepare that source code for execution. By creating self-assembling builder images, you can version and control your build environments exactly like you use container images to version your runtime environments.
- luigi - Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in. By Spotify.
- pywren - parfor on AWS Lambda
- horovod - Distributed training framework for TensorFlow, Keras, PyTorch, and MXNet by Uber.
- dask - library for parallel computing in Python with dynamic task scheduling: numpy computation graphs.
- shap - A unified approach to explain the output of any machine learning model
- tensorboardX - tensorboard for pytorch (and chainer, mxnet, numpy, ...)
- Weights and Biases - Experiment Tracking for Deep Learning
- pandas-profiling - A tool that generates an exploratory data analysis of a given DataFrame and presents the results as an HTML report
To the extent possible under law, Netguru has waived all copyright and related or neighboring rights to this work. See LICENSE.