Release v0.0.1: First release · gmihaila/ml_things

This is my first release of the Machine Learning Things package.

From this version onwards I will keep track of the updates.

Installation

This repo is tested with Python 3.6+.

It's always good practice to install ml_things in a virtual environment. If you need guidance on using Python's virtual environments, you can check out the user guide here.

You can install ml_things with pip from GitHub:

pip install git+https://github.com/gmihaila/ml_things

Current features:

Functions

All functions are implemented in the ml_things module.

Array Functions: Array manipulation functions that can be useful when working with machine learning.
- pad_array: Pad a variable-length array to a fixed-size numpy array. It can handle single arrays [1,2,3] or nested arrays [[1,2],[3]].
- batch_array: Split a list into batches/chunks. The last batch holds whatever values remain in the list. Note: This is also called chunking. I call it batching since I use it more in ML.
Plot Functions: Plotting functions that can be useful when working with machine learning.
- plot_array: Create plot from a single array of values.
- plot_dict: Create plot from a single array of values.
- plot_confusion_matrix: This function prints and plots the confusion matrix.
Text Functions: Text-related functions that can be useful when working with machine learning.
- clean_text: Clean text using various techniques.
Web Related: Web-related functions that can be useful when working with machine learning.
- download_from: Download a file from a URL. It returns the path of the downloaded file.
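
To make the padding and batching descriptions above concrete, here is a minimal pure-Python sketch of the two ideas. It is not the ml_things implementation: the names `chunk_list` and `pad_rows`, their parameters, and the choice to return plain lists instead of numpy arrays are assumptions made for illustration only.

```python
from typing import List, Sequence


def chunk_list(values: Sequence, batch_size: int) -> List[list]:
    """Split `values` into consecutive batches of at most `batch_size` items.

    Hypothetical helper: the last batch holds whatever items remain.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return [list(values[i:i + batch_size]) for i in range(0, len(values), batch_size)]


def pad_rows(rows: Sequence[Sequence[int]], length: int, pad_value: int = 0) -> List[List[int]]:
    """Truncate or right-pad every row so it has exactly `length` items.

    Hypothetical helper: returns nested lists, not a numpy array.
    """
    padded = []
    for row in rows:
        kept = list(row[:length])  # drop anything beyond the target length
        padded.append(kept + [pad_value] * (length - len(kept)))
    return padded


batches = chunk_list([1, 2, 3, 4, 5], batch_size=2)
padded = pad_rows([[1, 2], [3]], length=3)
print(batches)  # [[1, 2], [3, 4], [5]]
print(padded)   # [[1, 2, 0], [3, 0, 0]]
```

The actual `pad_array` and `batch_array` functions may take different arguments and handle more cases, so check their docstrings before relying on this behavior.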

Snippets

This is a large variety of Python snippets without a particular theme. I list the most frequently used ones first while keeping a logical order.
I like to keep them as simple and as efficient as possible.

Name	Description
Read File	One liner to read any file.
Write File	One liner to write a string to a file.
Debug	Start debugging after this line.
Pip Install GitHub	Install library directly from GitHub using `pip`.
Parse Argument	Parse arguments given when running a `.py` file.
Doctest	How to run a simple unit test using function documentation. Useful when you need to run unit tests inside a notebook.
Fix Text	Since text data is always messy, I always use it. It is great at fixing any bad Unicode.
Current Date	How to get the current date in Python. I use this when I need to name log files.
Current Time	Get current time in Python.
Remove Punctuation	The fastest way to remove punctuation in Python3.
PyTorch-Dataset	Code sample on how to create a PyTorch Dataset.
PyTorch-Device	How to set up the device in PyTorch to detect if a GPU is available.
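
As a rough illustration of two rows in the table above (Remove Punctuation and Current Date), here is a standard-library sketch. Whether the repo's snippets use the same approach is an assumption: `remove_punctuation`, `log_file_name`, the timestamp format, and the example date are all hypothetical choices made for this example.

```python
import string
from datetime import datetime

# Translation table that deletes every ASCII punctuation character.
PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation)


def remove_punctuation(text: str) -> str:
    """Strip ASCII punctuation using str.translate (hypothetical helper)."""
    return text.translate(PUNCTUATION_TABLE)


def log_file_name(prefix: str, now: datetime) -> str:
    """Build a log file name from a timestamp (format is an assumption)."""
    return f"{prefix}_{now.strftime('%Y-%m-%d_%H-%M-%S')}.log"


cleaned = remove_punctuation("Hello, world! It's 5 o'clock.")
name = log_file_name("train", datetime(2020, 1, 2, 9, 30, 0))  # arbitrary example date
print(cleaned)  # Hello world Its 5 oclock
print(name)     # train_2020-01-02_09-30-00.log
```

In real use you would pass `datetime.now()` instead of a fixed date. Note that `string.punctuation` only covers ASCII, so characters such as curly quotes are left untouched.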

Notebooks Tutorials

This is where I keep notebooks from some previous projects that I turned into small tutorials. I often use them as the basis for starting a new project.

All of the notebooks are in Google Colab. Never heard of Google Colab? 🙀 You have to check out the Overview of Colaboratory, Introduction to Colab and Python, and what I think is a great Medium article about it, Configure Google Colab Like a Pro.

If you check /ml_things/notebooks/, you will see that many notebooks are not listed here because they are not in a 'polished' form yet. These are the notebooks that are good enough to share with everyone:

Name	Description	Links
🍇 Better Batches with PyTorchText BucketIterator	How to use PyTorchText BucketIterator to sort text data for better batching.
🐶 Pretrain Transformers Models in PyTorch using Hugging Face Transformers	Pretrain 67 transformers models on your custom dataset.
🎻 Fine-tune Transformers in PyTorch using Hugging Face Transformers	Complete tutorial on how to fine-tune 73 transformer models for text classification — no code changes necessary!
⚙️ Bert Inner Workings in PyTorch using Hugging Face Transformers	Complete tutorial on how an input flows through Bert.
🎱 GPT2 For Text Classification using Hugging Face 🤗 Transformers	Complete tutorial on how to use GPT2 for text classification.
