v0.0.1: First release

@gmihaila gmihaila released this 05 Feb 18:07
· 2 commits to master since this release

This is my first release of the Machine Learning Things package.

From this version onwards I will keep track of the updates.

Installation

This repo is tested with Python 3.6+.

It's always good practice to install ml_things in a virtual environment. If you need guidance on using Python's virtual environments, you can check out the user guide.

You can install ml_things with pip from GitHub:

pip install git+https://github.com/gmihaila/ml_things

Current features:

Functions

All functions are implemented in the ml_things module.

  • Array Functions: Array manipulation functions that can be useful when working with machine learning.

    • pad_array: Pad a variable-length array to a fixed-size numpy array. It can handle single arrays [1,2,3] or nested arrays [[1,2],[3]].
    • batch_array: Split a list into batches/chunks. The last batch holds whatever values remain. Note: This is also called chunking. I call it batching since I mostly use it in ML.
  • Plot Functions: Plotting functions that can be useful when working with machine learning.

    • plot_array: Create plot from a single array of values.
    • plot_dict: Create plot from a single array of values.
    • plot_confusion_matrix: This function prints and plots the confusion matrix.
  • Text Functions: Text processing functions that can be useful when working with machine learning.

    • clean_text: Clean text using various techniques.
  • Web Related: Web-related functions that can be useful when working with machine learning.

    • download_from: Download a file from a URL. It returns the path of the downloaded file.
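
To make the padding and batching descriptions above concrete, here is a minimal dependency-free sketch of both ideas. It does not call ml_things and is not its implementation: the names chunk_list and pad_rows, their parameters, and the choice of 0 as the pad value are assumptions made for illustration, and the sketch returns plain nested lists rather than a numpy array.

```python
from typing import List, Sequence


def chunk_list(values: Sequence, batch_size: int) -> List[list]:
    """Split values into consecutive batches; the last batch holds the remainder.

    Hypothetical helper, not the ml_things API.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return [list(values[i:i + batch_size])
            for i in range(0, len(values), batch_size)]


def pad_rows(rows: Sequence[Sequence[int]], fixed_length: int,
             pad_value: int = 0) -> List[list]:
    """Truncate or right-pad every row so it has exactly fixed_length items.

    Hypothetical helper, not the ml_things API.
    """
    padded = []
    for row in rows:
        row = list(row)[:fixed_length]  # drop anything past fixed_length
        padded.append(row + [pad_value] * (fixed_length - len(row)))
    return padded


batches = chunk_list([1, 2, 3, 4, 5], batch_size=2)
print(batches)  # [[1, 2], [3, 4], [5]]

print(pad_rows([[1, 2], [3]], fixed_length=3))  # [[1, 2, 0], [3, 0, 0]]
```

Padding on the right and truncating long rows is only one possible convention. Check the function docstrings in the ml_things module for the behavior the package actually provides.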

Snippets

This is a large collection of Python snippets with no particular theme. I list the ones I use most frequently and keep them in a logical order.
I like to keep them as simple and as efficient as possible.

  • Read File: One-liner to read any file.
  • Write File: One-liner to write a string to a file.
  • Debug: Start debugging after this line.
  • Pip Install GitHub: Install a library directly from GitHub using pip.
  • Parse Argument: Parse arguments given when running a .py file.
  • Doctest: How to run a simple unit test using function documentation. Useful when you need to run unit tests inside a notebook.
  • Fix Text: Text data is always messy, so I always use this. It is great at fixing bad Unicode.
  • Current Date: How to get the current date in Python. I use this when I need to name log files.
  • Current Time: Get the current time in Python.
  • Remove Punctuation: The fastest way to remove punctuation in Python 3.
  • PyTorch-Dataset: Code sample on how to create a PyTorch Dataset.
  • PyTorch-Device: How to set up the device in PyTorch to detect if a GPU is available.
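
As a rough idea of what a few of the snippets above look like, here is a short standard-library sketch covering file write/read, punctuation removal, a date-based log file name, and a doctest. These are common idioms and not necessarily the exact snippets stored in the repo. The file name, the log name format, and the add function are assumptions made for illustration.

```python
import datetime
import doctest
import string
import tempfile
from pathlib import Path

# Write a string to a file, then read it back (one line each).
# The file name is hypothetical; it lives in the system temp directory.
path = Path(tempfile.gettempdir()) / "ml_things_snippet_demo.txt"
path.write_text("Hello, world! It's a test.", encoding="utf-8")
text = path.read_text(encoding="utf-8")

# Remove punctuation with a translation table that deletes every
# character listed in string.punctuation.
no_punct = text.translate(str.maketrans("", "", string.punctuation))
print(no_punct)  # Hello world Its a test

# Current date, formatted so it can be used in a log file name.
log_name = f"run_{datetime.date.today():%Y-%m-%d}.log"
print(log_name)


def add(a, b):
    """Return the sum of a and b.

    >>> add(2, 3)
    5
    """
    return a + b


# Run the examples in the docstring of add. This prints nothing when
# they pass, which makes it convenient inside a notebook cell.
doctest.run_docstring_examples(add, {"add": add})
```

The translation-table approach is used here because it removes all ASCII punctuation in a single pass over the string. It does not touch non-ASCII punctuation such as curly quotes.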

Notebooks Tutorials

This is where I keep notebooks from previous projects that I turned into small tutorials. I often use them as a starting point for a new project.

All of the notebooks are in Google Colab. Never heard of Google Colab? 🙀 Check out the Overview of Colaboratory, Introduction to Colab and Python, and what I think is a great Medium article on how to configure Google Colab Like a Pro.

If you check /ml_things/notebooks/, you will find many notebooks that are not listed here because they are not in a 'polished' form yet. These are the notebooks that are good enough to share with everyone:

  • 🍇 Better Batches with PyTorchText BucketIterator: How to use PyTorchText BucketIterator to sort text data for better batching.
  • 🐶 Pretrain Transformers Models in PyTorch using Hugging Face Transformers: Pretrain 67 transformers models on your custom dataset.
  • 🎻 Fine-tune Transformers in PyTorch using Hugging Face Transformers: Complete tutorial on how to fine-tune 73 transformer models for text classification, with no code changes necessary!
  • ⚙️ Bert Inner Workings in PyTorch using Hugging Face Transformers: Complete tutorial on how an input flows through Bert.
  • 🎱 GPT2 For Text Classification using Hugging Face 🤗 Transformers: Complete tutorial on how to use GPT2 for text classification.