MatsuLM is a simple Neural Network Language Modeling (NNLM) toolkit that supports the research and development of neural language models. MatsuLM is built on top of PyTorch and offers simplified tools to modify, train, and track NNLM training.
This tool was developed as a master's thesis for the Department of Signal Processing and Acoustics at Aalto University. The thesis can be found here: MatsuLM
Training results are tracked with Sacred, and the recommended tool for viewing them is Omniboard.
$ ./get_data.sh
$ pip3 install -r requirements.txt
$ python3 main.py
Run and view results from local Omniboard (demo)
- Install and run Docker on your machine
- Run the following commands:
$ ./get_data.sh
$ make local_sacred_docker
$ python3 main.py --sacred_mongo "docker"
- Track the training results at http://localhost:9000/sacred
Run and save/view results from remote Omniboard (demo)
For long-term training and development (for example, in a research project), I suggest creating a database in MongoDB Atlas. It is free (for this amount of data), easy to set up, and makes it convenient to train models on multiple machines while saving all the training results in one place.
When saving the training results in the cloud, I also recommend running Omniboard remotely as a website, like this: https://ai.riko.io/.
Here you can find the instructions on:
Then just run on any machine:
$ ./get_data.sh
$ pip3 install -r requirements.txt
$ python3 main.py --sacred_mongo "mongodb://<username>:<password>@<host>/<database>"
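If the username or password contains characters such as @ or /, they need to be percent-encoded before they go into the connection URI. Below is a minimal sketch of assembling the value for --sacred_mongo. The helper name build_mongo_uri, the example credentials, and the example host are made up for illustration and are not part of MatsuLM. Atlas typically hands out URIs that start with mongodb+srv://, which is why the scheme is an argument.

```python
from urllib.parse import quote_plus


def build_mongo_uri(username, password, host, database, scheme="mongodb"):
    """Assemble a MongoDB connection URI with percent-encoded credentials.

    Hypothetical helper for illustration; MatsuLM only expects the final
    string as the value of --sacred_mongo.
    """
    user = quote_plus(username)
    pwd = quote_plus(password)
    return f"{scheme}://{user}:{pwd}@{host}/{database}"


# Made-up credentials and host, used only to show the encoding.
uri = build_mongo_uri("alice", "p@ss/word", "cluster0.example.mongodb.net", "sacred")
print(uri)
# mongodb://alice:p%40ss%2Fword@cluster0.example.mongodb.net/sacred
```

The printed string can then be passed in quotes as the --sacred_mongo value.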
- Run
$ ./get_data.sh
to acquire the Penn Treebank and WikiText-2 (paper) datasets
- Run
$ python3 main.py
to just train the base model, or run
$ python3 main.py --sacred_mongo "docker"
to save the training results to Sacred
- (Optional) Run
$ make local_sacred_docker
to create two Docker containers. One contains a MongoDB instance where Sacred can save the training results, and the other contains a UI (called Omniboard) that serves Sacred's data at http://localhost:9000/sacred
- Add your own parameters to main.py
- Add your own training data by creating a folder that contains test.txt, train.txt, and valid.txt files, and pass the folder's path with the --data parameter. For example,
$ python3 main.py --data "data/example/"
- Add your own hyperparameter search by listing parameters in the parameters dictionary in main.py
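The last two items can be sketched in a few lines of Python. The first part writes and checks a data folder with the three expected files. The second part shows one common way a dictionary of candidate values can be expanded into individual runs. This is only an illustration under assumptions: the function names, the placeholder text, and the keys lr and dropout are hypothetical, and MatsuLM's own parameters dictionary may be structured and consumed differently.

```python
import itertools
import tempfile
from pathlib import Path

REQUIRED_FILES = ("train.txt", "valid.txt", "test.txt")


def make_example_corpus(folder):
    """Write tiny placeholder splits so the folder has the expected layout."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    for name in REQUIRED_FILES:
        (folder / name).write_text("the quick brown fox\n", encoding="utf-8")
    return folder


def missing_splits(folder):
    """Return the required file names that are absent from the folder."""
    folder = Path(folder)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


def expand_grid(search_space):
    """Turn {name: [candidates]} into one dict per combination."""
    names = list(search_space)
    combos = itertools.product(*(search_space[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


# Hypothetical search space; the keys MatsuLM actually reads may differ.
parameters = {"lr": [30, 10], "dropout": [0.2, 0.4]}

with tempfile.TemporaryDirectory() as tmp:
    corpus = make_example_corpus(Path(tmp) / "example")
    print(missing_splits(corpus))    # [] because all three files exist

print(len(expand_grid(parameters)))  # 4 combinations (2 x 2)
```

Checking the folder before a long training run avoids failing late on a missing split, and expanding the grid up front makes the number of runs explicit.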
Sacred - tracking the training results
Demo: https://ai.riko.io/
Every experiment is sacred
Every experiment is great
If an experiment is wasted
God gets quite irate
I strongly suggest creating a MongoDB Atlas account for saving the training results with Sacred. MongoDB Atlas is free (for this amount of data), easy to set up, and makes it convenient to train models on multiple machines while saving all the experiment results in one place.
The easiest and most practical way to use Sacred is to save the training results in MongoDB Atlas. Here are the instructions for setting that up.
- Create a free MongoDB database in MongoDB Atlas and get the MongoDB connection URI from there.
- The MongoDB connection URI can then be used to save the training results with Sacred by adding the following --sacred_mongo parameter:
$ python3 main.py --sacred_mongo "mongodb://<username>:<password>@<host>/<database>"
- When you want to see these training results, you can start Omniboard. See the instructions for a quick start.
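For orientation, here is a rough sketch of how Sacred can record a run in MongoDB through its MongoObserver. It is not MatsuLM's own code: the experiment name, the train function, the metric name, and the placeholder loss values are all hypothetical, and the MongoObserver constructor call assumes Sacred 0.8 or newer.

```python
def train(log_scalar, epochs=3):
    """Stand-in training loop that reports a placeholder loss per epoch."""
    loss = 6.0
    for epoch in range(1, epochs + 1):
        loss *= 0.9  # placeholder for a real training epoch
        log_scalar("valid.loss", loss, epoch)
    return loss


def build_experiment(mongo_uri, db_name="sacred"):
    """Create a Sacred experiment whose runs are stored in MongoDB."""
    # Imported lazily so the training loop above has no Sacred dependency.
    from sacred import Experiment
    from sacred.observers import MongoObserver

    ex = Experiment("matsulm_sketch")  # hypothetical experiment name
    ex.observers.append(MongoObserver(url=mongo_uri, db_name=db_name))
    ex.add_config(epochs=3)  # stored with the run as a hyperparameter

    @ex.main
    def main(epochs, _run):
        # _run.log_scalar(name, value, step) records a metric for this run.
        return train(_run.log_scalar, epochs)

    return ex


# Usage (needs Sacred installed and a reachable MongoDB):
# build_experiment("mongodb://<username>:<password>@<host>/<database>").run()
```

Keeping the training loop independent of Sacred means the same loop can be exercised with any callback, for example one that appends to a list, before a database is involved.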
When the results are in MongoDB Atlas and you want to make your life even easier by accessing the training results anytime and anywhere, you can run Omniboard on a server as a website. Here are the instructions for doing that:
First, create an Nginx proxy on your server to control the traffic (simple instruction here). After you have gone through the instructions and filled in docker-compose-server.yml (USERNAME, PASSWORD, and MONGO_URL), download this repo to your server and run the following command in its folder:
$ make server
Then just enjoy your training results anywhere :D
P.S. If your Omniboard website does not work right away, be patient: it might take a few minutes to generate the HTTPS certificate.
This repository contains the code used for the MatsuLM master's thesis. If you use this code or its results in your research, please cite as appropriate:
@article{nybergMatsulm,
title={{MatsuLM neural language modeling toolkit}},
author={Nyberg, Riko},
year={2020}
}