Lumi-supercomputer/LUMI-AI-example

LUMI-AI-example

This project is still a work in progress and changes are made constantly. For well-tested examples, have a look at the LUMI AI workshop material: https://github.com/Lumi-supercomputer/Getting_Started_with_AI_workshop

Visual transformer model in PyTorch, serving as an example of how to run AI applications on LUMI.

We use the torchvision vit_b_16 model and train it on the tiny-imagenet dataset. This project is meant to provide a sandbox for testing and benchmarking AI applications on LUMI and should eventually serve as an A-Z example in the LUMI AI documentation. Use bigger models and larger datasets if required.

HDF5 support

The imagenet dataset consists of hundreds of thousands of individual JPEG files. To avoid the "many small files" problem, the datasets can be transformed into a single HDF5 file with the script turn_into_hdf5.py. Note that this increases the size of the data by roughly one order of magnitude, since the script does not compress the data in any way.
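The size increase follows from storing decoded pixels instead of JPEG-compressed bytes. The following is a rough back-of-the-envelope sketch, not taken from turn_into_hdf5.py. It assumes the tiny-imagenet training split (100,000 RGB images of 64x64 pixels) stored as uncompressed 8-bit values. The function names and the average JPEG size are hypothetical placeholders; measure your own files for a real estimate.

```python
def raw_size_bytes(n_images: int, height: int, width: int,
                   channels: int = 3, bytes_per_value: int = 1) -> int:
    """Size of the decoded pixel data when stored without compression."""
    return n_images * height * width * channels * bytes_per_value


def expansion_factor(raw_bytes_per_image: int, jpeg_bytes_per_image: int) -> float:
    """How many times larger one decoded image is than its JPEG file."""
    return raw_bytes_per_image / jpeg_bytes_per_image


# Assumption: tiny-imagenet training split, 100,000 images of 64x64x3 uint8.
total = raw_size_bytes(100_000, 64, 64)
print(f"uncompressed pixel data: {total / 1e9:.2f} GB")

# Hypothetical average JPEG size in bytes; NOT a measured value.
assumed_jpeg_bytes = 1200
per_image = raw_size_bytes(1, 64, 64)
print(f"expansion factor: {expansion_factor(per_image, assumed_jpeg_bytes):.1f}x")
```

If the images are resized before being written (for example to a larger model input resolution), the uncompressed size grows with the pixel count, so the same function can be reused with the new height and width.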

Running script on LUMI

This GitHub repository is cloned to /project/project_462000002/LUMI-AI-example. The training data, validation data, and model parameters are stored in the same directory. Because h5py is not included in the container, the container is extended with a virtual environment, as described here. The training and validation datasets are also uploaded to the lumi-o:imagenet/ bucket. Anyone is welcome to work in that directory to minimize data storage, but please create a new branch.

Building website

Install the required dependencies:

pip install -r requirements.txt

Edit with live preview

Run:

mkdocs serve

This command starts a live-reloading local web server that can be accessed in a web browser at http://127.0.0.1:8000. The local web server automatically re-renders and reloads the site when you edit the documentation.

Generate the static site

To build a self-contained directory containing the full website, run:

mkdocs build

The generated files will be located in the site/ directory.
