Releases: pgagarinov/pytorch-hyperlight
Standard release
Added
- a Jupyter notebook example that compares Vision Transformer and EfficientNet for image classification (facial images, gender prediction)
- NSTImageOrPathDataLoader capable of transparently downloading images from 3 different sources: S3, HTTP and the local filesystem
- plain_simple_nst.ipynb example capable of downloading both content and style images from S3, via HTTP or from local paths, and of uploading the styled image back to an S3 or local path
- a few auxiliary functions that make it possible to download images from S3/HTTP/local paths and upload them to S3/local paths
- image_loading_sync_sizes.ipynb example demonstrating image loading with automatic cropping and resizing for Neural Style Transfer
- boto3, validators, pytorch, torchvision dependencies for pytorch_hyperlight
- boto3 dependency for jupyter-mldev
- missing license headers to the newly added files
- execute permissions for the shell scripts for building and uploading the pip package
- a proper CHANGELOG.md
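The three download sources handled by the loader can be told apart by URL scheme. A minimal sketch of such dispatch logic, assuming the library routes on the scheme; the function name `classify_source` is hypothetical, not part of the library:

```python
from urllib.parse import urlparse

def classify_source(path_or_url):
    """Decide whether an image reference points to S3, HTTP(S)
    or the local filesystem, based on the URL scheme."""
    scheme = urlparse(str(path_or_url)).scheme
    if scheme == "s3":
        return "s3"          # e.g. s3://bucket/key.jpg, handled via boto3
    if scheme in ("http", "https"):
        return "http"        # fetched via an HTTP client such as requests
    return "local"           # plain paths fall through to the filesystem
```
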
Changed
- multi_style_nst.ipynb is made even more compact by using the higher-level NST DataLoader capable of downloading images directly from both HTTP and S3 URLs
Fixed
- style images were not resized correctly when they were larger than content images
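The resizing fix above concerns matching style image dimensions to the content image. A hedged sketch of one way to compute the target size (illustrative only, not the library's actual code): scale the style image down so that neither dimension exceeds the content image, preserving aspect ratio.

```python
def fit_style_size(style_wh, content_wh):
    """Return a (width, height) for the style image that fits inside
    the content image while preserving the style image's aspect ratio.
    Style images that already fit are left untouched."""
    sw, sh = style_wh
    cw, ch = content_wh
    scale = min(cw / sw, ch / sh, 1.0)  # never upscale
    return (round(sw * scale), round(sh * scale))
```
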
Reusable Neural style transfer is made an integral part of the framework
Overview
The major changes are:
PyTorch-Hyperlight
- re-usable Neural Style Transfer code is moved to the nst package
- added a few new auxiliary functions for loading and showing images, and for downloading images (as well as arbitrary objects) from a list of URLs
Details
PyTorch-Hyperlight
Added
- a new higher-level image file data loader class for NST called "NSTImageFileDataLoader" is placed into the tasks.nst package
- new utility functions for loading and showing image tensors in the utils.image_utils package and a download_urls function in utils.request_utils
- requests and pillow are now dependencies for both pytorch_hyperlight and the JupyterLab-based MLDev environment
- n_steps_per_epoch parameter in hparams calculated by Runner class and provided to PyTorch Lightning modules automatically
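The n_steps_per_epoch value can be derived from the dataset and batch size. A minimal sketch of the likely calculation (an assumption, not the Runner's verbatim code):

```python
import math

def n_steps_per_epoch(n_train_samples, batch_size, drop_last=False):
    """Number of optimizer steps per epoch: how many batches the
    training DataLoader yields for the given dataset size."""
    if drop_last:
        return n_train_samples // batch_size  # incomplete last batch dropped
    return math.ceil(n_train_samples / batch_size)
```
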
Changed:
- multi_style_nst.ipynb is made much more lightweight as it now uses the classes from the nst package
- the minimal pytorch-lightning version is set to 1.1.5 in requirements.txt (this version introduces important fixes to the progress bar)
JupyterLab-based MLDev environment
Added
- new dependencies: requests, pillow, kaggle cli and pytorch-pretrained-vit
Metrics plotting is integrated into the training/validation/testing progress bar, new Multi-style NST example
Overview
The major changes are:
PyTorch-Hyperlight
- The progress bar displayed by Runner now features integrated metrics plotting, which makes it unnecessary to call show_report() by hand once the training/validation loop is finished
- Text logs with current metrics values are only displayed when PyTorch-Hyperlight is used from the console; within JupyterLab the metrics log messages are replaced with the dynamic plotting integrated into the progress bar
- Added a multi-style neural style transfer example
Details
PyTorch-Hyperlight
Added
- The progress bar displayed by Runner now features integrated metrics plotting, which makes it unnecessary to call show_report() by hand once the training/validation loop is finished
- Added a multi-style neural style transfer example
- Added a notebook that runs all the examples. Useful for automatically re-running the examples after major changes.
Changed:
- Text logs with current metrics values are only displayed when PyTorch-Hyperlight is used from the console. Within JupyterLab the metrics log messages are replaced with the dynamic plotting integrated into the progress bar
- The happy path test for the example notebooks now scans the examples folder and runs all the notebooks it finds (except for those whose names start with "_")
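The scanning rule for the happy path test can be sketched as a simple filter over the file names found in the examples folder; `select_example_notebooks` is a hypothetical helper, not the test's actual code:

```python
def select_example_notebooks(file_names):
    """Given file names found in the examples folder, keep the
    notebooks to run: *.ipynb files whose names do not start with "_"."""
    return sorted(
        name for name in file_names
        if name.endswith(".ipynb") and not name.startswith("_")
    )
```
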
Fixed
- Training and validation metrics were displayed inside the dataframes even on the test and revalidation stages, which sometimes broke the plotting as well
JupyterLab-based MLDev environment
Fixed
- Some of the scripts contained hard-coded paths
- The post-installation checks were not robust enough
Comparison reports integrated into Runner, re-usable classification task, the new MLDev environment tool, switch to Jupyter 3.0.1
Overview
The major changes are:
PyTorch-Hyperlight
- Runner provides get_metrics and show_metric_report methods showing both time-series and last observed metrics for each run.
- Reusable AClassificationTask PyTorch-Lightning module to be used as a base class for classification tasks. The only abstract method that is not defined in AClassificationTask is configure_optimizers.
- More specific ClassificationTaskAdamStepLR (Adam optimizer + StepLR scheduler) and ClassificationTaskAdamWWarmup (AdamW optimizer + LinearWarmup scheduler from the transformers library) tasks.
- Semantic segmentation and model comparison examples.
JupyterLab-based MLDev environment
- Switch to PyTorch 1.7.1 and CUDA 11.
- Switch to JupyterLab 3.0.1.
- The new envtool CLI tool for managing the environment.
Details
PyTorch-Hyperlight
Added
- The Runner class accumulates all time-series metrics from all subsequent calls of run_single_trial and run_hyperopt. The metrics can be accessed at any point via the get_metrics method or displayed via the show_metric_report method. The MNIST model comparison notebook provides the usage examples. The old behavior without any metrics collection is still available via the BaseRunner class. The names of runs (during metrics accumulation) are either generated automatically (based on the class names of PyTorch-Lightning modules) or can be specified either explicitly or partially (via suffixes). See the notebook example above for details.
- The AClassificationTask class is designed in a flexible and re-usable way to serve as a base for PyTorch Lightning modules for classification problems. By default AClassificationTask calculates a set of classification metrics (f1, accuracy, precision and recall) but can be extended by inheriting from it if needed. The only abstract method that is not defined in AClassificationTask is configure_optimizers.
- More specific ClassificationTaskAdamStepLR (Adam optimizer + StepLR scheduler) and ClassificationTaskAdamWWarmup (AdamW optimizer + LinearWarmup scheduler from the transformers library) tasks.
- Dependency on the transformers library.
- Tests for the new notebook examples.
- Tests for the grace period and the subsequent runs for the same PyTorch-Lightning modules.
- 'stage-list' field in metrics dataframes (TrialMetrics.df for instance).
- Integrated self-checks in run_hyperopt and run_single_trial that make sure that the metrics returned by PyTorch-Lightning Trainer's methods fit and test are the same as those accumulated via callbacks by PyTorch-Hyperlight.
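Run naming as described in this release (auto-generated from PyTorch-Lightning module class names, with optional user-specified suffixes) might look roughly like the sketch below. The helper name and the de-duplication against previously used names are assumptions, not the library's actual code:

```python
def make_run_name(module_class, suffix=None, existing=()):
    """Build a run name from a PyTorch-Lightning module class name,
    optionally appending a user suffix, and de-duplicating against
    names already used by previous runs (assumed behavior)."""
    base = module_class.__name__
    if suffix:
        base = f"{base}_{suffix}"
    name, i = base, 1
    while name in existing:
        name = f"{base}_{i}"
        i += 1
    return name

class MNISTClassifier:  # stand-in for a pl.LightningModule subclass
    pass
```
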
Changed:
- run_hyperopt method now returns TrialMetrics similarly to run_single_trial method.
- TrialMetrics class is refactored to accept only a single Pandas DataFrame with time series metrics in the constructor.
- Dependency on the PyTorch-Lightning version is relaxed (1.1.* now)
- Jupyter notebook examples are better documented and cleaner now.
Fixed
- 'grace_period' parameter in CONFIG is not recognized by the run_single_trial method
- Paths to the MNIST dataset are hardcoded in all Jupyter notebook examples
JupyterLab-based MLDev environment
- Switch to PyTorch 1.7.1 and CUDA 11.
- Switch to JupyterLab 3.0.1.
- The new envtool CLI tool for managing the environment. Usage examples:

```shell
# strip conda package versions for all packages but pytorch and torchvision,
# use ./mldevenv_conda_requirements.yml as a source and put the resulting file to ./out.yml
mlenvtool conda_env_yaml_transform versions_strip ./mldevenv_conda_requirements.yml ./out.yml --except_package_list 'pytorch' 'torchvision'

# replace `==` with `>=` for all packages except for `pytorch` and `torchvision`
mlenvtool conda_env_yaml_transform versions_eq2ge ./mldevenv_conda_requirements.yml ./out.yml --except_package_list 'pytorch' 'torchvision'

# update all conda packages and JupyterLab extensions in the current conda environment
mlenvtool conda_env_update all

# see help
mlenvtool -h

# see help for the `conda_env_update` command
mlenvtool conda_env_update -h
```

- The ./install_all.sh script now uses mlenvtool and prints cleaner status messages.
- Almost all Python dependencies have been updated.
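The versions_eq2ge transform from the usage examples can be illustrated with a small sketch operating on conda-style dependency strings; this is a simplification for illustration, not the tool's actual YAML-processing code:

```python
def versions_eq2ge(deps, except_package_list=()):
    """Replace `==` version pins with `>=` lower bounds for every
    dependency except the listed packages, which stay pinned."""
    result = []
    for dep in deps:
        name = dep.split("==", 1)[0]
        if "==" in dep and name not in except_package_list:
            dep = dep.replace("==", ">=", 1)
        result.append(dep)
    return result
```
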
0.1.3
- The "ptl_trainer_grace_period" parameter has been merged with the "grace_period" parameter, which is now responsible for both the Ray Tune scheduler's early stopping patience and the PyTorch Lightning Trainer's early stopping patience
- PyTorch Lightning version has been bumped up to 1.1.2
Improvements to plotting capabilities of TrialMetrics class
The TrialMetrics.plot method now uses different styles for train and non-train stage metrics. Different groups of metric graphs use different markers.
Minor fixes
Fixed: Ray Tune is missing in requirements.txt
Fixed: minor cosmetic issues in comments and README.md
The very first release
PyTorch Hyperlight key principles
- No wheel reinvention. Parts of PyTorch Lightning and Ray Tune that already provide simple enough interfaces are used as-is. PyTorch Hyperlight just makes those frameworks easier to use by minimizing the amount of boilerplate code.
- Opinionated approach that doesn't try to be flexible for every possible task. Instead, PyTorch Hyperlight tries to address fewer use cases better by providing pre-configured integrations and functionality out of the box.
- Minimalistic user-facing API that allows you to do research using a single Runner class that is similar to PyTorch Lightning's Trainer but is a higher-level abstraction. 
- Expect both model and data as definitions, not as data. This is done to minimize problems with Ray Tune, which runs trials in separate processes. For
  - training/validation data this means that the Hyperlight API expects a user to provide a function that returns DataLoaders, not ready-to-use DataLoaders. Of course you can attach data to your functions with functools.partial, but this is not recommended.
  - the model it means that the Hyperlight API (namely Runner's methods) expects a user to provide a class defining the model, not the model itself.
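The "data as definitions" rule above means passing a factory function rather than DataLoader objects. A hedged sketch of such a factory, with plain lists standing in for real datasets and DataLoaders; the exact return-value keys are assumptions, while the "batch_size"/"n_workers" signature follows the Assumptions section of these notes:

```python
from functools import partial

def create_dataloaders(batch_size, n_workers=2):
    """Factory passed to the Runner instead of ready-made DataLoaders,
    so each Ray Tune worker process can build its own copies.
    Lists stand in for torch DataLoader objects in this sketch."""
    train = [("img", "label")] * 8   # placeholder dataset
    val = [("img", "label")] * 2
    return {"train_loader": train, "val_loader": val,
            "batch_size": batch_size, "n_workers": n_workers}

# attaching extra arguments with functools.partial is possible but
# discouraged, since any bound data must be serialized to every worker
loader_factory = partial(create_dataloaders, n_workers=4)
```
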
Features
- All hyper-parameters are defined in a single dictionary.
- Integrated plotting of training, validation and testing stage metrics.
- Pre-configured integration with Ray Tune for ASHA scheduler and HyperOpt optimization algorithm for out of the box hyper-parameter tuning.
- Logging the training progress to the console (via the tabulate library)
- Pre-configured integration with WandB that works for both single runs and hyper-parameter optimization runs (via Ray Tune)
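The single hyper-parameter dictionary mentioned in the features might look like the sketch below. All key names are illustrative assumptions, except "grace_period" and "batch_size", which appear elsewhere in these release notes:

```python
# illustrative only: not the library's required schema
CONFIG = {
    "lr": 1e-3,          # optimizer learning rate
    "batch_size": 32,    # passed to the DataLoader factory
    "max_epochs": 10,
    "grace_period": 4,   # early stopping patience (Ray Tune + PL Trainer)
}
```
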
Assumptions
Like most opinionated frameworks, PyTorch Hyperlight makes a few assumptions about the way you organize your code:
- You are familiar with PyTorch-Lightning; if not, refer to PyTorch Lightning's awesome documentation.
- Runner
- Metrics that you log from your PyTorch Lightning module should have pre-defined prefixes and suffixes:
  - "val", "train" or "test" ("val_f1_step" for example) as a prefix
  - "epoch" or "step" ("train_f1_epoch" for example) as a suffix
- DataLoaders should be returned by a function as a dictionary. The function should have "batch_size" as a regular parameter and "n_workers" as a keyword parameter. The reason PyTorch Hyperlight doesn't rely on LightningDataModule from PyTorch Lightning is that a LightningDataModule might contain data that would have to be serialized in the master process and de-serialized in each Ray Tune worker (the workers are responsible for running hyper-parameter search trials).
- The WandB API key should be in the plain text file ~/.wandb_api_key
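The metric naming convention above can be checked mechanically. A small sketch of such a validator (the validator itself is not part of the library):

```python
STAGES = ("train", "val", "test")
SUFFIXES = ("epoch", "step")

def is_valid_metric_name(name):
    """True if a logged metric name follows the convention: a stage
    prefix and an "epoch"/"step" suffix joined by underscores,
    e.g. "val_f1_step" or "train_f1_epoch"."""
    parts = name.split("_")
    return len(parts) >= 3 and parts[0] in STAGES and parts[-1] in SUFFIXES
```
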