Merge branch 'main' into docker/cudnn
Borda authored Mar 26, 2024
2 parents fac798d + 79af2e3 commit 5bb2f09
Showing 19 changed files with 273 additions and 1,685 deletions.
52 changes: 46 additions & 6 deletions .github/workflows/docs-build.yml
@@ -15,11 +15,51 @@ defaults:

jobs:
  docs-make:
    uses: Lightning-AI/utilities/.github/workflows/check-docs.yml@v0.11.0
    with:
      python-version: "3.10"
      requirements-file: "requirements/docs.txt"
      install-tex: true
    if: github.event.pull_request.draft == false
    runs-on: ubuntu-22.04
    strategy:
      fail-fast: false
      matrix:
        target: ["html", "doctest", "linkcheck"]
    env:
      ARTIFACT_DAYS: 0
      PYPI_LOCAL_DIR: "pypi_pkgs/"
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"

      - name: Pull sphinx template
        run: |
          pip install -q "awscli >=1.30.0"
          aws s3 sync --no-sign-request s3://sphinx-packages/ ${PYPI_LOCAL_DIR}
          pip install lai-sphinx-theme -U -f ${PYPI_LOCAL_DIR}
      - name: Install pandoc
        timeout-minutes: 5
        run: |
          sudo apt-get update --fix-missing
          sudo apt-get install -y pandoc
      - name: Install package & dependencies
        timeout-minutes: 20
        run: pip install . -U -r requirements/docs.txt

      - name: Make ${{ matrix.target }}
        working-directory: docs/
        # allow failing link check and doctest if you run with dispatch
        continue-on-error: ${{ matrix.target == 'doctest' || matrix.target == 'linkcheck' }}
        run: make ${{ matrix.target }} --debug --jobs $(nproc) SPHINXOPTS="-W --keep-going"

      - name: Keep artifact
        if: github.event_name == 'pull_request'
        run: echo "ARTIFACT_DAYS=7" >> $GITHUB_ENV
      - name: Upload built docs
        if: ${{ matrix.target == 'html' }}
        uses: actions/upload-artifact@v4
        with:
          name: docs-html-${{ github.sha }}
          path: docs/build/html/
          retention-days: ${{ env.ARTIFACT_DAYS }}

  deploy-docs:
    needs: docs-make
@@ -28,7 +68,7 @@ jobs:
    env:
      GCP_TARGET: "gs://lightning-docs-thunder"
    steps:
      - uses: actions/download-artifact@v3
      - uses: actions/download-artifact@v4
        with:
          name: docs-html-${{ github.sha }}
          path: docs/build/
10 changes: 5 additions & 5 deletions .pre-commit-config.yaml
@@ -8,7 +8,7 @@ ci:

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    rev: v4.5.0
    hooks:
      - id: end-of-file-fixer
      - id: trailing-whitespace
@@ -24,22 +24,22 @@ repos:
      - id: detect-private-key

  - repo: https://github.com/asottile/pyupgrade
    rev: v3.11.1
    rev: v3.15.2
    hooks:
      - id: pyupgrade
        args: ["--py310-plus"]
        name: Upgrade code
        exclude: "examples|thunder/tests/test_interpreter.py|thunder/tests/test_jit_general.py"

  - repo: https://github.com/codespell-project/codespell
    rev: v2.2.5
    rev: v2.2.6
    hooks:
      - id: codespell
        additional_dependencies: [tomli]
        #args: ["--write-changes"] # uncomment if you want to get automatic fixing

  - repo: https://github.com/psf/black
    rev: 23.9.1
    rev: 24.3.0
    hooks:
      - id: black
        name: Black code
@@ -61,7 +61,7 @@ repos:
      - id: sphinx-lint

  - repo: https://github.com/asottile/yesqa
    rev: v1.4.0
    rev: v1.5.0
    hooks:
      - id: yesqa

8 changes: 7 additions & 1 deletion Makefile
@@ -10,7 +10,13 @@ test: clean
	python -m coverage run --source thunder -m pytest thunder tests -v
	python -m coverage report

docs: clean
sphinx-theme:
	pip install -q awscli
	mkdir -p dist/
	aws s3 sync --no-sign-request s3://sphinx-packages/ dist/
	pip install lai-sphinx-theme -f dist/

docs: clean sphinx-theme
	pip install -e . --quiet -r requirements/docs.txt -f https://download.pytorch.org/whl/cpu/torch_stable.html
	cd docs ; python -m sphinx -b html -W --keep-going source build

116 changes: 81 additions & 35 deletions README.md
@@ -14,8 +14,9 @@ ______________________________________________________________________
<a href="#get-started">Get started</a> •
<a href="#install-thunder">Install</a> •
<a href="#hello-world">Examples</a> •
<a href="#features">Features</a> •
<a href="#documentation">Documentation</a> •
<a href="#inside-thunder-a-brief-look-at-the-core-features">Inside Thunder</a> •
<a href="#get-involved">Get involved!</a> •
<a href="#documentation">Documentation</a>
</p>

[![license](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/Lightning-AI/lightning-thunder/blob/main/LICENSE)
@@ -30,41 +31,58 @@ ______________________________________________________________________

**Thunder makes PyTorch models Lightning fast.**

Thunder is a source-to-source compiler for PyTorch. It makes PyTorch programs faster by combining and using different hardware executors at once (ie: nvFuser, torch.compile, cuDNN, and TransformerEngine FP8).
Thunder is a source-to-source compiler for PyTorch. It makes PyTorch programs faster by combining and using different hardware executors at once (for instance, [nvFuser](https://github.com/NVIDIA/Fuser), [torch.compile](https://pytorch.org/docs/stable/torch.compiler.html), [cuDNN](https://developer.nvidia.com/cudnn), and [TransformerEngine FP8](https://github.com/NVIDIA/TransformerEngine)).

Works on single accelerators and in multi-GPU settings.
It supports both single and multi-GPU configurations.
Thunder aims to be usable, understandable, and extensible.

## Performance
&#160;

Thunder can achieve significant speedups over standard PyTorch eager code, through the compounding effects of optimizations and the use of best-in-class executors. Here is an example of the pretraining throughput for Llama 2 7B as implemented in [LitGPT](https://github.com/Lightning-AI/litgpt).
> \[!Note\]
> Lightning Thunder is in alpha. Feel free to get involved, but expect a few bumps along the way.
&#160;

## Single-GPU performance

Thunder can achieve significant speedups over standard non-compiled PyTorch code ("PyTorch eager"), through the compounding effects of optimizations and the use of best-in-class executors. The figure below shows the pretraining throughput for Llama 2 7B as implemented in [LitGPT](https://github.com/Lightning-AI/litgpt).

<div align="center">
<img alt="Thunder" src="docs/source/_static/images/training_throughput_single.png" width="800px" style="max-width: 100%;">
</div>

Thunder achieves a 40% speedup in training throughput compared to eager code on H100 using a combination of executors including nvFuser, torch.compile, cuDNN, and TransformerEngine FP8.
As shown in the plot above, Thunder achieves a 40% speedup in training throughput compared to eager code on H100 using a combination of executors including nvFuser, torch.compile, cuDNN, and TransformerEngine FP8.

&#160;

Thunder supports distributed strategies like DDP and FSDP (ZeRO2 and ZeRO3). Here is the normalized throughput measured for Llama 2 7B (this time without FP8 mixed precision, support for FSDP is underway).
## Multi-GPU performance

Thunder also supports distributed strategies such as DDP and FSDP for training models on multiple GPUs. The following plot displays the normalized throughput measured for Llama 2 7B without FP8 mixed precision; FP8 support for FSDP is in progress.

<div align="center">
<img alt="Thunder" src="docs/source/_static/images/normalized_training_throughput_zero2.png" width="800px" style="max-width: 100%;">
</div>

**NOTE: Lightning Thunder is alpha.** Feel free to get involved, expect a few bumps along the way.
&#160;

## Get started

Try Thunder without installing by using our [Zero to Thunder Tutorial Studio](https://lightning.ai/lightning-ai/studios/zero-to-thunder-tutorial).
The easiest way to get started with Thunder, requiring no extra installations or setups, is by using our [Zero to Thunder Tutorial Studio](https://lightning.ai/lightning-ai/studios/zero-to-thunder-tutorial).

&#160;

## Install Thunder

Install [nvFuser](https://github.com/NVIDIA/Fuser) nightly, and Thunder together
To use Thunder on your local machine, first install [nvFuser](https://github.com/NVIDIA/Fuser) nightly and PyTorch nightly together as follows:

```bash
# install nvFuser which installs the matching nightly PyTorch
pip install --pre 'nvfuser-cu121[torch]' --extra-index-url https://pypi.nvidia.com
```

Then, install Thunder as follows:

```bash
# install thunder
pip install lightning-thunder
```
@@ -73,26 +91,60 @@ pip install lightning-thunder
<summary>Advanced install options</summary>
<!-- following section will be skipped from PyPI description -->

&#160;

### Install from main

Alternatively, you can install the latest version of Thunder directly from this GitHub repository as follows:

```bash
# 1) Install nvFuser and PyTorch nightly dependencies:
pip install --pre 'nvfuser-cu121[torch]' --extra-index-url https://pypi.nvidia.com
```

```bash
# 2) Install Thunder itself
pip install git+https://github.com/Lightning-AI/lightning-thunder.git
```

&#160;

### Install to tinker and contribute

Install this way to tinker with the internals and contribute:
If you are interested in tinkering with and contributing to Thunder, we recommend cloning the Thunder repository and installing it in pip's editable mode:

```bash
git clone https://github.com/Lightning-AI/lightning-thunder.git
cd lightning-thunder
pip install -e .
```

&#160;

### Develop and run tests

After cloning the lightning-thunder repository and installing it as an editable package as explained above, you can set up your environment for developing Thunder by installing the development requirements:

```bash
pip install -r requirements/devel.txt
```

Now you can run the tests:

```bash
pytest thunder/tests
```

Thunder is very thoroughly tested, so expect this to take a while.

</details>
<!-- end skipping PyPI description -->

&#160;

## Hello World

Here is a simple example of how Thunder lets you compile and run PyTorch code:
Below is a simple example of how Thunder allows you to compile and run PyTorch code:

```python
import torch
@@ -120,15 +172,19 @@ print(result)

The compiled function `jfoo` takes and returns PyTorch tensors, just like the original function, so modules and functions compiled by Thunder can be used as part of larger PyTorch programs.

&#160;

## Train models

Thunder is in its early stages and should not be used for production runs yet.

However, it can already deliver outstanding performance on LLM model supported by [LitGPT](https://github.com/Lightning-AI/lit-gpt), such as Mistral, Llama 2, Gemma, Falcon, and others.
However, it can already deliver outstanding performance for pretraining and finetuning LLMs supported by [LitGPT](https://github.com/Lightning-AI/lit-gpt), such as Mistral, Llama 2, Gemma, Falcon, and others.

Check out [the LitGPT integration](https://github.com/Lightning-AI/litgpt/tree/main/extensions/thunder) to learn about running LitGPT and Thunder together.

## Features
&#160;

## Inside Thunder: A brief look at the core features

Given a Python callable or PyTorch module, Thunder can generate an optimized program that:

@@ -140,13 +196,13 @@ Given a Python callable or PyTorch module, Thunder can generate an optimized pro
To do so, Thunder ships with:

- A JIT for acquiring Python programs targeting PyTorch and custom operations
- A multi-level IR to represent operations as a trace of a reduced op-set
- An extensible set of transformations on the trace, such as `grad`, fusions, distributed (like `ddp`, `fsdp`), functional (like `vmap`, `vjp`, `jvp`)
- A multi-level intermediate representation (IR) to represent operations as a trace of a reduced operation set
- An extensible set of transformations on the trace of a computational graph, such as `grad`, fusions, distributed (like `ddp`, `fsdp`), functional (like `vmap`, `vjp`, `jvp`)
- A way to dispatch operations to an extensible collection of executors
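
The last bullet, dispatching operations to an extensible collection of executors, can be pictured with a toy sketch. This is not Thunder's API: the `Executor` class, the `dispatch` function, and the executor names `fast` and `fallback` are hypothetical and exist only for illustration. The sketch assumes a simple rule where executors are tried in priority order and the first one that implements an operation claims it.

```python
# Toy illustration only: every name here is hypothetical, not Thunder's API.
import math


class Executor:
    """A named collection of operation implementations."""

    def __init__(self, name):
        self.name = name
        self.impls = {}

    def register(self, op_name, fn):
        self.impls[op_name] = fn

    def can_execute(self, op_name):
        return op_name in self.impls


def dispatch(op_name, executors):
    """Return (executor name, implementation) from the first executor that claims the op."""
    for ex in executors:
        if ex.can_execute(op_name):
            return ex.name, ex.impls[op_name]
    raise NotImplementedError(f"no executor implements {op_name!r}")


# A specialised executor that only knows a few operations.
fast = Executor("fast")
fast.register("hypot", math.hypot)

# A general executor that covers every operation used here.
fallback = Executor("fallback")
fallback.register("hypot", lambda x, y: (x * x + y * y) ** 0.5)
fallback.register("add", lambda x, y: x + y)

# Executors are consulted in priority order.
priority = [fast, fallback]
chosen = {op: dispatch(op, priority)[0] for op in ("hypot", "add")}
print(chosen)
```

In this sketch, adding a new executor only means registering its implementations and placing it in the priority list; the dispatch logic itself does not change.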

Thunder is written entirely in Python. Even its trace is represented as valid Python at all stages of transformation. This allows unprecedented levels of introspection and extensibility.
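
To make the idea of a trace that is itself valid Python more concrete, here is a minimal, self-contained sketch. It is not Thunder's tracer or IR: the `Proxy` class, the `trace_to_source` helper, and the `add`/`mul` operation names are assumptions made up for this illustration. The sketch records the operations a function performs on placeholder values and emits them as Python source that can be printed, inspected, and executed.

```python
# Toy illustration only: this is NOT Thunder's tracer or IR.
import operator


class Proxy:
    """Stands in for a value and records every operation applied to it."""

    def __init__(self, name, trace):
        self.name = name
        self.trace = trace

    def _record(self, op, other):
        out = Proxy(f"t{len(self.trace)}", self.trace)
        rhs = other.name if isinstance(other, Proxy) else repr(other)
        self.trace.append(f"{out.name} = {op}({self.name}, {rhs})")
        return out

    def __add__(self, other):
        return self._record("add", other)

    def __mul__(self, other):
        return self._record("mul", other)


def trace_to_source(fn, arg_names):
    """Run fn on proxies and return the recorded trace as Python source."""
    trace = []
    result = fn(*(Proxy(name, trace) for name in arg_names))
    body = "\n".join(f"    {line}" for line in trace)
    return f"def traced({', '.join(arg_names)}):\n{body}\n    return {result.name}\n"


def foo(a, b):
    return (a + b) * 2


source = trace_to_source(foo, ["a", "b"])
print(source)

# Because the trace is ordinary Python, it can be executed directly.
namespace = {"add": operator.add, "mul": operator.mul}
exec(source, namespace)
traced = namespace["traced"]
```

Since the recorded program is plain text in the host language, a transformation in this toy setting could be written as a function that rewrites the list of trace lines before the source is assembled.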

Thunder doesn't generate code for accelerators directly. It acquires and transforms user programs so that it's possible to optimally select or generate device code using fast executors like:
Thunder doesn't generate code for accelerators, such as GPUs, directly. It acquires and transforms user programs so that it's possible to optimally select or generate device code using fast executors like:

- [torch.compile](https://pytorch.org/get-started/pytorch-2.0/)
- [nvFuser](https://github.com/NVIDIA/Fuser)
@@ -159,6 +215,8 @@ Thunder doesn't generate code for accelerators directly. It acquires and transfo

Modules and functions compiled with Thunder fully interoperate with vanilla PyTorch and support PyTorch's autograd. Also, Thunder works alongside torch.compile to leverage its state-of-the-art optimizations.

&#160;

## Documentation

Docs are currently not hosted publicly. However, you can build them locally quickly:
@@ -169,27 +227,15 @@ make docs

and point your browser to the generated docs at `docs/build/index.html`.

## Develop and run tests

You can set up your environment for developing Thunder by installing the development requirements:

```bash
pip install -r requirements/devel.txt
```
&#160;

Install Thunder as an editable package (optional):

```bash
pip install -e .
```
## Get involved!

Now you can run the tests:
We appreciate your feedback and contributions. If you have feature requests, questions, or want to contribute code or config files, please don't hesitate to use the [GitHub Issue](https://github.com/Lightning-AI/lightning-thunder/issues) tracker.

```bash
pytest thunder/tests
```
We welcome all individual contributors, regardless of their level of experience or hardware. Your contributions are valuable, and we are excited to see what you can accomplish in this collaborative and supportive environment.

Thunder is very thoroughly tested, so expect this to take a while.
&#160;

## License

9 changes: 4 additions & 5 deletions docs/source/conf.py
@@ -11,15 +11,13 @@
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.

import glob
import inspect
import os
import re
import shutil
import sys
from importlib.util import module_from_spec, spec_from_file_location

import pt_lightning_sphinx_theme
import lai_sphinx_theme

_PATH_HERE = os.path.abspath(os.path.dirname(__file__))
_PATH_ROOT = os.path.realpath(os.path.join(_PATH_HERE, "..", ".."))
@@ -99,6 +97,7 @@ def _transform_changelog(path_in: str, path_out: str) -> None:
    "sphinx_copybutton",
    "sphinx_paramlinks",
    "sphinx_togglebutton",
    "lai_sphinx_theme.extensions.lightning",
]

# Add any paths that contain templates here, relative to this directory.
@@ -152,8 +151,8 @@ def _transform_changelog(path_in: str, path_out: str) -> None:
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = "pt_lightning_sphinx_theme"
html_theme_path = [pt_lightning_sphinx_theme.get_html_theme_path()]
html_theme = "lai_sphinx_theme"
html_theme_path = [lai_sphinx_theme.get_html_theme_path()]

# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the