TransformerEngine - Intermediate tensor sharding #1025

Triggered via pull request July 5, 2024 10:23
Status Success
Total duration 22s
Artifacts

auto-cc.yml

on: pull_request

Annotations

2 errors and 1 warning
auto-cc
Resource not accessible by integration { name: 'HttpError', id: '9806863981', status: 403, response: { url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/pulls/695', status: 403, headers: { 'access-control-allow-origin': '*', 'access-control-expose-headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Resource, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, X-GitHub-SSO, X-GitHub-Request-Id, Deprecation, Sunset', connection: 'close', 'content-encoding': 'gzip', 'content-security-policy': "default-src 'none'", 'content-type': 'application/json; charset=utf-8', date: 'Fri, 05 Jul 2024 10:24:17 GMT', 'referrer-policy': 'origin-when-cross-origin, strict-origin-when-cross-origin', server: 'istio-envoy', 'strict-transport-security': 'max-age=31536000; includeSubdomains; preload', 'transfer-encoding': 'chunked', vary: 'Accept-Encoding, Accept, X-Requested-With', 'x-accepted-github-permissions': 'pull_requests=write', 'x-content-type-options': 'nosniff', 'x-envoy-decorator-operation': 'unicorn-api.github-production.svc.cluster.local:80/*', 'x-envoy-upstream-service-time': '82', 'x-frame-options': 'deny', 'x-github-api-version-selected': '2022-11-28', 'x-github-media-type': 'github.v3; format=json', 'x-github-request-id': '3003:BC5FA:3579A2D:61542EB:6687C9D0', 'x-ratelimit-limit': '15000', 'x-ratelimit-remaining': '14992', 'x-ratelimit-reset': '1720176418', 'x-ratelimit-resource': 'core', 'x-ratelimit-used': '8', 'x-xss-protection': '0' }, data: { message: 'Resource not accessible by integration', documentation_url: 'https://docs.github.com/rest/pulls/pulls#update-a-pull-request', status: '403' } }, request: { method: 'PATCH', url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/pulls/695', headers: { accept: 'application/vnd.github.v3+json', 'user-agent': 'probot/12.2.5 octokit-core.js/3.6.0 Node.js/20.13.1 (linux; 
x64)', authorization: 'token [REDACTED]', 'content-type': 'application/json; charset=utf-8' }, body: '{"body":"TransformerEngine added the ability to shard intermediate activation tensors in v1.8. Currently, we save global/world sized activation for backward pass. Using this, we can lower the peak memory usage at the cost of added comms - as we will shard these intermediate tensor and gather them before the backward computation.\\r\\n\\r\\nTE PR: https://github.com/NVIDIA/TransformerEngine/pull/687\\r\\n\\r\\nIn this PR, we use make this option opt-in using thunder.jit compile argument - `fp8_shard_intermediate_activation`.\\r\\n\\r\\nExample usage: `model = thunder.jit(model, executors=executors, fp8_shard_intermediate_activation=True)`\\r\\n\\r\\n**Testing**\\r\\n\\r\\nUpdated the distributed test to use this option. Have tested with existing tests in `test_transformer_engine_executor.py` and `test_ddp.py -k transformer` with TE v1.7 (current stable), v1.8 and v1.9 (current main).\\r\\n\\r\\n**Benchmark**\\r\\nCommand - \\r\\n```\\r\\ntorchrun --nproc_per_node=8 --nnodes=1 thunder/benchmarks/benchmark_litgpt.py --return_metrics_as_json=True --json_path=/tmp/benchmark_litgpt_data.json --distributed_mode=fsdp --shard_mode=zero3 --model_name=Llama-2-7b-hf --micro_batch_size=1 --compile=thunder_inductor_cat_transformerengine_cudnn --nsys_enabled=False --dynamic=False\\r\\n```\\r\\n\\r\\nWithout FP8 Intermediate Sharding \\r\\n```\\r\\nAverage iter time: 282.47 ms\\r\\nMemory used: 52.92 GB\\r\\n```\\r\\n\\r\\n\\r\\nWith FP8 Intermediate Sharding \\r\\n```\\r\\nAverage iter time: 341.67 ms\\r\\nMemory used: 44.05 GB\\r\\n```\\r\\n\\r\\n<details>\\r\\n\\r\\n<summary> Patch to enable sharding in `benchmark_litgpt.py` </summary>\\r\\n\\r\\n```patch\\r\\ndiff --git a/thunder/benchmarks/benchmark_litgpt.py b/thunder/benchm
auto-cc
HttpError: Resource not accessible by integration at /home/runner/work/_actions/Lightning-AI/probot/v5/node_modules/@octokit/core/node_modules/@octokit/request/dist-node/index.js:86:21 at process.processTicksAndRejections (node:internal/process/task_queues:95:5) at async Job.doExecute (/home/runner/work/_actions/Lightning-AI/probot/v5/node_modules/bottleneck/light.js:405:18) { name: 'AggregateError', event: { id: '9806863981', name: 'pull_request', payload: { action: 'labeled', label: { color: '3855E2', default: false, description: '', id: 6781712626, name: 'distributed', node_id: 'LA_kwDOLiCyD88AAAABlDi48g', url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/labels/distributed' }, number: 695, organization: { avatar_url: 'https://avatars.githubusercontent.com/u/58386951?v=4', description: 'Turn ideas into AI, Lightning fast. Creators of PyTorch Lightning, Lightning AI Studio, TorchMetrics, Fabric, Lit-GPT, Lit-LLaMA', events_url: 'https://api.github.com/orgs/Lightning-AI/events', hooks_url: 'https://api.github.com/orgs/Lightning-AI/hooks', id: 58386951, issues_url: 'https://api.github.com/orgs/Lightning-AI/issues', login: 'Lightning-AI', members_url: 'https://api.github.com/orgs/Lightning-AI/members{/member}', node_id: 'MDEyOk9yZ2FuaXphdGlvbjU4Mzg2OTUx', public_members_url: 'https://api.github.com/orgs/Lightning-AI/public_members{/member}', repos_url: 'https://api.github.com/orgs/Lightning-AI/repos', url: 'https://api.github.com/orgs/Lightning-AI' }, pull_request: { _links: { comments: { href: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/issues/695/comments' }, commits: { href: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/pulls/695/commits' }, html: { href: 'https://github.com/Lightning-AI/lightning-thunder/pull/695' }, issue: { href: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/issues/695' }, review_comment: { href: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/pulls/comments{/number}' 
}, review_comments: { href: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/pulls/695/comments' }, self: { href: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/pulls/695' }, statuses: { href: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/statuses/b062aff346025ebb9e2fb313c6f55eed00abedb2' } }, active_lock_reason: null, additions: 44, assignee: null, assignees: [], author_association: 'COLLABORATOR', auto_merge: null, base: { label: 'Lightning-AI:main', ref: 'main', repo: { allow_auto_merge: true, allow_forking: true, allow_merge_commit: false, allow_rebase_merge: false, allow_squash_merge: true, allow_update_branch: true, archive_url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/{archive_format}{/ref}', archived: false, assignees_url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/assignees{/user}', blobs_url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/git/blobs{/sha}', branches_url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/branches{/branch}', clone_url: 'https://github.com/Lightning-AI/lightning-thunder.git', collaborators_url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/collaborators{/collaborator}', comments_url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/comments{/number
auto-cc
The following actions uses Node.js version which is deprecated and will be forced to run on node20: Lightning-AI/probot@v5. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/