TransformerEngine - Intermediate tensor sharding #1025
Annotations
2 errors and 1 warning
auto-cc
Resource not accessible by integration
{
name: 'HttpError',
id: '9806863981',
status: 403,
response: {
url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/pulls/695',
status: 403,
headers: {
'access-control-allow-origin': '*',
'access-control-expose-headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Resource, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, X-GitHub-SSO, X-GitHub-Request-Id, Deprecation, Sunset',
connection: 'close',
'content-encoding': 'gzip',
'content-security-policy': "default-src 'none'",
'content-type': 'application/json; charset=utf-8',
date: 'Fri, 05 Jul 2024 10:24:17 GMT',
'referrer-policy': 'origin-when-cross-origin, strict-origin-when-cross-origin',
server: 'istio-envoy',
'strict-transport-security': 'max-age=31536000; includeSubdomains; preload',
'transfer-encoding': 'chunked',
vary: 'Accept-Encoding, Accept, X-Requested-With',
'x-accepted-github-permissions': 'pull_requests=write',
'x-content-type-options': 'nosniff',
'x-envoy-decorator-operation': 'unicorn-api.github-production.svc.cluster.local:80/*',
'x-envoy-upstream-service-time': '82',
'x-frame-options': 'deny',
'x-github-api-version-selected': '2022-11-28',
'x-github-media-type': 'github.v3; format=json',
'x-github-request-id': '3003:BC5FA:3579A2D:61542EB:6687C9D0',
'x-ratelimit-limit': '15000',
'x-ratelimit-remaining': '14992',
'x-ratelimit-reset': '1720176418',
'x-ratelimit-resource': 'core',
'x-ratelimit-used': '8',
'x-xss-protection': '0'
},
data: {
message: 'Resource not accessible by integration',
documentation_url: 'https://docs.github.com/rest/pulls/pulls#update-a-pull-request',
status: '403'
}
},
request: {
method: 'PATCH',
url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/pulls/695',
headers: {
accept: 'application/vnd.github.v3+json',
'user-agent': 'probot/12.2.5 octokit-core.js/3.6.0 Node.js/20.13.1 (linux; x64)',
authorization: 'token [REDACTED]',
'content-type': 'application/json; charset=utf-8'
},
body: '{"body":"TransformerEngine added the ability to shard intermediate activation tensors in v1.8. Currently, we save global/world sized activation for backward pass. Using this, we can lower the peak memory usage at the cost of added comms - as we will shard these intermediate tensor and gather them before the backward computation.\\r\\n\\r\\nTE PR: https://github.com/NVIDIA/TransformerEngine/pull/687\\r\\n\\r\\nIn this PR, we use make this option opt-in using thunder.jit compile argument - `fp8_shard_intermediate_activation`.\\r\\n\\r\\nExample usage: `model = thunder.jit(model, executors=executors, fp8_shard_intermediate_activation=True)`\\r\\n\\r\\n**Testing**\\r\\n\\r\\nUpdated the distributed test to use this option. Have tested with existing tests in `test_transformer_engine_executor.py` and `test_ddp.py -k transformer` with TE v1.7 (current stable), v1.8 and v1.9 (current main).\\r\\n\\r\\n**Benchmark**\\r\\nCommand - \\r\\n```\\r\\ntorchrun --nproc_per_node=8 --nnodes=1 thunder/benchmarks/benchmark_litgpt.py --return_metrics_as_json=True --json_path=/tmp/benchmark_litgpt_data.json --distributed_mode=fsdp --shard_mode=zero3 --model_name=Llama-2-7b-hf --micro_batch_size=1 --compile=thunder_inductor_cat_transformerengine_cudnn --nsys_enabled=False --dynamic=False\\r\\n```\\r\\n\\r\\nWithout FP8 Intermediate Sharding \\r\\n```\\r\\nAverage iter time: 282.47 ms\\r\\nMemory used: 52.92 GB\\r\\n```\\r\\n\\r\\n\\r\\nWith FP8 Intermediate Sharding \\r\\n```\\r\\nAverage iter time: 341.67 ms\\r\\nMemory used: 44.05 GB\\r\\n```\\r\\n\\r\\n<details>\\r\\n\\r\\n<summary> Patch to enable sharding in `benchmark_litgpt.py` </summary>\\r\\n\\r\\n```patch\\r\\ndiff --git a/thunder/benchmarks/benchmark_litgpt.py b/thunder/benchm
auto-cc
HttpError: Resource not accessible by integration
at /home/runner/work/_actions/Lightning-AI/probot/v5/node_modules/@octokit/core/node_modules/@octokit/request/dist-node/index.js:86:21
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Job.doExecute (/home/runner/work/_actions/Lightning-AI/probot/v5/node_modules/bottleneck/light.js:405:18)
{
name: 'AggregateError',
event: {
id: '9806863981',
name: 'pull_request',
payload: {
action: 'labeled',
label: {
color: '3855E2',
default: false,
description: '',
id: 6781712626,
name: 'distributed',
node_id: 'LA_kwDOLiCyD88AAAABlDi48g',
url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/labels/distributed'
},
number: 695,
organization: {
avatar_url: 'https://avatars.githubusercontent.com/u/58386951?v=4',
description: 'Turn ideas into AI, Lightning fast. Creators of PyTorch Lightning, Lightning AI Studio, TorchMetrics, Fabric, Lit-GPT, Lit-LLaMA',
events_url: 'https://api.github.com/orgs/Lightning-AI/events',
hooks_url: 'https://api.github.com/orgs/Lightning-AI/hooks',
id: 58386951,
issues_url: 'https://api.github.com/orgs/Lightning-AI/issues',
login: 'Lightning-AI',
members_url: 'https://api.github.com/orgs/Lightning-AI/members{/member}',
node_id: 'MDEyOk9yZ2FuaXphdGlvbjU4Mzg2OTUx',
public_members_url: 'https://api.github.com/orgs/Lightning-AI/public_members{/member}',
repos_url: 'https://api.github.com/orgs/Lightning-AI/repos',
url: 'https://api.github.com/orgs/Lightning-AI'
},
pull_request: {
_links: {
comments: {
href: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/issues/695/comments'
},
commits: {
href: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/pulls/695/commits'
},
html: {
href: 'https://github.com/Lightning-AI/lightning-thunder/pull/695'
},
issue: {
href: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/issues/695'
},
review_comment: {
href: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/pulls/comments{/number}'
},
review_comments: {
href: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/pulls/695/comments'
},
self: {
href: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/pulls/695'
},
statuses: {
href: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/statuses/b062aff346025ebb9e2fb313c6f55eed00abedb2'
}
},
active_lock_reason: null,
additions: 44,
assignee: null,
assignees: [],
author_association: 'COLLABORATOR',
auto_merge: null,
base: {
label: 'Lightning-AI:main',
ref: 'main',
repo: {
allow_auto_merge: true,
allow_forking: true,
allow_merge_commit: false,
allow_rebase_merge: false,
allow_squash_merge: true,
allow_update_branch: true,
archive_url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/{archive_format}{/ref}',
archived: false,
assignees_url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/assignees{/user}',
blobs_url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/git/blobs{/sha}',
branches_url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/branches{/branch}',
clone_url: 'https://github.com/Lightning-AI/lightning-thunder.git',
collaborators_url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/collaborators{/collaborator}',
comments_url: 'https://api.github.com/repos/Lightning-AI/lightning-thunder/comments{/number
auto-cc
The following actions uses Node.js version which is deprecated and will be forced to run on node20: Lightning-AI/probot@v5. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/