@sanketpurandare (Contributor) commented Jan 22, 2026

stack-info: PR: #2270, branch: sanketpurandare/stack/1
@sanketpurandare force-pushed the sanketpurandare/stack/1 branch from 902c95c to 6d42f9a January 22, 2026 20:54
@meta-cla bot added the CLA Signed label Jan 22, 2026
@sanketpurandare requested a review from xmfan January 22, 2026 21:00
# 1. nn.Module.__init__() is already called by _DeepSeekV3Model.__init__
# 2. Calling ModelProtocol.__init__ after would reset all module state
# (nn.Module.__init__ clears _modules, _parameters, etc.)
_DeepSeekV3Model.__init__(self, model_args)
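
The reset described in the comment above can be sketched without PyTorch. This is a minimal illustration, not the code from this PR: `FakeModule` is a hypothetical stand-in that models only the part of `nn.Module.__init__` that starts from empty registries, and `BaseModel`, `ProtocolLike`, `BothInits`, and `DirectInit` are made-up names playing the roles of `_DeepSeekV3Model`, `ModelProtocol`, and the combined model class.

```python
class FakeModule:
    """Hypothetical stand-in for torch.nn.Module (only the reset is modeled)."""

    def __init__(self):
        # Like nn.Module.__init__, start from empty registries.
        self._modules = {}
        self._parameters = {}


class BaseModel(FakeModule):
    """Plays the role of _DeepSeekV3Model."""

    def __init__(self, model_args):
        FakeModule.__init__(self)
        self._modules["layers"] = f"layers for {model_args}"


class ProtocolLike(FakeModule):
    """Plays the role of ModelProtocol; its __init__ re-runs the base init."""

    def __init__(self, model_args):
        FakeModule.__init__(self)


class BothInits(BaseModel, ProtocolLike):
    def __init__(self, model_args):
        BaseModel.__init__(self, model_args)
        # The second initializer wipes what BaseModel just registered.
        ProtocolLike.__init__(self, model_args)


class DirectInit(BaseModel, ProtocolLike):
    def __init__(self, model_args):
        # Same shape as the line in the diff: call only the base model's init.
        BaseModel.__init__(self, model_args)


both = BothInits("args")
direct = DirectInit("args")
print(both._modules)    # {}
print(direct._modules)  # {'layers': 'layers for args'}
```

Calling only the base model's initializer keeps the registered state, which is the trade-off the review thread below discusses.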
Member

This will break once ModelProtocol.__init__ starts doing subclass-specific initialization.
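
A minimal sketch of the failure mode raised here, under stated assumptions: `ProtocolLike` stands in for `ModelProtocol` after a hypothetical future change that makes its `__init__` set an attribute, and `BaseModel` and `DirectInit` are made-up names. None of this is taken from the torchtitan or AutoParallel code.

```python
class ProtocolLike:
    """Plays the role of ModelProtocol after a hypothetical future change."""

    def __init__(self, model_args):
        # Hypothetical setup that subclasses would come to rely on.
        self.model_args = model_args


class BaseModel:
    """Plays the role of _DeepSeekV3Model."""

    def __init__(self, model_args):
        self.layers = ["layer0"]


class DirectInit(BaseModel, ProtocolLike):
    def __init__(self, model_args):
        # Only the base model's initializer runs; ProtocolLike.__init__ is skipped.
        BaseModel.__init__(self, model_args)


model = DirectInit("args")
print(hasattr(model, "layers"))      # True
print(hasattr(model, "model_args"))  # False
```

Because the protocol's initializer never runs, anything it would set up is silently missing, and the problem only surfaces when that attribute is first accessed.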

Contributor Author

Do you have a better solution in mind?

Member

We could use torchtitan as the model repository instead of autoparallel.

Contributor Author

That's a good idea. We'll do it after we have also tested CP in AutoParallel, since that would also require local map, and it is faster to iterate while the model definition lives in AutoP.

@sanketpurandare merged commit 92ed38a into main Jan 29, 2026
13 of 19 checks passed
@tianyu-l deleted the sanketpurandare/stack/1 branch January 29, 2026 05:18
Labels

ciflow/8gpu, CLA Signed
