mkhona-nvidia
Contributor

This builds on the previous PRs that added PSGD's helper functions and uses them to implement the PSGD-Kron-Pro optimizer.

…contraction

Signed-off-by: mikail <mkhona@nvidia.com>
Signed-off-by: mikail <mkhona@nvidia.com>
Signed-off-by: mikail <mkhona@nvidia.com>
@mkhona-nvidia mkhona-nvidia self-assigned this Oct 15, 2025

copy-pr-bot bot commented Oct 15, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: mikail <mkhona@nvidia.com>
@mkhona-nvidia mkhona-nvidia requested a review from skyw October 15, 2025 04:56
@mkhona-nvidia
Contributor Author

/ok to test 780e3d7

Contributor

@skyw skyw left a comment

Looks good overall. The comments are mostly about style, but they must be fixed.

Signed-off-by: mikail <mkhona@nvidia.com>
Signed-off-by: mikail <mkhona@nvidia.com>
@mkhona-nvidia mkhona-nvidia changed the title [DRAFT] PSGD-Kron-Pro(crustes) optimizer implementation PSGD-Kron-Pro(crustes) optimizer implementation Oct 17, 2025
Signed-off-by: mikail <mkhona@nvidia.com>
Signed-off-by: mikail <mkhona@nvidia.com>
@mkhona-nvidia
Contributor Author

@lixilinx, does this look good to you?

@lixilinx

Thanks, @mkhona-nvidia, for the PSGD code. It looks good and well organized to me!

I once verified the correctness of psgd_kron_contractions by comparison with einsum.

In norm_lower_bound_spd, we will set the default subspace dimension to 32 for float32 (based on my tests).
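
For anyone who wants to repeat that kind of check, here is a minimal sketch of comparing a hand-written contraction against torch.einsum. The helper `mode_product` is hypothetical and is not claimed to match anything in psgd_kron_contractions; it only illustrates the verification pattern, assuming a dense matrix applied along one dimension of a tensor.

```python
import torch


def mode_product(Q: torch.Tensor, X: torch.Tensor, mode: int) -> torch.Tensor:
    """Hypothetical helper: multiply matrix Q into dimension `mode` of X."""
    # tensordot sums over Q's column axis and X's `mode` axis; the surviving
    # row axis of Q ends up first in the result.
    Y = torch.tensordot(Q, X, dims=([1], [mode]))
    # Move that leading axis back to where the contracted axis used to be.
    return Y.movedim(0, mode)


torch.manual_seed(0)
X = torch.randn(3, 4, 5, dtype=torch.float64)
Q = torch.randn(4, 4, dtype=torch.float64)

got = mode_product(Q, X, mode=1)
# Reference: the same contraction spelled out with explicit indices.
ref = torch.einsum("ij,ajc->aic", Q, X)
max_abs_err = (got - ref).abs().max().item()
print(f"max abs error vs einsum: {max_abs_err:.3e}")
```

Running the comparison in float64 keeps rounding noise far below any real indexing mistake, so a mismatch points at the contraction logic rather than at precision.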

skyw previously approved these changes Oct 20, 2025
Signed-off-by: mikail <mkhona@nvidia.com>
@mkhona-nvidia
Contributor Author

> Thanks, @mkhona-nvidia, for the PSGD code. It looks good and well organized to me!
>
> I once verified the correctness of psgd_kron_contractions by comparison with einsum.
>
> In norm_lower_bound_spd, we will set the default subspace dimension to 32 for float32 (based on my tests).

The momentum dampening has also been changed. The dampened momentum is now computed as:

dampened_momentum = exp_avg + (
    damping_noise_scale + torch.finfo(exp_avg.dtype).eps * exp_avg.abs()
) * torch.randn_like(exp_avg)

where damping_noise_scale defaults to 1e-9.
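
As a rough illustration of what this expression does, the sketch below wraps it in a standalone function. The function name `dampen_momentum` and the example tensor are assumptions made here for illustration; only the expression itself and the 1e-9 default come from the comment above.

```python
import torch


def dampen_momentum(
    exp_avg: torch.Tensor, damping_noise_scale: float = 1e-9
) -> torch.Tensor:
    """Hypothetical wrapper around the dampened-momentum expression."""
    # Machine epsilon for the momentum dtype (about 1.19e-7 for float32).
    eps = torch.finfo(exp_avg.dtype).eps
    # Per-element noise scale: an absolute floor plus a term relative to |exp_avg|.
    noise_std = damping_noise_scale + eps * exp_avg.abs()
    return exp_avg + noise_std * torch.randn_like(exp_avg)


torch.manual_seed(0)
exp_avg = torch.tensor([0.0, 1.0, -1000.0], dtype=torch.float32)
dampened = dampen_momentum(exp_avg)
print((dampened - exp_avg).abs())
```

The absolute term keeps the noise nonzero where the momentum is exactly zero, while the eps-scaled term makes the perturbation grow with the entry's magnitude, so it stays near the rounding level of the dtype.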

@mkhona-nvidia
Contributor Author

/ok to test f9f12bd

Signed-off-by: mikail <mkhona@nvidia.com>
@mkhona-nvidia
Contributor Author

/ok to test 54220e2

@mkhona-nvidia mkhona-nvidia merged commit f49d04e into NVIDIA-NeMo:main Oct 20, 2025
14 checks passed
@mkhona-nvidia mkhona-nvidia deleted the mkhona/psgd_kron_pro branch October 20, 2025 04:51

3 participants