
size mismatch for embed.weight #2


Description

@YNCao

Hello, thanks for your great work!

I encountered an error when loading the state_dict of the O1-nano model: the vocab_size in the released parameters appears to be 39 instead of 40. Could you please check it out? Note that the model still doesn't work if I simply change the vocab_size to 39.
```
RuntimeError: Error(s) in loading state_dict for O1Model:
    size mismatch for embed.weight: copying a param with shape torch.Size([39, 64]) from checkpoint, the shape in current model is torch.Size([40, 64]).
    size mismatch for completion_decoder.weight: copying a param with shape torch.Size([39, 64]) from checkpoint, the shape in current model is torch.Size([40, 64]).
    size mismatch for completion_decoder.bias: copying a param with shape torch.Size([39]) from checkpoint, the shape in current model is torch.Size([40]).
    size mismatch for reasoning_decoder.weight: copying a param with shape torch.Size([39, 64]) from checkpoint, the shape in current model is torch.Size([40, 64]).
    size mismatch for reasoning_decoder.bias: copying a param with shape torch.Size([39]) from checkpoint, the shape in current model is torch.Size([40]).
```
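For what it's worth, the mismatch itself can be worked around by reading the checkpoint's vocab size off `embed.weight` instead of hard-coding it, then constructing the model to match. Below is a minimal sketch of that idea; `TinyModel` is a hypothetical stand-in with the same layer names and the 64-dim hidden size from the traceback, not the real O1Model, and this only fixes the loading step, not whatever else breaks at vocab_size=39.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the vocab-dependent layers named in the
# error message (embed, completion_decoder, reasoning_decoder).
class TinyModel(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.completion_decoder = nn.Linear(hidden, vocab_size)
        self.reasoning_decoder = nn.Linear(hidden, vocab_size)

# Simulate a checkpoint that was saved with vocab_size=39.
ckpt = TinyModel(vocab_size=39).state_dict()

# Infer the checkpoint's vocab size from the embedding shape,
# then build the model to match before loading.
ckpt_vocab = ckpt["embed.weight"].shape[0]
model = TinyModel(vocab_size=ckpt_vocab)
model.load_state_dict(ckpt)  # loads cleanly once the shapes agree
print(ckpt_vocab)  # -> 39
```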
