
Cannot Load moonshotai/Moonlight-16B-A3B #36385

Open
Shomvel opened this issue Feb 25, 2025 · 1 comment

Shomvel commented Feb 25, 2025

System Info

  • transformers version: 4.47.1
  • Platform: Linux-5.15.0-124-generic-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • Huggingface_hub version: 0.26.1
  • Safetensors version: 0.4.5
  • Accelerate version: 1.4.0
  • Accelerate config: - compute_environment: LOCAL_MACHINE
    - distributed_type: DEEPSPEED
    - use_cpu: False
    - debug: True
    - num_processes: 8
    - machine_rank: 0
    - num_machines: 1
    - rdzv_backend: static
    - same_network: True
    - main_training_function: main
    - enable_cpu_affinity: False
    - deepspeed_config: {'deepspeed_config_file': './training/configs/deepspeed.json', 'zero3_init_flag': False}
    - downcast_bf16: no
    - tpu_use_cluster: False
    - tpu_use_sudo: False
    - tpu_env: []
    - dynamo_config: {'dynamo_backend': 'INDUCTOR'}
  • PyTorch version (GPU?): 2.6.0+cu124 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: No

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "moonshotai/Moonlight-16B-A3B/"
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, trust_remote_code=True
)
The error is:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[36], line 10
      8 # Load the model and tokenizer
      9 model_name = "/gpfs/models/huggingface.co/moonshotai/Moonlight-16B-A3B/"
---> 10 model = AutoModelForCausalLM.from_pretrained(
     11     model_name, torch_dtype=torch.float16, trust_remote_code=True
     12 )

File ~/experiments/.venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:559, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    557     cls.register(config.__class__, model_class, exist_ok=True)
    558     model_class = add_generation_mixin_to_remote_model(model_class)
--> 559     return model_class.from_pretrained(
    560         pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    561     )
    562 elif type(config) in cls._model_mapping.keys():
    563     model_class = _get_model_class(config, cls._model_mapping)

File ~/experiments/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py:4264, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, weights_only, *model_args, **kwargs)
   4254         load_contexts.append(tp_device)
   4256     with ContextManagers(load_contexts):
   4257         (
   4258             model,
   4259             missing_keys,
   4260             unexpected_keys,
   4261             mismatched_keys,
   4262             offload_index,
   4263             error_msgs,
-> 4264         ) = cls._load_pretrained_model(
   4265             model,
   4266             state_dict,
   4267             loaded_state_dict_keys,  # XXX: rename?
   4268             resolved_archive_file,
   4269             pretrained_model_name_or_path,
   4270             ignore_mismatched_sizes=ignore_mismatched_sizes,
   4271             sharded_metadata=sharded_metadata,
   4272             _fast_init=_fast_init,
   4273             low_cpu_mem_usage=low_cpu_mem_usage,
   4274             device_map=device_map,
   4275             offload_folder=offload_folder,
   4276             offload_state_dict=offload_state_dict,
   4277             dtype=torch_dtype,
   4278             hf_quantizer=hf_quantizer,
   4279             keep_in_fp32_modules=keep_in_fp32_modules,
   4280             gguf_path=gguf_path,
   4281             weights_only=weights_only,
   4282         )
   4284 # make sure token embedding weights are still tied if needed
   4285 model.tie_weights()

File ~/experiments/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py:4755, in PreTrainedModel._load_pretrained_model(cls, model, state_dict, loaded_keys, resolved_archive_file, pretrained_model_name_or_path, ignore_mismatched_sizes, sharded_metadata, _fast_init, low_cpu_mem_usage, device_map, offload_folder, offload_state_dict, dtype, hf_quantizer, keep_in_fp32_modules, gguf_path, weights_only)
   4748 if (
   4749     device_map is not None
   4750     and hf_quantizer is not None
   4751     and hf_quantizer.quantization_config.quant_method == QuantizationMethod.TORCHAO
   4752     and hf_quantizer.quantization_config.quant_type == "int4_weight_only"
   4753 ):
   4754     map_location = torch.device([d for d in device_map.values() if d not in ["cpu", "disk"]][0])
-> 4755 state_dict = load_state_dict(
   4756     shard_file, is_quantized=is_quantized, map_location=map_location, weights_only=weights_only
   4757 )
   4759 # Mistmatched keys contains tuples key/shape1/shape2 of weights in the checkpoint that have a shape not
   4760 # matching the weights in the model.
   4761 mismatched_keys += _find_mismatched_keys(
   4762     state_dict,
   4763     model_state_dict,
   (...)
   4767     ignore_mismatched_sizes,
   4768 )

File ~/experiments/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py:506, in load_state_dict(checkpoint_file, is_quantized, map_location, weights_only)
    504 with safe_open(checkpoint_file, framework="pt") as f:
    505     metadata = f.metadata()
--> 506 if metadata.get("format") not in ["pt", "tf", "flax", "mlx"]:
    507     raise OSError(
    508         f"The safetensors archive passed at {checkpoint_file} does not contain the valid metadata. Make sure "
    509         "you save your model with the `save_pretrained` method."
    510     )
    511 return safe_load_file(checkpoint_file)

AttributeError: 'NoneType' object has no attribute 'get'

I found that none of the safetensors files contain metadata, which you can verify with:

from safetensors import safe_open

for i in range(1, 28):
    with safe_open(f"/model/path/moonshotai/Moonlight-16B-A3B/model-{i}-of-27.safetensors", framework="pt") as f:
        print("Metadata:", f.metadata())

Expected behavior

The model loads.

Shomvel added the bug label Feb 25, 2025

Rocketknight1 (Member) commented:
hi @Shomvel, have you raised the issue with the repository owners? I'm not sure this is a bug in transformers, because it's a custom code model with its own weights!
