

@GirinMan

Overview

  • This PR originated from Saving pytorch_model.bin with QLORA #123
  • I also faced similar problems with it, but no one had submitted a fix for it.
  • I added a callback which saves not only adapter_model.bin but also trainable_params.bin and the configuration of the backbone model (config.json), so that the RoPE scaling settings can be reused.

New callback: SavePeftModelCallback

  • In a new file named save_callback.py, I added a callback named SavePeftModelCallback, which saves the trained weights and the model config in a new directory.
  • The directory is named f"{args.output_dir}/step-{state.global_step}". The callback automatically creates the directory if it doesn't exist, so it can be used to store separate checkpoints at specific step intervals (see the sketch below).
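
For reference, here is a minimal sketch of what such a callback could look like, assuming a Hugging Face transformers TrainerCallback wrapping a PEFT model. The file names match this PR, but the parameter filtering and method body are illustrative and may differ from the actual save_callback.py.

import os
import torch
from transformers import TrainerCallback

class SavePeftModelCallback(TrainerCallback):
    # Saves the LoRA adapter, extra trainable params, and the backbone config
    # into a per-step sub-directory at every checkpointing event.
    def on_save(self, args, state, control, **kwargs):
        model = kwargs["model"]

        # One sub-directory per checkpoint, created if it doesn't exist.
        output_dir = os.path.join(args.output_dir, f"step-{state.global_step}")
        os.makedirs(output_dir, exist_ok=True)

        # adapter_model.bin: LoRA weights, written by PEFT's save_pretrained.
        model.save_pretrained(output_dir)

        # trainable_params.bin: non-LoRA parameters that were also trained
        # (e.g. embeddings / norms in LongLoRA). The filter below is an assumption.
        trainable_params = {
            name: param.detach().cpu()
            for name, param in model.named_parameters()
            if param.requires_grad and "lora_" not in name
        }
        torch.save(trainable_params, os.path.join(output_dir, "trainable_params.bin"))

        # config.json of the backbone, including the rope_scaling used during training.
        model.config.save_pretrained(output_dir)
        return control

Registering it with trainer.add_callback(SavePeftModelCallback()) is enough for it to run at every save step.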

Changes in merge_lora_weights_and_save_hf_model.py

  • While loading the backbone model, the script did not use the model config from training, so the merged & saved checkpoint has no information about the RoPE scaling configuration.
  • I guess this is why the configs of the LongLoRA models on the Hugging Face Hub do not contain any rope_scaling information, even though it was changed during training. That's why I made SavePeftModelCallback save the model's config too.
  • With the changes in this PR, merge_lora_weights_and_save_hf_model.py will try to load and use the model config saved during training, which contains the RoPE scaling information (see the sketch after this list).
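
As a rough illustration (the paths and the fallback logic are hypothetical, not the exact code in merge_lora_weights_and_save_hf_model.py), the merge script can load the config saved during training and pass it to from_pretrained so the merged model keeps the rope_scaling entry:

import torch
from transformers import AutoConfig, AutoModelForCausalLM

base_model = "meta-llama/Llama-2-7b-hf"   # backbone used for training
checkpoint_dir = "output/step-1000"       # directory written by SavePeftModelCallback (hypothetical path)

# Prefer the config saved during training, which carries the rope_scaling settings;
# fall back to the backbone's own config if no saved config is found.
try:
    config = AutoConfig.from_pretrained(checkpoint_dir)
except OSError:
    config = AutoConfig.from_pretrained(base_model)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    config=config,
    torch_dtype=torch.float16,
)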

See Llama-2-7b-longlora-8k/main/config.json

{
  "_name_or_path": "meta-llama/Llama-2-7b-hf",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 4096,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.31.0.dev0",
  "use_cache": true,
  "vocab_size": 32001
}

Thank you so much for sharing and maintaining such great research!
If you have any feedback, please feel free to...
