
Add new feature of SafeLoRA #2201

Open

chiayi-hsu wants to merge 29 commits into main

Conversation

chiayi-hsu

The previous pull request was closed while syncing with the latest version of PEFT, so I have opened this pull request again.
In this version, I have made all the necessary changes based on our previous conversations.

If there are any issues, please let me know.

Thank you.

@BenjaminBossan (Member) left a comment

Thanks for the update to the SafeLoRA PR. I did another review and found a few areas to improve. Please take a look. Also, please run make style once you're finished with your changes.

examples/safelora/README.md (resolved review comments)
save_weights=True)

final_lora_weight = apply_safelora(config)

Member

Can we add a bit more to the example? For instance, how to save and load these weights?
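For context, writing the projected weights back and loading them again might look roughly like this (a minimal sketch; `peft_model_path` and the Llama model id are illustrative, and `final_lora_weight` is the dict returned by `apply_safelora` in the example above):

import os

from safetensors.torch import save_file
from transformers import AutoModelForCausalLM
from peft import PeftModel

peft_model_path = "<peft-model-path>"  # the LoRA checkpoint passed to SafeLoraConfig

# overwrite the adapter weights of the original checkpoint with the projected ones
save_file(final_lora_weight, os.path.join(peft_model_path, "adapter_model.safetensors"))

# load them back for inference like any other PEFT adapter
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model = PeftModel.from_pretrained(base_model, peft_model_path)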

Author

I have added more descriptions to the example.
If you feel there are still any missing parts, please let me know.

Comment on lines 15 to 16
config = SafeLoraConfig(base_model_path='../LLM_Models/llama-2-7b-hf/',\
aligned_model_path='../LLM_Models/llama-2-7b-chat-fp16/',
Member

Let's use the HF model ids for these two.

Author

Has been modified.

Comment on lines 215 to 217
peft_weights = {name: f.get_tensor(name).to(safelora_config.dtype) for name in f.keys()}
else:
peft_weights = {name: f.get_tensor(name).to(safelora_config.dtype) for name in f.keys()}
Member

These 2 lines are identical

Author

Has been modified.

- if (safelora_config.devices).lower() == "cpu":
-        peft_weights = {name: f.get_tensor(name).to(safelora_config.dtype) for name in f.keys()}
- else:
-        peft_weights = {name: f.get_tensor(name).to(safelora_config.dtype) for name in f.keys()}
+ peft_weights = {name: f.get_tensor(name).to(safelora_config.dtype) for name in f.keys()}

]
align_model_parameters = [
name for name in sl_align.weight_map.keys() if any(v in name for v in list(peft_config.target_modules))
]
Member

Should we also check that base_model_parameters and align_model_parameters are the same?

Author

I have added a check to verify if the model weights are the same.

+ if (sl_base.get_tensor(name_base) == sl_align.get_tensor(name_align)).all():
+        raise ValueError("The weights of the base Model and the aligned Model should be different.")

Member

I meant something else. Would we expect that base_model_parameters == align_model_parameters? If not, under what circumstances would they differ?
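One way to make that expectation explicit could be a direct comparison of the two name lists (a sketch, reusing the variables from the quoted code):

if base_model_parameters != align_model_parameters:
    raise ValueError(
        "The target module parameters of the base model and the aligned model do not match."
    )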

Member

Still open.

return safety_vector


def project_weights(configs, peft_weights, v):
Member

Let's rename configs to config or safelora_config.

Author

Has been modified.

metadata={"help": "The path of the LoRA wieghts and configs."},
)

select_layers_type: str = field(
Member

Instead of str, we can annotate this as Literal["threshold", "number"].
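Roughly what that annotation could look like (a sketch; the default value and help text are illustrative, not taken from the PR):

from dataclasses import dataclass, field
from typing import Literal


@dataclass
class SafeLoraConfig:
    select_layers_type: Literal["threshold", "number"] = field(
        default="number",
        metadata={"help": "Whether to select projected layers by a similarity threshold or by a fixed number of layers."},
    )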

Author

Has been modified.

src/peft/utils/safelora.py (resolved review comments)
select_layers_type='threshold',
save_weights=True)

final_lora_weight = apply_safelora(config)
Member

The example should show inference, here we only create the weights. What are the next steps?
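As a rough illustration of such next steps, inference with the projected adapter might look like this (a sketch; the model id, `peft_model_path`, and the prompt are illustrative and assume the projected weights were already written back to the checkpoint):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model = PeftModel.from_pretrained(base_model, peft_model_path)

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
inputs = tokenizer("Tell me about large language models.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))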

Author

I have added more explanations in the README.md and also included code on how to use the SafeLoRA model.

chiayi-hsu and others added 6 commits November 9, 2024 02:20
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
@BenjaminBossan (Member)

@chiayi-hsu Once you're finished with your changes and want me to give another review, please ping me.

@chiayi-hsu (Author)

@BenjaminBossan I have completed the modifications. Please help review them. Thanks!

@BenjaminBossan (Member) left a comment

Thanks a lot for the updates. I did another review. Most of what I found are just smaller things like docs, please take a look.

Now, as a next step, it is important that we also add some unit tests. This is not going to be very straightforward, because we cannot easily test model alignment and we also don't want to use any big models during unit testing.

One proposal for this would be to use a small model like hf-internal-testing/tiny-random-OPTForCausalLM as the base model. Then let's modify some weights (setting them to 0?) and save this as the "aligned" model. Then call apply_safelora with these 2 models and various options to see if those tests pass. This would not really check the alignment though.
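For illustration, that fake "aligned" model could be created roughly like this (a sketch; the exact weight modification and the target directory are placeholders left to the PR author):

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("hf-internal-testing/tiny-random-OPTForCausalLM")
for param in model.parameters():
    param.data.zero_()  # make the weights differ from the base model
model.save_pretrained("fake-aligned-model")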

In addition, we could think about adding a true alignment test for the nightly run with GPU. For this test, it would be okay to use a bigger model (but ideally still not too big).

LMK what you think about this testing strategy and if you have further questions.

Apart from this, please call make style on your PR, as this is a prerequisite for the CI to pass.

src/peft/utils/safelora.py (resolved review comment)
Comment on lines 62 to 72
default="meta-llama/Llama-2-7b-hf",
metadata={"help": "The path of the base model for obtaining the aligned matrix."},
)

aligned_model_path: str = field(
default="TheBloke/Llama-2-7B-Chat-fp16",
metadata={"help": "The path of the aligned model for obtaining the aligned matrix."},
)

peft_model_path: str = field(
default="LisaSchunke/llama-2-7b-peft-finetuned-20000-dataset",
Member

IMO, it doesn't make sense to set default values here, I would remove them. WDYT?


peft_model_path: str = field(
default="LisaSchunke/llama-2-7b-peft-finetuned-20000-dataset",
metadata={"help": "The path of the LoRA wieghts and configs."},
Member

Suggested change
metadata={"help": "The path of the LoRA wieghts and configs."},
metadata={"help": "The path of the LoRA weights and config."},

Member

Typo is still there.

src/peft/utils/safelora.py (resolved review comments)
After fine-tuning large language models (LLMs) using LoRA, the alignment of the resulting models may decrease.
Therefore, applying `apply_safelora()` is intended to help preserve the alignment of the final models.

It is important to note that the model weights of the aligned model and the base model must be of the same size.
Member

Let's also mention that right now, only safetensors format is supported.

)

with safe_open(
f"{os.path.join(safelora_config.peft_model_path, 'adapter_model.safetensors')}",
Member

Let's not hard-code adapter_model.safetensors, let's use peft.utils.constants.SAFETENSORS_WEIGHTS_NAME.

final_weights, _ = project_weights(safelora_config, peft_weights, projected_matrix)

if safelora_config.save_weights:
save_file(final_weights, f"{os.path.join(safelora_config.peft_model_path, 'adapter_model.safetensors')}")
Member

Let's not hard-code adapter_model.safetensors, let's use peft.utils.constants.SAFETENSORS_WEIGHTS_NAME.
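In other words, something along these lines (a sketch; `final_weights` and `safelora_config` are the variables from the quoted code):

import os

from safetensors.torch import save_file
from peft.utils.constants import SAFETENSORS_WEIGHTS_NAME

save_file(final_weights, os.path.join(safelora_config.peft_model_path, SAFETENSORS_WEIGHTS_NAME))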

examples/safelora/README.md (resolved review comments)
chiayi-hsu and others added 11 commits November 25, 2024 21:32
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@chiayi-hsu (Author)

chiayi-hsu commented Dec 27, 2024 via email

@chiayi-hsu (Author)

@BenjaminBossan
I have made the modifications based on your review and ensured that the style passes by running make style. Regarding the unit test, we can follow your suggestion to use a small model like hf-internal-testing/tiny-random-OPTForCausalLM and modify part of the weights in each layer to be 0, treating it as the so-called aligned model. In our setting, we require that the weights of the base and aligned models be entirely different because we want the aligned model to be a fully fine-tuned model obtained through RLHF techniques. This ensures that the aligned matrix obtained is more flexibly applicable to various attention layers where LoRA is applied.

I would like to ask about the unit test part. Do I need to write test scripts for this? Are there any specific rules or things I should pay attention to?

@BenjaminBossan (Member) left a comment

Thank you for the updates.

I reviewed the PR again and found a few more things that can be improved; it's mostly about clarity and formatting.

Regarding the unit tests:

I think it would be easiest to proceed as follows. Let's create a new file, tests/test_safelora.py. Next, I created a small template for you to get started:

import pytest
from transformers import AutoModelForCausalLM

from peft import LoraConfig, get_peft_model


class TestSafeLora:
    model_id = "hf-internal-testing/tiny-random-OPTForCausalLM"

    @pytest.fixture(scope="class")
    def aligned_model_path(self, tmp_path_factory):
        # we create a fake aligned model where the weights differ from the base model
        # (a class-scoped fixture cannot use the function-scoped tmp_path, hence tmp_path_factory)
        path = tmp_path_factory.mktemp("aligned_model")
        model = AutoModelForCausalLM.from_pretrained(self.model_id)
        for param in model.parameters():
            # modify the parameters to be different
            ...

        model.save_pretrained(path)
        return path

    @pytest.fixture
    def lora_path(self, tmp_path):
        # create a LoRA adapter
        model = AutoModelForCausalLM.from_pretrained(self.model_id)
        lora_config = LoraConfig(init_lora_weights=False)  # initialize LoRA so that it's not a no-op
        model = get_peft_model(model, lora_config)
        model.save_pretrained(tmp_path / "lora")
        return tmp_path / "lora"

    def test_safelora_with_threshold(self, aligned_model_path, lora_path):
        ...  # code to test

    def test_safelora_with_num_proj_layers(self, aligned_model_path, lora_path):
        ...  # code to test

    def test_safelora_with_save_weights_false(self, aligned_model_path, lora_path):
        ...  # code to test

    # etc. more tests for the different options here

I hope this makes sense, if not, feel free to ask questions.

As discussed earlier, since this does not use a real aligned model but just creates a fake one, we cannot really test if the final model is better aligned or not. For this, we rely on the paper results and assume they're correct. If you have a good idea for a real alignment test, we can also add that test though. Just ensure that we're only using very small models to not slow down the CI.

@@ -0,0 +1,46 @@
# Safe LoRA

The official code of Safe LoRA: The Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models
Member

Let's add a sentence or two about what Safe LoRA does and when it could be interesting for users to use it, similar to the beginning of the docstring of apply_safelora.

src/peft/utils/safelora.py (resolved review comments)

from peft.utils.safelora import SafeLoraConfig, apply_safelora

peft_path = "../finetuneLLM/finetuned_models/samsumBad-7b-fp16-peft-seed-42"
Member

Let's also put a placeholder for this path. In this case, it's the same as <SafeLoRA-path> below, right?

Comment on lines +17 to +18
base_model_path="meta-llama/Llama-2-7b-hf",
aligned_model_path="TheBloke/Llama-2-7B-Chat-fp16",
Member

Here, we use concrete examples like Llama 2 7B. Below, we use a placeholder for the model id, <base-model-id>, which should correspond to base_model_path. Let's make this consistent: either use placeholders here (which I prefer) or use the real model path below.
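With placeholders, the config creation would read roughly like this (a sketch; the exact placeholder names are illustrative):

from peft.utils.safelora import SafeLoraConfig

config = SafeLoraConfig(
    base_model_path="<base-model-id>",
    aligned_model_path="<aligned-model-id>",
    peft_model_path="<peft-model-path>",
    select_layers_type="threshold",
    save_weights=True,
)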

import os

from safetensors.torch import save_file

path = ...  # your PEFT model path
save_file(final_lora_weight, os.path.join(path, "adapter_model.safetensors"))
Member

Here, path would be the same as peft_path above, right? If so, let's use the same name.

Comment on lines +126 to +131
if self.base_model_path is None:
raise ValueError("base_model_path cannot be None.")
if self.aligned_model_path is None:
raise ValueError("aligned_model_path cannot be None.")
if self.peft_model_path is None:
raise ValueError("peft_model_path cannot be None.")
Member

Since there are no longer any default values for these fields, I don't believe we need to perform these checks anymore.

]
align_model_parameters = [
name for name in sl_align.weight_map.keys() if any(v in name for v in list(peft_config.target_modules))
]
Member

Still open.

"The dimensions of the base model's weight should be the same with the aligned model's weight."
)
if (sl_base.get_tensor(name_base) == sl_align.get_tensor(name_align)).all():
raise ValueError("The weights of the base Model and the aligned Model should be different.")
Member

I understand the difference between the aligned model and the base model. However, it should be possible to align a model without changing each and every parameter, right? E.g., in the future we could have a new model where the aligned model only changes half of the weights; I don't see why that couldn't be possible. Does the SafeLoRA algorithm really require each parameter to be different? Can we not skip the layers if the parameters are identical?
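For example, instead of raising, the loop could simply skip such layers (a sketch of this suggestion, reusing the names from the quoted code):

if (sl_base.get_tensor(name_base) == sl_align.get_tensor(name_align)).all():
    continue  # this layer was not changed by alignment, so there is nothing to project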


def apply_safelora(safelora_config: SafeLoraConfig):
"""

Member

Suggested change

safelora_config: The config of SafeLora.

Returns:
`torch.nn.Module`: The Lora model is applied SafeLoRA.
Member

This is incorrect; the return value is a state_dict containing the PEFT weights.

chiayi-hsu and others added 2 commits January 16, 2025 00:15
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>