Description
While using your codebase for research involving the UNDO framework, I found a potential configuration error in the arithmetic relearning script.
Location: run_relearn_arithmetic.py, in the setups["gemma-2-0.3B_train_only_forget"] entry.
The Issue: In the current implementation, the first_train_file for the train_only_forget setup is defined as: 'first_train_file': f"{DATASET_DIR}/pretrain/train_addition_subtraction.jsonl"
However, based on the paper's methodology, the "Forget Set" for arithmetic unlearning consists of multiplication and division tasks, while addition and subtraction constitute the Retain Set.
Impact: As currently configured, the script measures the model's ability to "relearn" the Retain Set (knowledge it was never intended to lose) rather than the recovery rate of the suppressed Forget Set. This results in misleading metrics when evaluating the robustness of unlearning methods against relearning attacks.
Suggested Fix: Update the path to point to the actual forget-set data: 'first_train_file': f"{DATASET_DIR}/pretrain/train_multiplication_division.jsonl"
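For illustration, here is a minimal sketch of what the corrected entry could look like, together with a small guard against the same mix-up in other setups. It makes several assumptions: the DATASET_DIR value is a placeholder, the real setups entry almost certainly has more keys than shown, the helper find_retain_set_paths is hypothetical and not part of the repository, and the file name train_multiplication_division.jsonl is taken from the suggested fix above and should be checked against the actual dataset directory.

```python
# Placeholder value; the real DATASET_DIR is defined in the repository.
DATASET_DIR = "datasets"

# Substrings that identify retain-set data for the arithmetic task
# (addition and subtraction), per the description in this issue.
RETAIN_MARKERS = ("addition", "subtraction")

# Only the key discussed in this issue is shown; other keys are omitted.
setups = {
    "gemma-2-0.3B_train_only_forget": {
        # Assumed forget-set file name (multiplication and division).
        "first_train_file": f"{DATASET_DIR}/pretrain/train_multiplication_division.jsonl",
    },
}


def find_retain_set_paths(setups):
    """Return names of *train_only_forget setups whose first_train_file
    looks like retain-set data, judged by the file name alone."""
    offenders = []
    for name, cfg in setups.items():
        if not name.endswith("train_only_forget"):
            continue
        filename = cfg["first_train_file"].rsplit("/", 1)[-1]
        if any(marker in filename for marker in RETAIN_MARKERS):
            offenders.append(name)
    return offenders


if __name__ == "__main__":
    # Prints an empty list when every forget-only setup points at forget data.
    print(find_retain_set_paths(setups))
```

A check like this only inspects file names, so it would not catch a correctly named file with the wrong contents, but it could flag this particular misconfiguration before a relearning run starts.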
Thank you for providing this codebase and for your contribution to the field of unlearning!