[applications/ColossalChat/examples/training_scripts/lora_finetune.py]: Fixed bug, added save_interval, and added auto resume functions #6223

Conversation

@bbbolt commented on Feb 26, 2025

📌 Checklist before creating the PR

  • [N] I have created an issue for this PR for traceability
  • [Y] The title follows the standard format: [doc/gemini/tensor/...]: A concise description
  • [Y] I have added relevant tags if possible for us to better distinguish different PRs
  • [Y] I have installed pre-commit: pip install pre-commit && pre-commit install

🚨 Issue number

Link this PR to your issue with words like fixed to automatically close the linked issue upon merge
e.g. fixed #1234, closed #1234, resolved #1234

📝 What does this PR do?


  1. Fix the gradient-not-enabled bug during LoRA fine-tuning by adding `model.enable_input_require_grads()` (see the sketch after this list).
  2. Add an interval save function for the LoRA adapter and optimizer state, controlled by the `save_interval` argument (second sketch below).
  3. Add an auto-resume function, controlled by the `lora_path` and `optimizer_path` arguments (third sketch below).
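For reference, a minimal sketch of the item-1 fix, assuming a Hugging Face transformers + PEFT setup (the model name and LoRA target modules below are illustrative, not the script's actual configuration). When the base model's weights are frozen for LoRA and gradient checkpointing is enabled, the checkpointed forward sees inputs that do not require grad, so backward produces no adapter gradients; `enable_input_require_grads()` is the stock `transformers.PreTrainedModel` method that hooks the input embeddings so their outputs require grad:

```python
# Sketch of the item-1 fix; model name and LoRA targets are illustrative.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model.gradient_checkpointing_enable()

# The one-line fix: make the embedding outputs require grad so gradients
# can flow through checkpointed layers back to the LoRA adapters.
model.enable_input_require_grads()

peft_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, peft_config)
```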
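Item 2 could be wired up roughly as below. The `save_checkpoint` helper, the directory layout, and the loop variables (`dataloader`, `model`, `optimizer`) are hypothetical stand-ins for the script's own; only the `save_interval` argument itself comes from this PR. Note that `save_pretrained` on a PEFT model writes just the adapter weights, which keeps the periodic checkpoints small:

```python
import argparse
import os
import torch

parser = argparse.ArgumentParser()
parser.add_argument("--save_interval", type=int, default=0,
                    help="save LoRA + optimizer state every N steps (0 disables)")
parser.add_argument("--save_dir", type=str, default="./checkpoints")
args = parser.parse_args()

def save_checkpoint(model, optimizer, save_dir, step):
    """Hypothetical helper: persist the LoRA adapter and optimizer state."""
    ckpt_dir = os.path.join(save_dir, f"step-{step}")
    os.makedirs(ckpt_dir, exist_ok=True)
    model.save_pretrained(ckpt_dir)                     # adapter weights only
    torch.save(optimizer.state_dict(),
               os.path.join(ckpt_dir, "optimizer.pt"))  # optimizer state

for step, batch in enumerate(dataloader, start=1):
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    # Interval save controlled by the new save_interval argument.
    if args.save_interval > 0 and step % args.save_interval == 0:
        save_checkpoint(model, optimizer, args.save_dir, step)
```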
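And a sketch of the item-3 auto-resume, again with everything except the two new argument names assumed: `PeftModel.from_pretrained` is the standard PEFT call for reloading adapter weights, and `is_trainable=True` keeps them updatable for continued training:

```python
import torch
from peft import PeftModel

# args carries the two new flags from the PR description (--lora_path,
# --optimizer_path); base_model and optimizer come from the surrounding script.
if args.lora_path:
    # Reload previously saved adapter weights and keep them trainable.
    model = PeftModel.from_pretrained(base_model, args.lora_path, is_trainable=True)
if args.optimizer_path:
    # Restore optimizer state (step counts, Adam moments, etc.).
    optimizer.load_state_dict(torch.load(args.optimizer_path, map_location="cpu"))
```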

💥 Checklist before requesting a review

  • [Y] I have linked my PR to an issue (instruction)
  • [Y] My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
  • [Y] I have performed a self-review of my code
  • [Y] I have added thorough tests.
  • [Y] I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

  • [Y] 🌝 Yes, I do.
  • 🌚 No, I don't.


@bbbolt requested a review from a team as a code owner on February 26, 2025 12:57
@bbbolt (Author) left a comment

fix the format
