lora/qlora training w/wo unsloth on NVIDIA L4 (~20GB VRAM usage) #33
LoRA/QLoRA training with unsloth
Prediction Code and Outputs for Unsloth QLoRA
You can find the working inference code for Unsloth QLoRA in this branch:
The associated model is available here:
The pickle file contains a list of dictionaries for the test set, with the following fields:
{
"image_id": image_id,
"width": width,
"height": height,
"output_text": output_text # predicted text from the model
}
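As a rough illustration of how such a file could be read back, here is a minimal sketch. The file name `predictions.pkl`, the helper `load_predictions`, and the sample record values are all hypothetical; only the four field names come from the description above. The sketch writes a stand-in file first so that it runs on its own.

```python
import pickle
import tempfile
from pathlib import Path

# Field names taken from the description above.
REQUIRED_FIELDS = {"image_id", "width", "height", "output_text"}

# Hypothetical stand-in record; the real file holds one dict per test image.
sample_records = [
    {"image_id": 1, "width": 640, "height": 480, "output_text": "example prediction"},
]


def load_predictions(path):
    """Load the pickled list of prediction dicts and check the expected fields."""
    with open(path, "rb") as f:
        preds = pickle.load(f)
    if not isinstance(preds, list):
        raise TypeError(f"expected a list, got {type(preds).__name__}")
    for i, record in enumerate(preds):
        missing = REQUIRED_FIELDS - record.keys()
        if missing:
            raise ValueError(f"record {i} is missing fields: {sorted(missing)}")
    return preds


# Write a stand-in file (hypothetical name), then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "predictions.pkl"
    path.write_bytes(pickle.dumps(sample_records))
    predictions = load_predictions(path)

for record in predictions:
    print(record["image_id"], record["width"], record["height"], record["output_text"])
```

Only unpickle files from a source you trust, since `pickle.load` can execute arbitrary code.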
Closing this for now, I don't think anyone needs this. Let me know otherwise.
Pull Request Summary #12
issues addressed
Major updates
Summary
dataclass → CLI args → final config.
What Works
Common errors and resolutions
If dataset loading fails with the error "Invalid pattern: '**' can only be an entire path component", refer to this:
https://stackoverflow.com/questions/77671277/valueerror-invalid-pattern-can-only-be-an-entire-path-component
AttributeError: 'Gemma3ModelOutputWithPast' object has no attribute 'loss': see unslothai/unsloth#2656
Not Yet Tested
Packages
Except Colab:
If you are on Colab, refer to this:
Commands used for training with unsloth on L4
In train.py, uncomment the unsloth imports, then run the command below:
Commands used for training without unsloth on L4
In train.py, comment out the unsloth imports, make sure FastModel = None is set, then run the command below:
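As an aside, the manual comment/uncomment step could be replaced by a guarded import. The sketch below is one possible approach and not what train.py currently does; the helper `load_fast_model` and the flag `USE_UNSLOTH` are hypothetical names. It falls back to `FastModel = None` whenever unsloth cannot be imported.

```python
import importlib


def load_fast_model():
    """Return unsloth's FastModel if it can be imported, otherwise None."""
    try:
        module = importlib.import_module("unsloth")
    except Exception:
        # ImportError when unsloth is not installed; the import may also fail
        # for other reasons (for example, on a machine without a supported GPU).
        return None
    return getattr(module, "FastModel", None)


FastModel = load_fast_model()
USE_UNSLOTH = FastModel is not None  # hypothetical flag for choosing the code path

print("Training with unsloth" if USE_UNSLOTH else "Training without unsloth")
```

With this pattern the same file runs in both setups, and the rest of the script can branch on `USE_UNSLOTH`.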
loss graphs
TODO