
Save a checkpoint when Stop Training is pressed, regardless of the Save Every N Epochs setting & training optimization #606

@MrGTAmodsgerman

Description

Is your feature request related to a problem? Please describe.
I have an old GPU (GTX 1080 Ti, 11 GB) plus 32 GB of RAM, and I tested how it performs when training a LoRA. It estimated 167 h 50 min for the training to finish. I didn't expect it to be this slow, so at some point I stopped the training before it had reached the first checkpoint save. As a result, I can't resume from what was already trained; that hour is lost.

Describe the solution you'd like
When I stop training, it should save a checkpoint. You never know when or why you will want to stop a training run, so it would be very helpful not to have to rely on guessing the right value for Save Every N Epochs.
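As a rough idea of what I mean, here is a minimal sketch of how this could look in a PyTorch-style training loop; the function and variable names (`train`, `ckpt_path`, the forward/loss call) are hypothetical placeholders, not the project's actual code:

```python
import torch

def train(model, optimizer, dataloader, epochs, ckpt_path="interrupt.ckpt"):
    # Sketch: always persist a checkpoint when training stops,
    # whether it finishes normally or is interrupted by the user.
    epoch = 0
    try:
        for epoch in range(epochs):
            for batch in dataloader:
                loss = model(batch)          # placeholder forward + loss
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
    except KeyboardInterrupt:
        # "Stop Training" (here simulated as Ctrl+C) lands in this branch;
        # the partial progress below is still saved.
        pass
    finally:
        torch.save(
            {
                "epoch": epoch,
                "model_state_dict": model.state_dict(),
                "optimizer_state_dict": optimizer.state_dict(),
            },
            ckpt_path,
        )
```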

Describe alternatives you've considered
Buying a new GPU

Additional context
I set 600 epochs for 70 instrumental song files.
I trained a Flux LoRA before, which took a night, so I think there is room for optimization.
