Why does `LightningModelForTrain.on_save_checkpoint` contain the statement `checkpoint.clear()`?

I don't think it prevents Lightning from creating its own checkpoint: `pl.Trainer.save_checkpoint` still invokes `DeepSpeedStrategy.save_checkpoint` to write a checkpoint file even if the `checkpoint` dict is cleared in the `on_save_checkpoint` callback.

Moreover, the resulting checkpoint cannot be loaded via `trainer.fit(ckpt_path=...)` because keys such as `pytorch-lightning_version` are missing, and they are missing precisely because of the `checkpoint.clear()` call in the `on_save_checkpoint` callback. :(
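To illustrate why clearing the dict empties the payload without cancelling the save, here is a minimal, dependency-free sketch of the control flow described above. The function names and keys are illustrative stand-ins, not Lightning's actual internals: the trainer assembles the checkpoint dict first, then hands it to the hook for in-place mutation, and the strategy writes whatever remains.

```python
# Hypothetical model of the save path (names are illustrative, not
# Lightning's real internals).

def on_save_checkpoint(checkpoint: dict) -> None:
    # The statement in question: wipes every key, including
    # 'pytorch-lightning_version', so the saved file cannot be
    # restored via trainer.fit(ckpt_path=...).
    checkpoint.clear()

def build_checkpoint(state_dict: dict) -> dict:
    # The trainer first assembles the full checkpoint payload...
    checkpoint = {
        "pytorch-lightning_version": "2.x",  # needed to resume training
        "state_dict": state_dict,
        "epoch": 0,
        "global_step": 0,
    }
    # ...then calls the user hook, which mutates the dict in place.
    on_save_checkpoint(checkpoint)
    # The strategy still writes whatever is left: clearing the dict
    # does NOT cancel the save, it only empties the payload.
    return checkpoint

saved = build_checkpoint({"layer.weight": [0.0]})
print(saved)  # {} -- a file would still be written, with no loadable keys
```

The key point is that the hook receives the dict by reference after it has been built, so mutation can remove keys but cannot stop the strategy from persisting the (now empty) dict.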