Optimizer state for pre-trained model #23638
Unanswered
VinaySingh561 asked this question in Q&A
Replies: 3 comments, 1 reply
-
I added some comments on your code. I don't suggest you reset the optimizer state with opt_state = opt.init(params) while you are continuing training, because that reinitializes the optimizer state. You would only want to do that when starting a completely new training run or switching to a new task.

import optax
import pickle

# Here I see your optimizer:
learning_rate_schedule = optax.exponential_decay(
    init_value=1e-2,       # initial learning rate
    transition_steps=900,  # how often to decay
    decay_rate=0.9,        # the decay rate
    staircase=True         # if True, decay happens at discrete intervals
)

opt = optax.chain(
    optax.scale_by_adam(),
    optax.scale_by_schedule(learning_rate_schedule),
    optax.scale(-1.0)      # gradient descent
)

# opt_state = opt.init(params)  <-- Why do you want to reset the optimizer state?

# Here you load the parameters and optimizer state from the checkpoint
with open('best_energy_model_0.9978.pkl', 'rb') as f:
    checkpoint = pickle.load(f)

# Load saved params and opt_state
params = checkpoint['params']
opt_state = checkpoint['opt_state']  # This restores your optimizer's state

# Now you can continue training without resetting the optimizer
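To make the "continue training" part concrete, here is a minimal sketch of one update step using the restored params and opt_state; loss_fn, batch, and the loop are placeholders I am assuming, not part of the original code:

```python
import jax
import optax

# Sketch only: loss_fn(params, batch) stands in for the existing energy + force loss.
@jax.jit
def train_step(params, opt_state, batch):
    loss, grads = jax.value_and_grad(loss_fn)(params, batch)
    updates, opt_state = opt.update(grads, opt_state, params)
    params = optax.apply_updates(params, updates)
    return params, opt_state, loss

# Resume training directly from the checkpointed params/opt_state:
# for batch in batches:
#     params, opt_state, loss = train_step(params, opt_state, batch)
```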
-
Thanks for your descriptive answer. Please clarify just one more point.
Even though I am changing the loss function (the initial loss function is (energy_loss + force_loss) and the updated loss function is (energy_loss + 100*force_loss)), I can still use the opt_state saved from the last training, right?
Thanks for your time.
-
Ok, thanks for the clarification.
On Sat, 14 Sept 2024, Howard Cho wrote:
Yes, you can reuse the opt_state from your last training, even though you're tweaking the loss function. The optimizer state (opt_state) basically keeps track of things like momentum, learning rates, and other internal quantities derived from the parameters of your model.
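For concreteness, the re-weighted loss could look like the sketch below; predict_energy and predict_forces are hypothetical stand-ins for the e3nn model calls, and the batch keys are assumptions. The update step and the restored opt_state stay the same; only the loss function changes:

```python
import jax.numpy as jnp

FORCE_WEIGHT = 100.0  # increased weight on the force term

def weighted_loss(params, batch):
    # predict_energy / predict_forces are hypothetical placeholders for the model.
    energy_loss = jnp.mean((predict_energy(params, batch) - batch['energy']) ** 2)
    force_loss = jnp.mean((predict_forces(params, batch) - batch['forces']) ** 2)
    return energy_loss + FORCE_WEIGHT * force_loss
```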
-
Hi,
I am running the e3nn model in JAX, and I have trained it for 100 epochs with a loss function that includes both energy and force. Currently, I am achieving a good R² score for energy. Now, I want to increase the weightage of the force loss in the training process. Therefore, I would like to use the parameters from my last trained model. However, I am unsure about how to initialize the optimizer state.
I have already saved the optimizer state and parameters from the previous model. Should I use the opt_state from the trained model or initialize it using opt.init(params)? I have attached the code below for your convenience.
Thank you for your time and consideration.
```python
import optax
import pickle

learning_rate_schedule = optax.exponential_decay(
    init_value=1e-2,       # initial learning rate
    transition_steps=900,  # how often to decay
    decay_rate=0.9,        # the decay rate
    staircase=True         # if True, decay happens at discrete intervals
)

opt = optax.chain(
    optax.scale_by_adam(),
    optax.scale_by_schedule(learning_rate_schedule),
    optax.scale(-1.0)      # multiply by -1.0 to perform gradient descent
)

# params is assumed to come from the model initialization elsewhere in the script
opt_state = opt.init(params)

with open('best_energy_model_0.9978.pkl', 'rb') as f:
    checkpoint = pickle.load(f)

params = checkpoint['params']
opt_state = checkpoint['opt_state']
```
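For reference, a checkpoint like the one loaded above could have been written with plain pickle along these lines (a sketch assuming the same dictionary keys; not necessarily the code actually used to produce the file):

```python
import pickle

# Save params and opt_state together so training can resume later.
checkpoint = {'params': params, 'opt_state': opt_state}
with open('best_energy_model_0.9978.pkl', 'wb') as f:
    pickle.dump(checkpoint, f)
```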