Optimizer state for pre-trained model #23638
Unanswered
VinaySingh561 asked this question in Q&A
Replies: 3 comments, 1 reply
-
I added some comments on your code. I don't suggest you reset the optimizer state with opt_state = opt.init(params) while you are continuing training, because that reinitializes the optimizer state. You would only want to do that when starting a completely new training run or switching to a new task.

import optax
import pickle

# Here I see your optimizer:
learning_rate_schedule = optax.exponential_decay(
    init_value=1e-2,       # initial learning rate
    transition_steps=900,  # how often to decay
    decay_rate=0.9,        # the decay rate
    staircase=True         # if True, decay happens at discrete intervals
)

opt = optax.chain(
    optax.scale_by_adam(),
    optax.scale_by_schedule(learning_rate_schedule),
    optax.scale(-1.0)      # gradient descent
)

# opt_state = opt.init(params)  <-- Why do you want to reset the optimizer state?

# Here you load the parameters and optimizer state from the checkpoint
with open('best_energy_model_0.9978.pkl', 'rb') as f:
    checkpoint = pickle.load(f)

# Load saved params and opt_state
params = checkpoint['params']
opt_state = checkpoint['opt_state']  # This restores your optimizer's state

# Now you can continue training without resetting the optimizer
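To make the "continue training" part concrete, here is a minimal sketch of one update step using the restored params and opt_state; loss_fn, batch, and the loop are placeholders I am assuming, not part of the original code:

```python
import jax
import optax

# Sketch only: loss_fn(params, batch) stands in for the existing energy + force loss.
@jax.jit
def train_step(params, opt_state, batch):
    loss, grads = jax.value_and_grad(loss_fn)(params, batch)
    updates, opt_state = opt.update(grads, opt_state, params)
    params = optax.apply_updates(params, updates)
    return params, opt_state, loss

# Resume training directly from the checkpointed params/opt_state:
# for batch in batches:
#     params, opt_state, loss = train_step(params, opt_state, batch)
```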
-
Thanks for your descriptive answer. Please clarify just one more point.
Even though I am changing the loss function (the initial loss function is (energy_loss + force_loss) and the updated loss function is (energy_loss + 100*force_loss)), I can still use the opt_state saved from the last training, right?
Thanks for your time.
-
Ok, thanks for the clarification.
On Sat, 14 Sept 2024, Howard Cho wrote:
Yes, you can reuse the opt_state from your last training, even though you're tweaking the loss function. The optimizer state (opt_state) basically keeps track of things like momentum, learning rates, and other internal quantities derived from the parameters of your model.
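For concreteness, the re-weighted loss could look like the sketch below; predict_energy and predict_forces are hypothetical stand-ins for the e3nn model calls, and the batch keys are assumptions. The update step and the restored opt_state stay the same; only the loss function changes:

```python
import jax.numpy as jnp

FORCE_WEIGHT = 100.0  # increased weight on the force term

def weighted_loss(params, batch):
    # predict_energy / predict_forces are hypothetical placeholders for the model.
    energy_loss = jnp.mean((predict_energy(params, batch) - batch['energy']) ** 2)
    force_loss = jnp.mean((predict_forces(params, batch) - batch['forces']) ** 2)
    return energy_loss + FORCE_WEIGHT * force_loss
```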
-
Hi,
I am running the e3nn model in JAX, and I have trained it for 100 epochs with a loss function that includes both energy and force. Currently, I am achieving a good R² score for energy. Now, I want to increase the weightage of the force loss in the training process. Therefore, I would like to use the parameters from my last trained model. However, I am unsure about how to initialize the optimizer state.
I have already saved the optimizer state and parameters from the previous model. Should I use the opt_state from the trained model or initialize it using opt.init(params)? I have attached the code below for your convenience.
Thank you for your time and consideration.
```python
import optax
import pickle

learning_rate_schedule = optax.exponential_decay(
    init_value=1e-2,       # initial learning rate
    transition_steps=900,  # how often to decay
    decay_rate=0.9,        # the decay rate
    staircase=True         # if True, decay happens at discrete intervals
)

opt = optax.chain(
    optax.scale_by_adam(),
    optax.scale_by_schedule(learning_rate_schedule),
    optax.scale(-1.0)      # multiply by -1.0 to perform gradient descent
)

# params is assumed to come from the model initialization elsewhere in the script
opt_state = opt.init(params)

with open('best_energy_model_0.9978.pkl', 'rb') as f:
    checkpoint = pickle.load(f)

params = checkpoint['params']
opt_state = checkpoint['opt_state']
```
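For reference, a checkpoint like the one loaded above could have been written with plain pickle along these lines (a sketch assuming the same dictionary keys; not necessarily the code actually used to produce the file):

```python
import pickle

# Save params and opt_state together so training can resume later.
checkpoint = {'params': params, 'opt_state': opt_state}
with open('best_energy_model_0.9978.pkl', 'wb') as f:
    pickle.dump(checkpoint, f)
```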