Apply Gradient Not Behaving Appropriately #1030
Unanswered
vasilavramov asked this question in Q&A
Replies: 1 comment 1 reply
-
Hi @vasilavramov, I think you'll have to ask your question more concretely and/or share code to get more help. My first guess is that you're doing something like …
-
Hi, I am currently trying to train a neural network through reinforcement learning, and I am struggling to update the network's parameters once an epoch of training is complete. The gradients are computed as follows:
```python
def compute_gradients_EV(optimizer, input, params):
    def EV_short(params):
        return policy_EV(input, params)

    policy_fit, grad = jax.value_and_grad(EV_short, has_aux=False)(params)
    return policy_fit, grad
```

Applying the resulting gradients then raises:

```
TypeError: unsupported operand type(s) for *: 'float' and 'FrozenDict'
```
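For comparison, here is a minimal sketch of the update pattern the legacy `flax.optim` API expects, assuming `policy_EV(input, params)` returns a scalar loss and that the optimizer was created with something like `optim.Adam(...).create(params)`. Differentiating with respect to `optimizer.target` keeps the gradient pytree in lockstep with the parameters, and note that the legacy method is spelled `apply_gradient` (singular):

```python
import jax
from flax import optim  # legacy flax.optim API (deprecated in later Flax)

def train_step(optimizer, batch_input):
    # Differentiate w.r.t. optimizer.target so the gradient pytree has
    # exactly the same structure as the parameters being updated.
    def loss_fn(params):
        return policy_EV(batch_input, params)  # assumed to return a scalar

    loss, grad = jax.value_and_grad(loss_fn, has_aux=False)(optimizer.target)
    # Legacy flax.optim spells the update method apply_gradient (singular).
    optimizer = optimizer.apply_gradient(grad)
    return optimizer, loss
```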
The error makes no sense to me, given that the documentation for the optim class says apply_gradients should work with a pytree of gradients. I have also tried a different approach, similar to the one shown in the examples, where jax.value_and_grad is called with the model and optimizer.target is used to differentiate. In that case, however, I get an error saying that the model I pass to the differentiated function 'is not a valid Jax type'.
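The 'is not a valid Jax type' error is typically what appears when a Module object itself, rather than a pytree of arrays, is handed to a JAX transformation. A hypothetical sketch, assuming a `flax.linen` model (the `Policy` class, `model`, `batch_input`, and `params` here are stand-ins, not the asker's actual code):

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

class Policy(nn.Module):  # hypothetical stand-in for the asker's network
    @nn.compact
    def __call__(self, x):
        # Scalar output so value_and_grad has a well-defined loss.
        return jnp.mean(nn.Dense(1)(x))

model = Policy()
batch_input = jnp.ones((4, 3))
params = model.init(jax.random.PRNGKey(0), batch_input)['params']

# Fails: the Module instance itself 'is not a valid Jax type'.
# jax.value_and_grad(lambda m: m(batch_input))(model)

# Works: close over the module and differentiate the params FrozenDict.
loss, grad = jax.value_and_grad(
    lambda p: model.apply({'params': p}, batch_input)
)(params)
```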
From what I understand, the gradients themselves are computed successfully; it is applying them through optimizer.apply_gradients() that fails, even though that should work. Any help would be much appreciated.
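One way to debug this, assuming the `optimizer` and `grad` from the snippet above: a 'float' * 'FrozenDict' failure usually means the gradient pytree does not mirror `optimizer.target`, so the learning rate ends up multiplying an entire FrozenDict instead of an array leaf. Printing both tree structures makes any mismatch visible:

```python
import jax

# If these two structures differ (e.g. grad carries an extra FrozenDict
# nesting level), the optimizer's update rule multiplies the learning
# rate into a FrozenDict leaf and raises exactly this TypeError.
print(jax.tree_util.tree_structure(optimizer.target))
print(jax.tree_util.tree_structure(grad))
```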