Are Ziebart's thesis, equation 9.2 and find_policy() function the same? #15

tessavdheiden · 2022-06-07T10:20:58Z

Hi Matthew!

This repo is just great: it works, and it's transparent and modular!

I only found two differences between Ziebart's thesis and your implementation.
Can you let me know if you were aware of them?

So here is Eq 9.2:

Here is your code:

And here is Eq 9.1:

Which uses $V^{\text{soft}}$:

And here is your code:

You include a discount factor in Eq 9.2, and in 9.1 you convert a subtraction ($Q^{\text{soft}}-V^{\text{soft}}$) into a fraction ($\frac{Q^{\text{soft}}}{V^{\text{soft}}}$), correct?
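To make the second point concrete, here is a minimal sketch of why the two forms can coincide: if the code stores exponentiated values, then $\exp(Q^{\text{soft}} - V^{\text{soft}}) = \frac{\exp(Q^{\text{soft}})}{\exp(V^{\text{soft}})}$. This is not the repository's `find_policy()`; the function `soft_value_iteration`, the array shapes, and the recursion $Q = r + \gamma\,\mathbb{E}[V]$, $V = \log\sum_a \exp Q$ are my assumptions for illustration, with `gamma` standing in for the discount factor mentioned above.

```python
import numpy as np

def soft_value_iteration(P, r, gamma=0.9, n_iter=200):
    """Hypothetical soft value iteration (assumed form, not the repo's code).

    P: (n_actions, n_states, n_states) transition probabilities P[a, s, s'].
    r: (n_states,) state rewards.
    """
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    for _ in range(n_iter):
        # Q[s, a] = r[s] + gamma * sum_s' P[a, s, s'] * V[s']
        Q = r[:, None] + gamma * np.einsum("ast,t->sa", P, V)
        # V[s] = log sum_a exp(Q[s, a]), computed stably by shifting by the max
        m = Q.max(axis=1)
        V = m + np.log(np.exp(Q - m[:, None]).sum(axis=1))
    return Q, V

# Small random MDP, purely for illustration
rng = np.random.default_rng(0)
P = rng.random((3, 5, 5))
P /= P.sum(axis=2, keepdims=True)  # normalise each P[a, s, :] to sum to 1
r = rng.normal(size=5)

Q, V = soft_value_iteration(P, r)

# Subtraction form: pi(a|s) = exp(Q - V)
policy_diff = np.exp(Q - V[:, None])
# Fraction form in exponentiated space: exp(Q) / exp(V)
policy_ratio = np.exp(Q) / np.exp(V)[:, None]
```

If the implementation really works with exponentiated quantities, the fraction would be the same policy as the subtraction in Eq 9.1, and only the discount factor would be a genuine difference.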
