Commit 94724a9

Update README.md
1 parent 31c657a commit 94724a9

1 file changed: +13 -44 lines changed

README.md

Lines changed: 13 additions & 44 deletions
@@ -74,50 +74,19 @@ For example (DDPG):
 
 ## Algorithms
 
-- [x] [DQN](https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf)
-
-![DQN](DQN/DQNAgent_200.gif)
-
-- [x] [DDQN](https://arxiv.org/pdf/1509.06461.pdf)
-
-![DDQN](DDQN/DDQNAgent_100.gif)
-
-- [x] [DDPG](https://arxiv.org/pdf/1509.02971.pdf)
-
-![DDPG](DDPG/DDPGAgent_200.gif)
-
-- [x] [PPO](https://arxiv.org/pdf/1707.06347.pdf)
-
-![PPO](PPO/PPOAgent_200.gif)
-
-- [x] [Distributed Q learning (C51)](https://arxiv.org/pdf/1707.06887.pdf)
-
-![C51](C51/C51Agent_100.gif)
-
-- [x] [AWR](https://openreview.net/attachment?id=H1gdF34FvS&name=original_pdf)
-
-![AWR](AWR/AWRAgent_200.gif)
-
-- [x] [AC](https://proceedings.neurips.cc/paper/1999/file/6449f44a102fde848669bdd9eb6b76fa-Paper.pdf)
-
-![AC](AC/A2CAgent_600.gif)
-
-- [x] [TD3](https://arxiv.org/pdf/1802.09477.pdf)
-
-![TD3](TD3/TD3Agent_100.gif)
-
-- improve `AWR`, `DDPG` `TD3` with Gumbel Distribution Regression from [`XQL`](https://div99.github.io/XQL):
-- XAWR
-
-![XAWR](XAWR/XAWRAgent_100.gif)
-
-- XDDPG
-
-![XDDPG](XDDPG/XDDPGAgent_200.gif)
-
-- XTD3
-
-![XTD3](XTD3/XTD3Agent_100.gif)
+| model | paper link | After Training |
+| :---: | :----------------------------------------------------------------------------------: | :--------------------------------: |
+| DQN | https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf | ![DQN](DQN/DQNAgent_200.gif) |
+| DDQN | https://arxiv.org/pdf/1509.06461.pdf | ![DDQN](DDQN/DDQNAgent_100.gif) |
+| DDPG | https://arxiv.org/pdf/1509.02971.pdf | ![DDPG](DDPG/DDPGAgent_200.gif) |
+| PPO | https://arxiv.org/pdf/1707.06347.pdf | ![PPO](PPO/PPOAgent_200.gif) |
+| C51 | https://arxiv.org/pdf/1707.06887.pdf | ![C51](C51/C51Agent_100.gif) |
+| AWR | https://openreview.net/attachment?id=H1gdF34FvS | ![AWR](AWR/AWRAgent_200.gif) |
+| AC | https://proceedings.neurips.cc/paper/1999/file | ![AC](AC/A2CAgent_600.gif) |
+| TD3 | https://arxiv.org/pdf/1802.09477.pdf | ![TD3](TD3/TD3Agent_100.gif) |
+| XAWR | Improved with Gumbel Distribution Regression from [XQL](https://div99.github.io/XQL) | ![XAWR](XAWR/XAWRAgent_100.gif) |
+| XDDPG | Improved with Gumbel Distribution Regression from [XQL](https://div99.github.io/XQL) | ![XDDPG](XDDPG/XDDPGAgent_200.gif) |
+| XTD3 | Improved with Gumbel Distribution Regression from [XQL](https://div99.github.io/XQL) | ![XTD3](XTD3/XTD3Agent_100.gif) |
 
 ## Reference
