Here are the benchmarked results. It looks like DQN in this PR gets better performance in Atari games and slightly worse results in …
Given these results, I recommend we merge this PR. It obtains better overall performance and removes an unverified code-level optimization: gradient norm clipping for DQN.
@dosssman and @yooceii, do the results from this PR make sense to you? If they do, I will update the docs and ultimately remove the old experiments. I think after this we will be ready for the 1.0 release.
CC @araffin, who might be interested in this :)
Results: Atari games, Classic control, MuJoCo
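For readers unfamiliar with the optimization being removed, here is a minimal pure-Python sketch of what global gradient norm clipping does. It is not the code from this PR: the function name `clip_grad_norm`, the flat list of floats, and the `eps` default are illustrative assumptions that only mimic the general behavior of `nn.utils.clip_grad_norm_`.

```python
import math


def clip_grad_norm(grads, max_norm, eps=1e-6):
    """Rescale gradients so their global L2 norm is at most max_norm.

    Hypothetical helper for illustration only. `grads` is a flat list of
    floats standing in for all parameter gradients of a network.
    Returns (clipped_grads, total_norm_before_clipping).
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    # The scale factor is capped at 1.0, so gradients that are already
    # small enough pass through unchanged.
    scale = min(1.0, max_norm / (total_norm + eps))
    return [g * scale for g in grads], total_norm


# A gradient with norm 5.0 is shrunk to (just under) the 0.5 limit.
clipped, norm_before = clip_grad_norm([3.0, 4.0], max_norm=0.5)

# A gradient already below the limit is left as-is.
untouched, _ = clip_grad_norm([0.1, 0.2], max_norm=10.0)
```

Dropping the clipping call means the optimizer sees the raw gradients, which is why the change needed to be checked against benchmarks rather than assumed harmless.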
|
dosssman left a comment
Looking good on my side.
Thanks for the great work.
|
Merging now. |
* Fix the seed issue: see vwxyzjn#171
* Quick fix
* log `episodic_length`
* Fix vwxyzjn#172
* Fix vwxyzjn#148 and vwxyzjn#172-style problem for SAC
* Add benchmark scripts
* add sac script
* Removes gradient clipping reference
* use the latest reproduction script
* Remove past reproducibility script
* update documentation
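As a rough illustration of the `episodic_length` logging mentioned in the commits above, the sketch below tracks both the return and the length of each episode. It is an assumption-laden example, not the scripts' actual logging code: the `track_episodes` name and the `(reward, done)` input format are hypothetical.

```python
def track_episodes(transitions):
    """Yield (episodic_return, episodic_length) each time an episode ends.

    `transitions` is an iterable of (reward, done) pairs. This input format
    is assumed for illustration and is not taken from the PR.
    """
    episodic_return, episodic_length = 0.0, 0
    for reward, done in transitions:
        episodic_return += reward
        episodic_length += 1
        if done:
            yield episodic_return, episodic_length
            # Reset the counters for the next episode.
            episodic_return, episodic_length = 0.0, 0


# Two episodes: one of length 2 and one of length 1.
stats = list(track_episodes([(1.0, False), (1.0, True), (0.5, True)]))
for ep_return, ep_length in stats:
    print(f"episodic_return={ep_return}, episodic_length={ep_length}")
```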

Description
This PR closes #171, closes #172, closes #168, and closes #148.
* `dqn.py` and others #171
* `qf2` #172
* `episodic_length` for non-PPO scripts. #168
* `nn.utils.clip_grad_norm_` for DQN, DDPG, and TD3 #148
Types of changes
Checklist:
* `pre-commit run --all-files` passes (required).
* `mkdocs serve`.
If you are adding new algorithms or your change could result in performance difference, you may need to (re-)run tracked experiments. See #137 as an example PR.
* `--capture-video` flag toggled on (required).
* `mkdocs serve`.
* `width=500` and `height=300`).