From e3c503128af52aee5a6070bcf89f1fd307bb867e Mon Sep 17 00:00:00 2001 From: leeyngdo Date: Mon, 24 Feb 2025 09:46:19 +0000 Subject: [PATCH 1/3] Update Project Page --- docs/dataset/.DS_Store | Bin 0 -> 6148 bytes docs/dataset/css/bulma.min.css | 20 +++++++++++ docs/dataset/css/style.css | 12 ++++++- docs/dataset/index.html | 56 ++++++++++++++++-------------- docs/dataset/js/bulma-carousel.js | 1 + docs/dataset/videos/.DS_Store | Bin 0 -> 6148 bytes docs/index.html | 39 +++++++++++++-------- docs/style.css | 3 +- docs/thumbnails/0.png | Bin 83701 -> 339030 bytes docs/thumbnails/1.png | Bin 52050 -> 502495 bytes docs/thumbnails/2.png | Bin 38813 -> 329864 bytes docs/thumbnails/3.png | Bin 39898 -> 325698 bytes docs/thumbnails/4.png | Bin 36814 -> 334990 bytes docs/thumbnails/5.png | Bin 79338 -> 421877 bytes docs/thumbnails/6.png | Bin 42188 -> 377622 bytes docs/thumbnails/7.png | Bin 48470 -> 416325 bytes docs/thumbnails/8.png | Bin 47957 -> 0 bytes docs/thumbnails/9.png | Bin 42032 -> 0 bytes docs/videos/.DS_Store | Bin 0 -> 6148 bytes 19 files changed, 87 insertions(+), 44 deletions(-) create mode 100644 docs/dataset/.DS_Store create mode 100644 docs/dataset/videos/.DS_Store delete mode 100644 docs/thumbnails/8.png delete mode 100644 docs/thumbnails/9.png create mode 100644 docs/videos/.DS_Store diff --git a/docs/dataset/.DS_Store b/docs/dataset/.DS_Store new file mode 100644 index 0000000000000000000000000000000000000000..f17e7db8c0ca56aa1562240a8e45872e74813ce8 GIT binary patch literal 6148 zcmeHKyGjE=6uq+vvZ6w;vYa2Vh@IUSR=2RaOw%?|i-=-4W3-Op2je_@%_1M!1S(#m z-5uoFxK<~D)m4DsU4sTRp~>hpe1E$~i%ga$Nq?005sBaKx3;gI-VefT`b&0v7Ora0 zMiE_9*wpdtJ05cFJ>1;xKU*%_ljRGhYOchidd#y%mvl}k4Johi&F(&``H=VIbNX`q zXn9L(g#62DO84<<_&&r$0Z~8{STzN(XS2l{4rLStL;+D?t^j`@G@Q|REF9WT2L^it z05%b}hB==lxJP?59t(%az?5qRx>n_n7|OLHu6>>HSU7a;q%1Sm@yyDfP?TjyT-$O| z#-WU&fGAK_py4iCy#F6AKmV7THome
- Code + Code
Dataset @@ -51,11 +51,18 @@

Dataset

- We open-source all data corresponding to the 80-task and 30-task datasets used in our multi-task experiments. They can be downloaded below. The two datasets contain 545M and 345M transitions, respectively. + We're open-sourcing $57$ single-task datasets collected from the replay buffers of SimbaV2 agents, available for download below. We hope this release encourages other research groups to share their datasets and checkpoints, driving collaboration and progress.

+
+ Click to Download:
+

@@ -63,7 +70,7 @@

Dataset

Overview

- We release two multi-task datasets with data from 80 and 30 tasks, respectively. The datasets are summarized below. + We release a total of $57$ single-task expert datasets with transition data from the replay buffers of single-task SimbaV2 agents. The domains consist of MuJoCo (5), DMControl (28), MyoSuite (10), and HumanoidBench (14), encompassing a variety of locomotion and manipulation tasks with varying levels of complexity. The dataset details are summarized below.

@@ -89,19 +96,19 @@

Overview

5 @@ -116,19 +123,19 @@

Overview

28 @@ -149,13 +156,13 @@

Overview

690k @@ -176,29 +183,24 @@

Overview

690k
- 4 + 5 690k - 345M + 171M - 20GB + 11.4GB - +
- 11 + 13 690k - 345M + 171M 20GB - +
- 345M + 171M - 20GB + 14.3GB - +
- 345M + 171M - 20GB + 11.1GB - +
-
-

- These datasets are obtained from the replay buffers of 240 single-task SimbaV2 agents, and thus contain a wide variety of behaviors ranging from random to expert policies. Multi-task agents trained on the above datasets, as well as checkpoints for each of the single-task agents, are available here. -

-

Tasks

- We list all 57 tasks in the dataset below, along with SimbaV2 policy visualizations and summary statistics. + We list all $57$ tasks in the dataset below, along with SimbaV2 policy visualizations and summary statistics.

@@ -248,7 +250,7 @@

Tasks

- +
@@ -552,7 +554,7 @@

Tasks

- +
@@ -1098,7 +1100,7 @@

Tasks

- +
@@ -1384,7 +1386,7 @@

Tasks

- +
diff --git a/docs/dataset/js/bulma-carousel.js b/docs/dataset/js/bulma-carousel.js index 28bb55b..7456808 100644 --- a/docs/dataset/js/bulma-carousel.js +++ b/docs/dataset/js/bulma-carousel.js @@ -1036,6 +1036,7 @@ var defaultOptions = { autoplaySpeed: 3000 }; + var Autoplay = function (_EventEmitter) { _inherits(Autoplay, _EventEmitter); diff --git a/docs/dataset/videos/.DS_Store b/docs/dataset/videos/.DS_Store new file mode 100644 index 0000000000000000000000000000000000000000..9705024063dc8578edc36f6c4a326d9f488ef62e GIT binary patch literal 6148 zcmeHK%SyvQ6uo1s1Vt!xlo%3=h<_%Vk6wFc#dkL zl_q^1t>>yGAPR^A|4jjYcMH^}gj&5b_x-J$%rZ$D$IV{aL?jrFC&R(h_`uD!f64Z{ z?yOU^(HeEBNgcYR7InQ<_okDdd3Wvl+ui&9NU!b9k)$&Qi)uU``^SbyLIb*>6g*Oo zM`d+a^KkMmkJ0nd>%$ihuV%($QH{r=vdugKsDBFzR0vK7$3?c%P9|0|c45GlFD)0$Q0FhS! literal 0 HcmV?d00001 diff --git a/docs/index.html b/docs/index.html index 2f70302..0f1311c 100644 --- a/docs/index.html +++ b/docs/index.html @@ -3,6 +3,7 @@ + @@ -31,7 +32,7 @@

Under Review

- Hojoon Lee1, 2$\dagger$,  + Hojoon Lee1$\dagger$,  Youngdo Lee1$\dagger$,  Takuma Seno2,  Donghu Kim1
@@ -52,7 +53,7 @@

@@ -69,7 +70,7 @@

TL;DR

- Stop worrying about algorithms, just change the network architecture to SimbaV2 + Stop worrying about algorithms, just change the network architecture to SimbaV2

@@ -151,7 +152,7 @@

Scaling Network Size & UTD Ratio

Empirical Analysis: Training Stability

- We track $4$ metrics during training to understand the learning dynamics of SimbaV2 and Simba: + We track average return and $4$ metrics during training to understand the learning dynamics of SimbaV2 and Simba:

  • (a) Average normalized return across tasks

  • (b) Weighted sum of $\ell_2$-norms of all intermediate features in critics

  • @@ -281,7 +282,7 @@

    SimbaV2 with Online RL

    Paper

    SimbaV2: Hyperspherical Normalization for Scalable Deep Reinforcement Learning
    Hojoon Lee*, Youngdo Lee*, Takuma Seno, Donghu Kim, Peter Stone, Jaegul Choo

    - arXiv preprint

    + arXiv preprint

    @@ -290,21 +291,31 @@

    Paper

    -
    -
-
-

Citation

+
+

Citation

- If you find our work useful, please consider citing the paper as follows: + If you find our work useful, please consider citing the paper as follows:

-
-
-
+
@article{lee2025simbav2,
+         title={Hyperspherical Normalization for Scalable Deep Reinforcement Learning}, 
+         author={Hojoon Lee and Youngdo Lee and Takuma Seno and Donghu Kim and Peter Stone and Jaegul Choo},
+         journal={arXiv preprint arXiv:2502.15280},
+         year={2025},
+}
+
+