Commit

update docs
tanganke committed Oct 30, 2024
1 parent 9902502 commit cd61266
Showing 6 changed files with 93 additions and 4 deletions.
3 changes: 1 addition & 2 deletions .vscode/.gitignore
@@ -1,2 +1 @@
*
!*.template
*.json
58 changes: 58 additions & 0 deletions .vscode/README.md
@@ -0,0 +1,58 @@
# .vscode Folder

This folder contains configuration files for Visual Studio Code to enhance the development experience for the `fusion_bench` project.

## Files

### settings.json.template

This file includes settings for Python testing, search exclusions, and file exclusions.

- **Python Testing**: Configures `unittest` as the testing framework and specifies the test discovery pattern.
- **Search Exclusions**: Excludes certain directories and files from search results.
- **File Exclusions**: Excludes certain directories and files from the file explorer.
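
For illustration, a `settings.json` produced from the template might look roughly like the sketch below. The concrete keys and excluded paths depend on the template itself; the exclusion patterns here are placeholders, not the template's actual list.

```json
{
    "python.testing.unittestEnabled": true,
    "python.testing.unittestArgs": ["-v", "-s", "./tests", "-p", "test_*.py"],
    "search.exclude": {
        "**/outputs": true,
        "**/__pycache__": true
    },
    "files.exclude": {
        "**/__pycache__": true
    }
}
```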

### launch.json.template

This file includes configurations for debugging the `fusion_bench.scripts.cli` module, i.e., the `fusion_bench` command-line interface (CLI).

- **Debug Configuration**: Sets up the `debugpy` debugger to launch the `fusion_bench.scripts.cli` module with specific arguments and environment variables.

## Usage

1. **Copy Templates**: Copy `settings.json.template` and `launch.json.template` to `settings.json` and `launch.json`, respectively.

```shell
cd .vscode

cp settings.json.template settings.json
cp launch.json.template launch.json
```

2. **Customize**: Modify the copied files as needed to fit your development environment.

For example, you may want to add new debugging configurations for custom experiments.

```json
{
    "configurations": [
        {
            "name": "Custom Experiment",
            "type": "debugpy",
            "request": "launch",
            "module": "fusion_bench.scripts.cli",
            "args": [
                "--config-name=custom_experiment",
                "method=method_name",
                "method.option_1=value_1"
            ],
            "env": {
                "HYDRA_FULL_ERROR": "1",
                "CUSTOM_ENV_VAR": "value"
            }
        }
    ]
}
```

3. **Open in VS Code**: Open the project in Visual Studio Code to utilize the configurations.
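
If the `code` command-line launcher is installed, this can be done from a terminal (the path below is a placeholder for your local checkout):

```shell
cd /path/to/fusion_bench
code .
```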
Binary file added docs/algorithms/images/ewemoe.png
Binary file added docs/algorithms/images/ewemoe_1.png
Binary file added docs/algorithms/images/ewemoe_2.png
36 changes: 34 additions & 2 deletions docs/algorithms/weight_ensembling_moe.md
@@ -1,5 +1,8 @@
# Weight-Ensembling Mixture of Experts (Data-Adaptive Model Merging)

[![arxiv](https://img.shields.io/badge/arXiv-2402.00433-b31b1b.svg)](http://arxiv.org/abs/2402.00433)
[![github](https://img.shields.io/badge/GitHub-Code-181717.svg)](https://github.com/tanganke/weight-ensembling_MoE)

<figure markdown="span">
![alt text](images/wemoe.png){ width="90%" }
<figcaption style="max-width:90%">
@@ -29,6 +32,34 @@ These task vectors are then added to the pre-trained MLP weights to create input
| AdaMerging | No | No | Yes |
| Ours | No | No | Yes |

## WEMoE V2: E-WEMoE

*L. Shen, A. Tang, E. Yang et al. Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging. Oct 2024.*[^3]

[![arXiv](https://img.shields.io/badge/arXiv-2410.21804-b31b1b.svg)](http://arxiv.org/abs/2410.21804)
[![github](https://img.shields.io/badge/GitHub-Code-181717.svg)](https://github.com/EnnengYang/Efficient-WEMoE)

<figure markdown="span">
![alt text](images/ewemoe.png){ width="90%" }
<figcaption>
(a) **Overview of the Efficient Weight-Ensembling Mixture of Experts (E-WEMoE)** Framework. It merges all non-MLP modules through task arithmetic and upgrades the MLP modules into an efficient E-WEMoE module. (b) **E-WEMoE** Module. The module includes a router shared across all Transformer blocks, the pre-trained MLP module, and a set of sparse task-specific vectors w.r.t. MLP modules.
</figcaption>
</figure>

<figure markdown="span">
![alt text](images/ewemoe_1.png){ width="700px" }
<figcaption>
Comparison of (a) trainable parameters and (b) total parameters between WEMoE and E-WEMoE-90%.
</figcaption>
</figure>

<figure markdown="span">
![alt text](images/ewemoe_2.png){ width="1000px" }
<figcaption>
Comparison of the relationship between parameter count and performance across various model merging methods.
</figcaption>
</figure>
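
For readers who prefer code to diagrams, below is a minimal, hypothetical PyTorch-style sketch of the E-WEMoE idea, not the fusion_bench implementation: the pre-trained MLP weight is frozen, task vectors are magnitude-pruned to a target sparsity, and a router shared across Transformer blocks produces per-sample mixing weights that are folded into the MLP weight at inference time. All names, shapes, and the pruning rule are illustrative assumptions.

```python
import torch
import torch.nn as nn


class EWEMoELinear(nn.Module):
    """Illustrative E-WEMoE-style layer (hypothetical, simplified to one linear layer).

    - `pretrained`: the frozen linear layer of the pre-trained MLP.
    - `task_vectors`: per-task weight deltas (fine-tuned weight minus pre-trained weight).
    - `shared_router`: a module shared across all blocks, mapping features to task logits.
    """

    def __init__(self, pretrained: nn.Linear, task_vectors: list[torch.Tensor],
                 shared_router: nn.Module, sparsity: float = 0.9):
        super().__init__()
        self.weight = nn.Parameter(pretrained.weight.detach(), requires_grad=False)
        self.bias = nn.Parameter(pretrained.bias.detach(), requires_grad=False)
        self.router = shared_router  # one router instance shared by every Transformer block
        # Keep only the largest-magnitude entries of each task vector (e.g. top 10% for 90% sparsity).
        pruned = []
        for tv in task_vectors:
            k = max(1, int(tv.numel() * (1.0 - sparsity)))
            threshold = tv.abs().flatten().topk(k).values.min()
            pruned.append(torch.where(tv.abs() >= threshold, tv, torch.zeros_like(tv)))
        self.task_vectors = nn.Parameter(torch.stack(pruned), requires_grad=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, in_features). Routing weights are computed from the mean token
        # representation, then used to mix the sparse task vectors into the frozen weight.
        gate = torch.softmax(self.router(x.mean(dim=1)), dim=-1)  # (batch, num_tasks)
        merged = self.weight + torch.einsum("bt,toi->boi", gate, self.task_vectors)
        return torch.einsum("bsi,boi->bso", x, merged) + self.bias
```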

## Parameters Comparison

!!! tip "Tip for reducing the parameter count"
@@ -232,5 +263,6 @@ fusion_bench \
::: fusion_bench.method.we_moe.clip_we_moe


[^1]: Anke Tang et.al. ICML 2024. Merging Multi-Task Models via Weight-Ensembling Mixture of Experts. http://arxiv.org/abs/2402.00433
[^2]: Z. Lu, C. Fan, W. Wei, X. Qu, D. Chen, and Y. Cheng, “Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging,” doi: 10.48550/arXiv.2406.15479.
[^1]: Anke Tang et al. Merging Multi-Task Models via Weight-Ensembling Mixture of Experts. ICML 2024. http://arxiv.org/abs/2402.00433
[^2]: Z. Lu, C. Fan, W. Wei, X. Qu, D. Chen, and Y. Cheng, “Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging,” doi: 10.48550/arXiv.2406.15479. NeurIPS 2024.
[^3]: L. Shen, A. Tang, E. Yang, et al. Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging. Oct 2024. http://arxiv.org/abs/2410.21804
