Commit

update docs
tanganke committed Oct 30, 2024
1 parent 9902502 commit cd61266
Showing 6 changed files with 93 additions and 4 deletions.
3 changes: 1 addition & 2 deletions .vscode/.gitignore
@@ -1,2 +1 @@
*
!*.template
*.json
58 changes: 58 additions & 0 deletions .vscode/README.md
@@ -0,0 +1,58 @@
# .vscode Folder

This folder contains configuration files for Visual Studio Code to enhance the development experience for the `fusion_bench` project.

## Files

### settings.json.template

This file includes settings for Python testing, search exclusions, and file exclusions.

- **Python Testing**: Configures `unittest` as the testing framework and specifies the test discovery pattern.
- **Search Exclusions**: Excludes certain directories and files from search results.
- **File Exclusions**: Excludes certain directories and files from the file explorer.
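
For illustration, a `settings.json` produced from the template might look roughly like the sketch below. The concrete keys and excluded paths depend on the template itself; the exclusion patterns here are placeholders, not the template's actual list.

```json
{
    "python.testing.unittestEnabled": true,
    "python.testing.unittestArgs": ["-v", "-s", "./tests", "-p", "test_*.py"],
    "search.exclude": {
        "**/outputs": true,
        "**/__pycache__": true
    },
    "files.exclude": {
        "**/__pycache__": true
    }
}
```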

### launch.json.template

This file includes configurations for debugging the `fusion_bench.scripts.cli` module, i.e., the `fusion_bench` command-line interface (CLI).

- **Debug Configuration**: Sets up the `debugpy` debugger to launch the `fusion_bench.scripts.cli` module with specific arguments and environment variables.

## Usage

1. **Copy Templates**: Copy `settings.json.template` and `launch.json.template` to `settings.json` and `launch.json`, respectively.

```shell
cd .vscode

cp settings.json.template settings.json
cp launch.json.template launch.json
```

2. **Customize**: Modify the copied files as needed to fit your development environment.

For example, you may want to add new debugging configurations for custom experiments.

```json
{
    "configurations": [
        {
            "name": "Custom Experiment",
            "type": "debugpy",
            "request": "launch",
            "module": "fusion_bench.scripts.cli",
            "args": [
                "--config-name=custom_experiment",
                "method=method_name",
                "method.option_1=value_1"
            ],
            "env": {
                "HYDRA_FULL_ERROR": "1",
                "CUSTOM_ENV_VAR": "value"
            }
        }
    ]
}
```

3. **Open in VS Code**: Open the project in Visual Studio Code to utilize the configurations.
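
If the `code` command-line launcher is installed, this can be done from a terminal (the path below is a placeholder for your local checkout):

```shell
cd /path/to/fusion_bench
code .
```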
Binary file added docs/algorithms/images/ewemoe.png
Binary file added docs/algorithms/images/ewemoe_1.png
Binary file added docs/algorithms/images/ewemoe_2.png
36 changes: 34 additions & 2 deletions docs/algorithms/weight_ensembling_moe.md
@@ -1,5 +1,8 @@
# Weight-Ensembling Mixture of Experts (Data-Adaptive Model Merging)

[![arxiv](https://img.shields.io/badge/arXiv-2402.00433-b31b1b.svg)](http://arxiv.org/abs/2402.00433)
[![github](https://img.shields.io/badge/GitHub-Code-181717.svg)](https://github.com/tanganke/weight-ensembling_MoE)

<figure markdown="span">
![alt text](images/wemoe.png){ width="90%" }
<figcaption style="max-width:90%">
@@ -29,6 +32,34 @@ These task vectors are then added to the pre-trained MLP weights to create input
| AdaMerging | No | No | Yes |
| Ours | No | No | Yes |

## WEMoE V2: E-WEMoE

*L. Shen, A. Tang, E. Yang et al. Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging. Oct 2024.*[^3]

[![arXiv](https://img.shields.io/badge/arXiv-2410.21804-b31b1b.svg)](http://arxiv.org/abs/2410.21804)
[![github](https://img.shields.io/badge/GitHub-Code-181717.svg)](https://github.com/EnnengYang/Efficient-WEMoE)

<figure markdown="span">
![alt text](images/ewemoe.png){ width="90%" }
<figcaption>
(a) **Overview of the Efficient Weight-Ensembling Mixture of Experts (E-WEMoE)** Framework. It merges all non-MLP modules through task arithmetic and upgrades the MLP modules into an efficient E-WEMoE module. (b) **E-WEMoE** Module. The module includes a router shared across all Transformer blocks, the pre-trained MLP module, and a set of sparse task-specific vectors w.r.t. MLP modules.
</figcaption>
</figure>

<figure markdown="span">
![alt text](images/ewemoe_1.png){ width="700px" }
<figcaption>
Comparison of (a) trainable parameters and (b) total parameters between WEMoE and E-WEMoE-90%.
</figcaption>
</figure>

<figure markdown="span">
![alt text](images/ewemoe_2.png){ width="1000px" }
<figcaption>
Comparison of the relationship between parameter count and performance across various model merging methods.
</figcaption>
</figure>
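
For readers who prefer code to diagrams, below is a minimal, hypothetical PyTorch-style sketch of the E-WEMoE idea, not the fusion_bench implementation: the pre-trained MLP weight is frozen, task vectors are magnitude-pruned to a target sparsity, and a router shared across Transformer blocks produces per-sample mixing weights that are folded into the MLP weight at inference time. All names, shapes, and the pruning rule are illustrative assumptions.

```python
import torch
import torch.nn as nn


class EWEMoELinear(nn.Module):
    """Illustrative E-WEMoE-style layer (hypothetical, simplified to one linear layer).

    - `pretrained`: the frozen linear layer of the pre-trained MLP.
    - `task_vectors`: per-task weight deltas (fine-tuned weight minus pre-trained weight).
    - `shared_router`: a module shared across all blocks, mapping features to task logits.
    """

    def __init__(self, pretrained: nn.Linear, task_vectors: list[torch.Tensor],
                 shared_router: nn.Module, sparsity: float = 0.9):
        super().__init__()
        self.weight = nn.Parameter(pretrained.weight.detach(), requires_grad=False)
        self.bias = nn.Parameter(pretrained.bias.detach(), requires_grad=False)
        self.router = shared_router  # one router instance shared by every Transformer block
        # Keep only the largest-magnitude entries of each task vector (e.g. top 10% for 90% sparsity).
        pruned = []
        for tv in task_vectors:
            k = max(1, int(tv.numel() * (1.0 - sparsity)))
            threshold = tv.abs().flatten().topk(k).values.min()
            pruned.append(torch.where(tv.abs() >= threshold, tv, torch.zeros_like(tv)))
        self.task_vectors = nn.Parameter(torch.stack(pruned), requires_grad=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, in_features). Routing weights are computed from the mean token
        # representation, then used to mix the sparse task vectors into the frozen weight.
        gate = torch.softmax(self.router(x.mean(dim=1)), dim=-1)  # (batch, num_tasks)
        merged = self.weight + torch.einsum("bt,toi->boi", gate, self.task_vectors)
        return torch.einsum("bsi,boi->bso", x, merged) + self.bias
```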

## Parameters Comparison

!!! tip "Tip for reducing the parameter count"
@@ -232,5 +263,6 @@ fusion_bench \
::: fusion_bench.method.we_moe.clip_we_moe


[^1]: Anke Tang et.al. ICML 2024. Merging Multi-Task Models via Weight-Ensembling Mixture of Experts. http://arxiv.org/abs/2402.00433
[^2]: Z. Lu, C. Fan, W. Wei, X. Qu, D. Chen, and Y. Cheng, “Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging,” doi: 10.48550/arXiv.2406.15479.
[^1]: Anke Tang et al. Merging Multi-Task Models via Weight-Ensembling Mixture of Experts. ICML 2024. http://arxiv.org/abs/2402.00433
[^2]: Z. Lu, C. Fan, W. Wei, X. Qu, D. Chen, and Y. Cheng, “Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging,” doi: 10.48550/arXiv.2406.15479. NeurIPS 2024.
[^3]: L. Shen, A. Tang, E. Yang, et al. Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging. Oct 2024. http://arxiv.org/abs/2410.21804
