
CoMAS: Co-Evolving Multi-Agent Systems via Interaction Rewards

Implementation for the paper "CoMAS: Co-Evolving Multi-Agent Systems via Interaction Rewards".


Self-evolution is a central research topic in enabling large language model (LLM)-based agents to continually improve their capabilities after pretraining. Recent research has witnessed a transition from reinforcement learning (RL)-free to RL-based methods. Inspired by the self-evolution mechanisms observed in human intelligence, we introduce Co-Evolving Multi-Agent Systems (CoMAS), a novel framework that enables agents to improve autonomously by learning from inter-agent interactions without external supervision. CoMAS generates intrinsic rewards from rich discussion dynamics, employs an LLM-as-a-judge mechanism to formulate these rewards, and optimizes each agent's policy through RL, thereby enabling decentralized and scalable co-evolution.

📰 News

  • [2026/01/26] CoMAS is accepted by ICLR 2026.
  • [2025/10/10] Our code implementation is released on GitHub.
  • [2025/10/10] The initial version of our paper is submitted to arXiv.

⚙️ Configuration

Clone the repository and navigate to the directory:

git clone https://github.com/xxyQwQ/CoMAS
cd CoMAS

Create a conda environment and install the dependencies:

conda create -n comas python=3.10
conda activate comas
pip install -r requirements.txt

Feel free to adjust dependency versions if necessary.

🚀 Training

Run the following commands to conduct CoMAS training:

conda activate comas
bash scripts/train_comas.sh

Training logs, intermediate checkpoints, and saved models will all be stored in the saves folder.

Before training, carefully configure the parameters in scripts/train_comas.sh.

  • It is necessary to modify the model paths and wandb settings.
  • For A100 GPUs with 80GB memory, it is recommended to use 2 GPUs for each agent.
  • Choose an appropriate micro batch size to avoid out-of-memory errors.
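As a rough reference, the settings described above might look like the sketch below. This is only an illustration under assumed names: the actual variable names live in scripts/train_comas.sh and may differ, and every path and project name here is a placeholder you must replace.

```shell
# Hypothetical sketch of the values to edit before training.
# Variable names are illustrative; check scripts/train_comas.sh for the real ones.
export WANDB_PROJECT="comas"           # your wandb project
export WANDB_ENTITY="your-username"    # your wandb entity or team

MODEL_PATH="/path/to/your/base-model"  # local path to each agent's base model
GPUS_PER_AGENT=2                       # 2 GPUs per agent recommended for 80GB A100s
MICRO_BATCH_SIZE=1                     # lower this if you hit out-of-memory errors
```

Smaller micro batch sizes trade throughput for memory headroom, so start low and increase until utilization is acceptable.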

📊 Evaluation

Run the following commands to evaluate the model:

conda activate comas
bash scripts/evaluate_model.sh [model_name] [model_path]

The evaluation results will be saved in the results folder.

Remember to replace [model_name] and [model_path] with your own settings.
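For instance, a concrete invocation might look like the following, where the model name and checkpoint path are placeholders standing in for your own settings:

```shell
# Example invocation; substitute your own model name and checkpoint path.
conda activate comas
bash scripts/evaluate_model.sh my-comas-agent /path/to/saves/my-checkpoint
```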

📜 Citation

Please consider citing our paper if you find it helpful:

@article{xue2025comas,
  title={CoMAS: Co-Evolving Multi-Agent Systems via Interaction Rewards},
  author={Xue, Xiangyuan and Zhou, Yifan and Zhang, Guibin and Zhang, Zaibin and Li, Yijiang and Zhang, Chen and Yin, Zhenfei and Torr, Philip and Ouyang, Wanli and Bai, Lei},
  journal={arXiv preprint arXiv:2510.08529},
  year={2025}
}
