HKUST-KnowComp/MASLegalBench

MASLegalBench: Benchmarking Multi-Agent Systems in Deductive Legal Reasoning

Huihao Jing😎 1, Wenbin Hu😎 1, Hongyu Luo1, Jianhui Yang2, Wei Fan1, Haoran Li🤗 1, Yangqiu Song1

😎Equal contribution. 🤗Corresponding author.

1Hong Kong University of Science and Technology
2Tsinghua University

🤩 Abstract

Multi-agent systems (MAS), leveraging the remarkable capabilities of Large Language Models (LLMs), show great potential in addressing complex tasks. In this context, integrating MAS with legal tasks is a crucial step. While previous studies have developed legal benchmarks for LLM agents, none are specifically designed to consider the unique advantages of MAS, such as task decomposition, agent specialization, and flexible training. In fact, the lack of evaluation methods limits the potential of MAS in the legal domain. To address this gap, we propose MASLegalBench, a legal benchmark tailored for MAS and designed with a deductive reasoning approach. Our benchmark uses GDPR as the application scenario, encompassing extensive background knowledge and covering complex reasoning processes that effectively reflect the intricacies of real-world legal situations. Furthermore, we manually design various role-based MAS and conduct extensive experiments using different state-of-the-art LLMs. Our results highlight the strengths, limitations, and potential areas for improvement of existing models and MAS architectures.

😎 Quick Start: Run the Evaluation Scripts

bash scripts/bm25/eval_qwen25.sh
bash scripts/bm25/eval_qwen3_8b.sh
...

Miscellaneous

Please send any questions about the code and/or the method to hjingaa@connect.ust.hk

███╗   ███╗ █████╗ ███████╗██╗      ███████╗ ██████╗  █████╗ ██╗     ██████╗  ███████╗███╗   ██╗ ██████╗██╗   ██╗
████╗ ████║██╔══██╗██╔════╝██║      ██╔════╝██╔════╝ ██╔══██╗██║     ██╔══██╗ ██╔════╝████╗  ██║██╔════╝██║   ██║
██╔████╔██║███████║███████╗██║      █████╗  ██║  ███╗███████║██║     ██████╔╝ █████╗  ██╔██╗ ██║██║     ████████║
██║╚██╔╝██║██╔══██║╚════██║██║      ██╔══╝  ██║   ██║██╔══██║██║     ██╔══██╗ ██╔══╝  ██║╚██╗██║██║     ██╔═══██║
██║ ╚═╝ ██║██║  ██║███████║███████╗ ███████╗╚██████╔╝██║  ██║███████╗██████╔╝ ███████╗██║ ╚████║╚██████╗██║   ██║
╚═╝     ╚═╝╚═╝  ╚═╝╚══════╝╚══════╝ ╚══════╝ ╚═════╝ ╚═╝  ╚═╝╚══════╝╚═════╝  ╚══════╝╚═╝  ╚═══╝ ╚═════╝╚═╝   ╚═╝

About

Official Repository for MASLegalBench.
