GISA: A Benchmark for General Information-Seeking Assistant


RUC-NLPIR/GISA

Authors: Yutao Zhu, Xingshuo Zhang, Maosen Zhang, Jiajie Jin, Liancheng Zhang, Xiaoshuai Song, Kangzhi Zhao, Wencong Zeng, Ruiming Tang, Han Li, Ji-Rong Wen, and Zhicheng Dou

Benchmark Highlights

GISA is a benchmark for General Information-Seeking Assistants with 373 human-crafted queries that reflect real-world information needs. It includes both stable and live subsets, four structured answer formats (item, set, list, table), and complete human search trajectories for every query.

  • Diverse answer formats with deterministic evaluation.
    GISA uses four structured answer types (item, set, list, table) with strict matching metrics for reproducible evaluation, avoiding subjective LLM judging while preserving task diversity.
  • Unified deep + wide search capabilities.
    Tasks require both vertical reasoning and horizontal information aggregation across sources, evaluating long-horizon exploration and summarization in one benchmark.
  • Dynamic, anti-static evaluation.
    Queries are split into stable and live subsets; the live subset is periodically updated to reduce memorization and keep the benchmark challenging over time.
  • Process-level supervision via human trajectories.
    Full human search trajectories are provided for every query, serving as gold references for process reward modeling and imitation learning while validating task solvability.
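
To make the idea of deterministic, strict-match scoring concrete, here is a minimal sketch of how the four answer formats could be compared against gold answers. It is not the official evaluation code (that lives in eval_script); the function names, the normalization rule, and the choice to treat table rows as unordered are all assumptions made for illustration.

```python
from typing import Any, Dict, List


def normalize(value: Any) -> str:
    """Lowercase and collapse whitespace (an assumed normalization rule)."""
    return " ".join(str(value).lower().split())


def match_item(pred: Any, gold: Any) -> bool:
    """Item: a single value must match exactly after normalization."""
    return normalize(pred) == normalize(gold)


def match_set(pred: List[Any], gold: List[Any]) -> bool:
    """Set: same elements, order and duplicates ignored."""
    return {normalize(p) for p in pred} == {normalize(g) for g in gold}


def match_list(pred: List[Any], gold: List[Any]) -> bool:
    """List: same elements in the same order."""
    return [normalize(p) for p in pred] == [normalize(g) for g in gold]


def match_table(pred: List[Dict[str, Any]], gold: List[Dict[str, Any]]) -> bool:
    """Table: same rows with the same column/value pairs.

    Assumption: row order does not matter, so rows are sorted before comparing.
    """
    def canon(rows: List[Dict[str, Any]]) -> list:
        return sorted(
            tuple(sorted((normalize(k), normalize(v)) for k, v in row.items()))
            for row in rows
        )
    return canon(pred) == canon(gold)


if __name__ == "__main__":
    print(match_item("  Paris ", "paris"))
    print(match_set(["b", "A"], ["a", "b"]))
    print(match_list(["b", "a"], ["a", "b"]))
    print(match_table([{"City": "Rome"}], [{"city": "rome"}]))
```

Because every comparison reduces to equality on normalized values, two runs over the same predictions always yield the same score, which is the property that makes LLM-free judging reproducible.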

Evaluation

Please follow the instructions in eval_script for evaluation.

Submission

Please send your results to yutaozhu94 AT gmail.com or use the HuggingFace leaderboard submission system. We will merge approved results periodically.

Citation

@article{GISA,
  title     = {GISA: A Benchmark for General Information-Seeking Assistant},
  author    = {Yutao Zhu and
               Xingshuo Zhang and
               Maosen Zhang and
               Jiajie Jin and
               Liancheng Zhang and
               Xiaoshuai Song and
               Kangzhi Zhao and
               Wencong Zeng and
               Ruiming Tang and
               Han Li and
               Ji-Rong Wen and
               Zhicheng Dou},
  journal    = {CoRR},
  volume     = {abs/2602.08543},
  year       = {2026},
  url        = {https://doi.org/10.48550/arXiv.2602.08543},
  doi        = {10.48550/ARXIV.2602.08543},
  eprinttype = {arXiv},
  eprint     = {2602.08543}
}