🤖💡 LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context

x66ccff/liveideabench

"It's not like finding a needle in a haystack, it is like creating new needles."

πŸ† Leaderboard: http://liveideabench.com πŸ’‘

🧠✨🎉 News (2025/1/27): Latest Dataset Update on Hugging Face!

We are excited to announce that the latest dataset, including supplementary tests for models like deepseek-R1, deepseek-V3, minimax-01, phi-4, and Opus, has been uploaded to Hugging Face! 🚀

Check it out here: https://huggingface.co/datasets/6cf/liveideabench-DLC-250127


LiveIdeaBench Evaluation Framework

LiveIdeaBench Evaluation Framework Leaderboard

Bibtex

@article{ruan2024liveideabench,
title={LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context},
author={Kai Ruan and Xuan Wang and Jixiang Hong and Peng Wang and Yang Liu and Hao Sun},
journal={arXiv preprint arXiv:2412.17596},
year={2024}
}