Pull requests: EleutherAI/lm-evaluation-harness
Pull requests list
Add --include_package support for external model modules
#3672
opened Apr 3, 2026 by
morgenyu
fix: local directory with task name no longer shadows registered task
#3670
opened Apr 1, 2026 by
nloughl
Add SimpleQA (OpenAI factuality benchmark)
#3667
opened Mar 31, 2026 by
sarthakkgupta
6 tasks done
AIFQA-393 BLK-007: [vLLM/lm_eval] ChatGLM2/3 tokenizer — empty stop string crash in lm_eval
#3665
opened Mar 30, 2026 by
jklawikowski
Add InfiniteBench: long-context evaluation beyond 100K tokens
#3662
opened Mar 29, 2026 by
siddhant-rajhans
fix: fall back to tokenizer.eos_token when decode returns empty string
#3657
opened Mar 27, 2026 by
ganeshr10
[BUGFIX] Consistent handling of None answers and cache
#3656
opened Mar 26, 2026 by
RawthiL
feat: add optional SymPy equivalence and math_verify to hendrycks_math
#3655
opened Mar 26, 2026 by
NezLheimeur
fix: Reset batch_sizes cache before each _loglikelihood_tokens call
#3654
opened Mar 26, 2026 by
nevertmr
[Hendrycks] Fix false negatives and add flexible_match to Hendrycks
#3653
opened Mar 25, 2026 by
fxmarty-amd
fix: add Phi-3.5-vision support for vllm-vlm model type
#3651
opened Mar 24, 2026 by
ganeshr10
fix: migrate pubmedqa to script-less qiaojin/pubmed_qa dataset #3645
#3649
opened Mar 23, 2026 by
Ishitha-P
Fix chat_template_args handling when enable_thinking is None
#3648
opened Mar 21, 2026 by
ranjita-naik
fix: zeno_visualize nested output directory discovery
#3647
opened Mar 21, 2026 by
komaksym
fix: extract \boxed{} from model response in hendrycks_math
#3644
opened Mar 20, 2026 by
NezLheimeur
Fix ruff lint failures in models/__init__.py and huggingface.py
#3641
opened Mar 20, 2026 by
dzautner
1 task
fix: preserve chat_template_args when enable_thinking is None
#3640
opened Mar 19, 2026 by
NezLheimeur