Commit

Update README.md
alopatenko authored Nov 17, 2024
1 parent ebf07f9 commit b6874b7
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -83,6 +83,7 @@ My view on LLM Evaluation: [Deck](LLMEvaluation.pdf), and [SF Big Analytics and
---
### Evaluation Software
- [EleutherAI LLM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness)
- Eureka, Microsoft, A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. [github](https://github.com/microsoft/eureka-ml-insights) Sep 2024 [arxiv](https://arxiv.org/abs/2409.10566)
- [OpenAI Evals](https://github.com/openai/evals)
- [ConfidentAI DeepEval](https://github.com/confident-ai/deepeval)
- [MTEB](https://huggingface.co/spaces/mteb/leaderboard)