bep bop
Pinned
- evals (public, forked from openai/evals)
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Python
- Crisp-Unimib/ITALIC (public)
ITALIC is a benchmark evaluating language models' understanding of Italian culture, commonsense reasoning, and linguistic proficiency in a morphologically rich language.
- Crisp-Unimib/Role-Vectors (public)
Role Vectors are a novel approach to guiding LLM inference behaviour, an alternative to persona-based prompting.
Python 2

