Release Simultaneous Evaluation with Multiple LLMs Support · ritwickbhargav80/quick-llm-model-evaluations

This release is a major update to our Streamlit app: it can now run evaluations with multiple LLMs simultaneously. Users can select several models and evaluate them in a single run, which adds flexibility and depth to evaluations.

What's New:
- Multiple LLM Evaluations Simultaneously: Evaluate multiple models in one go, providing a more comprehensive analysis.
- Color Coding: Selected LLM models are now highlighted with background colors corresponding to their providers, making it easy to tell apart models from different LLM providers during multiple-LLM evaluations.
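
As a rough illustration of what evaluating several models in one go can look like, here is a minimal sketch that fans a single evaluation function out over the selected models with a thread pool. It is not the app's actual implementation: `evaluate_many`, `evaluate_one`, and `fake_evaluate` are hypothetical names, and the scores returned are placeholders rather than real results.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# A selection is identified by (provider, model name).
ModelKey = Tuple[str, str]
Scores = Dict[str, float]


def evaluate_many(
    models: List[ModelKey],
    evaluate_one: Callable[[str, str], Scores],
    max_workers: int = 4,
) -> Dict[ModelKey, Scores]:
    """Run evaluate_one(provider, model) for every selected model concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all evaluations first so they overlap, then collect results.
        futures = {key: pool.submit(evaluate_one, *key) for key in models}
        return {key: future.result() for key, future in futures.items()}


def fake_evaluate(provider: str, model: str) -> Scores:
    # Hypothetical stand-in: a real evaluator would call the provider's API
    # and compute the metrics. The zeros here are placeholders only.
    return {
        "context_relevancy": 0.0,
        "answer_relevancy": 0.0,
        "groundedness": 0.0,
    }


selected = [("OpenAI", "gpt-4"), ("Anthropic", "claude-3-haiku-20240307")]
results = evaluate_many(selected, fake_evaluate)
```

Threads suit this kind of workload because each evaluation mostly waits on network calls; `future.result()` re-raises any exception from a failed evaluation, so a production version would likely catch errors per model instead of failing the whole batch.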

Supported Sources:

- URL
- YouTube
- PDF
- DOCX

Supported LLMs:

Gemini
- gemini-1.0-pro
- gemini-pro
- gemini-1.5-pro-latest
OpenAI
- gpt-3.5-turbo
- gpt-4
- gpt-4-turbo
- gpt-3.5-turbo-16k
Azure OpenAI
- gpt-35-turbo
- gpt-4
- gpt-35-turbo-16k
Anthropic
- claude-3-5-sonnet-20240620
- claude-3-haiku-20240307
- claude-3-sonnet-20240229
- claude-3-opus-20240229
Groq
- mixtral-8x7b-32768
- gemma2-9b-it
- llama-3.1-8b-instant
- llama3-70b-8192
- llama3-8b-8192
- llama3-groq-70b-8192-tool-use-preview
- llama3-groq-8b-8192-tool-use-preview
Hugging Face
- huggingfaceh4/zephyr-7b-alpha
- huggingfaceh4/zephyr-7b-beta
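
The catalog above, together with the provider-based color coding, could be modeled along the following lines. This is a sketch under assumptions: the dictionary layout, the helper names `badge_color` and `providers_offering`, and the hex colors are all hypothetical and are not taken from the app. Note that `gpt-4` appears under both OpenAI and Azure OpenAI, so a selection is best identified by provider and model together.

```python
from typing import Dict, List

# Models per provider, copied from the list above.
SUPPORTED_MODELS: Dict[str, List[str]] = {
    "Gemini": ["gemini-1.0-pro", "gemini-pro", "gemini-1.5-pro-latest"],
    "OpenAI": ["gpt-3.5-turbo", "gpt-4", "gpt-4-turbo", "gpt-3.5-turbo-16k"],
    "Azure OpenAI": ["gpt-35-turbo", "gpt-4", "gpt-35-turbo-16k"],
    "Anthropic": [
        "claude-3-5-sonnet-20240620",
        "claude-3-haiku-20240307",
        "claude-3-sonnet-20240229",
        "claude-3-opus-20240229",
    ],
    "Groq": [
        "mixtral-8x7b-32768",
        "gemma2-9b-it",
        "llama-3.1-8b-instant",
        "llama3-70b-8192",
        "llama3-8b-8192",
        "llama3-groq-70b-8192-tool-use-preview",
        "llama3-groq-8b-8192-tool-use-preview",
    ],
    "Hugging Face": [
        "huggingfaceh4/zephyr-7b-alpha",
        "huggingfaceh4/zephyr-7b-beta",
    ],
}

# Hypothetical background colors; the app's real palette is not documented here.
PROVIDER_COLORS: Dict[str, str] = {
    "Gemini": "#e8f0fe",
    "OpenAI": "#e6f4ea",
    "Azure OpenAI": "#e0f7fa",
    "Anthropic": "#fff3e0",
    "Groq": "#fce4ec",
    "Hugging Face": "#fffde7",
}


def providers_offering(model: str) -> List[str]:
    """Return every provider that lists the given model name."""
    return [p for p, models in SUPPORTED_MODELS.items() if model in models]


def badge_color(provider: str, model: str) -> str:
    """Return the background color for a selected (provider, model) pair."""
    if model not in SUPPORTED_MODELS.get(provider, []):
        raise ValueError(f"{model!r} is not listed under {provider!r}")
    return PROVIDER_COLORS[provider]
```

For example, `providers_offering("gpt-4")` returns both OpenAI and Azure OpenAI, which is why the color is looked up by provider rather than inferred from the model name.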

Evaluation Metrics:

- Context Relevancy
- Answer Relevancy
- Groundedness

With this release, the app can evaluate several LLMs in a single run. Future updates will continue to focus on improving the user experience and adding new features.

Full Changelog: v1.0.0...v1.1.0
