Simultaneous Evaluation with Multiple LLMs Support

@ritwickbhargav80 released this 29 Aug 16:10

This release is a major update to our Streamlit app: you can now select several LLMs and evaluate them simultaneously in a single run, instead of evaluating one model at a time.

What's New:
- Simultaneous Multi-LLM Evaluation: Evaluate multiple models in one go for a more comprehensive analysis.
- Color Coding: Selected LLM models are now highlighted with background colors corresponding to their providers, which makes it easy to tell apart models from different LLM providers during multi-LLM evaluations.
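
To illustrate what evaluating several models in one go could look like, here is a minimal sketch. It is not the app's actual code: `evaluate_model` and `evaluate_models` are hypothetical names, the per-model call is a stub, and running the calls in a thread pool is an assumption about how "simultaneous" might be implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_model(model_name: str, question: str) -> dict:
    """Hypothetical stand-in for a single-model evaluation call."""
    # A real implementation would query the provider's API and score the answer.
    return {"model": model_name, "question": question, "status": "evaluated"}


def evaluate_models(model_names: list[str], question: str) -> list[dict]:
    """Run one evaluation per selected model in parallel threads."""
    if not model_names:
        return []
    with ThreadPoolExecutor(max_workers=len(model_names)) as pool:
        # pool.map returns results in the same order as model_names.
        return list(pool.map(lambda name: evaluate_model(name, question), model_names))


selected = ["gpt-4", "gemini-pro", "claude-3-haiku-20240307"]
results = evaluate_models(selected, "What is retrieval-augmented generation?")
```

Threads suit this kind of workload because each evaluation mostly waits on a network response, so the total time is closer to that of the slowest model than to the sum of all of them.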

Supported Sources:

  • URL
  • YouTube
  • PDF
  • DOCX

Supported LLMs:

  • Gemini
    • gemini-1.0-pro
    • gemini-pro
    • gemini-1.5-pro-latest
  • OpenAI
    • gpt-3.5-turbo
    • gpt-4
    • gpt-4-turbo
    • gpt-3.5-turbo-16k
  • Azure OpenAI
    • gpt-35-turbo
    • gpt-4
    • gpt-35-turbo-16k
  • Anthropic
    • claude-3-5-sonnet-20240620
    • claude-3-haiku-20240307
    • claude-3-sonnet-20240229
    • claude-3-opus-20240229
  • Groq
    • mixtral-8x7b-32768
    • gemma2-9b-it
    • llama-3.1-8b-instant
    • llama3-70b-8192
    • llama3-8b-8192
    • llama3-groq-70b-8192-tool-use-preview
    • llama3-groq-8b-8192-tool-use-preview
  • Hugging Face
    • huggingfaceh4/zephyr-7b-alpha
    • huggingfaceh4/zephyr-7b-beta
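
Because some model names appear under more than one provider (for example, gpt-4 under both OpenAI and Azure OpenAI), provider-based color coding needs to be keyed on the provider rather than on the model name alone. The sketch below shows one possible way to do that. The color values and the `PROVIDER_COLORS`, `DEFAULT_COLOR`, and `model_badge` names are illustrative assumptions, not the app's actual palette or code.

```python
import html

# Hypothetical palette: one background color per provider.
PROVIDER_COLORS = {
    "Gemini": "#d2e3fc",
    "OpenAI": "#d4f4e2",
    "Azure OpenAI": "#cfe8ff",
    "Anthropic": "#fde2cf",
    "Groq": "#fcd6d6",
    "Hugging Face": "#fff3c4",
}
DEFAULT_COLOR = "#e0e0e0"  # fallback for providers not in the palette


def model_badge(provider: str, model: str) -> str:
    """Return an HTML span that highlights a model with its provider's color."""
    color = PROVIDER_COLORS.get(provider, DEFAULT_COLOR)
    label = html.escape(f"{provider}: {model}")
    return (
        f'<span style="background-color:{color};'
        f'padding:2px 6px;border-radius:4px">{label}</span>'
    )


openai_badge = model_badge("OpenAI", "gpt-4")
azure_badge = model_badge("Azure OpenAI", "gpt-4")
```

In a Streamlit app, a string like this could be rendered with `st.markdown(badge, unsafe_allow_html=True)`.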

Evaluation Metrics:

  • Context Relevancy
  • Answer Relevancy
  • Groundedness

With simultaneous evaluation across multiple LLMs, this release makes the app's analysis more comprehensive. Future updates will continue to focus on improving the user experience and adding new features.

Full Changelog: v1.0.0...v1.1.0