Is there any support for hugging face models for RAG EVAL #2155
Replies: 3 comments
🤖 Hey @mujhenahiata! Good to see you back here. How's your day going?

Yes, the Arize Phoenix repository supports evaluating Hugging Face models, specifically for Retrieval-Augmented Generation (RAG). The repository includes a tutorial that demonstrates how to build a RAG pipeline and evaluate it with Phoenix Evals. The evaluation process includes generating question-context pairs, running retrieval evaluations, and computing retrieval metrics such as Normalized Discounted Cumulative Gain (NDCG) and precision. The evaluation results are then sent to Phoenix for visualization.

Here is a code snippet that installs the necessary packages and sets up the environment for the evaluation:

!pip install -qq "arize-phoenix[experimental,llama-index]>=2.0"
import nest_asyncio
import os
from getpass import getpass
import pandas as pd
import phoenix as px
from llama_index import SimpleDirectoryReader, VectorStoreIndex, set_global_handler
from llama_index.llms import OpenAI
from llama_index.node_parser import SimpleNodeParser
nest_asyncio.apply()
px.launch_app()
set_global_handler("arize_phoenix")

And here is a code snippet that evaluates the RAG system's retrieval performance:

from phoenix.experimental.evals import (
    OpenAIModel,
    RelevanceEvaluator,
    run_evals,
)

# Use an OpenAI model as the LLM judge for document relevance
relevance_evaluator = RelevanceEvaluator(OpenAIModel(model_name="gpt-4-1106-preview"))
retrieved_documents_relevance_df = run_evals(
    evaluators=[relevance_evaluator],
    dataframe=retrieved_documents_df,
    provide_explanation=True,
    concurrency=20,
)[0]
# Aggregate the scores across the retrievals.
# Note: rag_evaluation_dataframe is not defined in this snippet; it is assumed
# to have been built earlier from the per-query retrieval metrics.
results = rag_evaluation_dataframe.mean(numeric_only=True)
results

You can find more details in the evaluate_rag.ipynb tutorial in the Arize Phoenix repository. Please let me know if you need further assistance or have any other questions.
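For readers who want to see what the retrieval metrics named in the reply above compute, here is a minimal pure-Python sketch of precision@k and NDCG@k. It is not the tutorial's code. The function names are hypothetical, relevance labels are assumed to be 0/1 values listed in retrieval order, and the ideal ranking is assumed to be a re-sort of the retrieved documents only.

```python
import math
from typing import Sequence


def precision_at_k(relevance: Sequence[float], k: int) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        return 0.0
    return sum(relevance[:k]) / k


def dcg_at_k(relevance: Sequence[float], k: int) -> float:
    """Discounted cumulative gain: each gain is divided by log2(position + 1)."""
    # enumerate() is 0-based, so position = rank + 1 and the discount is log2(rank + 2).
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevance[:k]))


def ndcg_at_k(relevance: Sequence[float], k: int) -> float:
    """DCG normalized by the DCG of the best possible ordering of the same labels."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0


# Relevance labels for three retrieved documents, in retrieval order.
labels = [1, 0, 1]
print(precision_at_k(labels, 2))
print(ndcg_at_k(labels, 3))
```

Averaging these per-query scores over all queries gives aggregate numbers of the kind the `mean(numeric_only=True)` call above produces.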
Here is a notebook that has an example Hugging Face model:
@mujhenahiata You can also use LiteLLM to select a Hugging Face model for evals. See the usage here:
https://docs.arize.com/phoenix/api/evaluation-models#phoenix.experimental.evals.litellmmodel
and the LiteLLM resource here:
https://docs.litellm.ai/docs/providers/huggingface
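As a rough sketch of the LiteLLM route, the snippet below builds a LiteLLM-style model name for a Hugging Face repo and wraps it in Phoenix's `LiteLLMModel`. Several things here are assumptions rather than confirmed details: the helper names are hypothetical, the repo id is only an illustration, the `model_name` keyword is assumed from the `phoenix.experimental.evals` API of that era (newer releases may name it differently), and LiteLLM is assumed to read the Hugging Face token from the `HUGGINGFACE_API_KEY` environment variable. Check both linked pages before relying on it.

```python
def litellm_huggingface_name(repo_id: str) -> str:
    """Prefix a Hugging Face Hub repo id with LiteLLM's "huggingface/" provider route."""
    return f"huggingface/{repo_id.strip('/')}"


def build_huggingface_eval_model(repo_id: str):
    """Return a Phoenix eval model backed by a Hugging Face model via LiteLLM.

    Assumes arize-phoenix[experimental] and litellm are installed, and that
    HUGGINGFACE_API_KEY is set in the environment when the endpoint needs a token.
    """
    # Imported lazily so the name helper above works without Phoenix installed.
    from phoenix.experimental.evals import LiteLLMModel

    # model_name is an assumed keyword; see the Phoenix evaluation-models docs.
    return LiteLLMModel(model_name=litellm_huggingface_name(repo_id))


# Illustrative repo id only; substitute the model you actually want to use.
print(litellm_huggingface_name("HuggingFaceH4/zephyr-7b-beta"))
```

If this works in your environment, the returned object could presumably be passed to `RelevanceEvaluator(...)` in place of `OpenAIModel(...)` in the earlier snippet.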