Description:
We want to be able to evaluate the factual correctness of lex.llm workflow outputs using lex.eval.
Acceptance criteria:
- Possible to run factual correctness evaluation based on data from lex.db for a particular workflow
Technical details:
Details on factual correctness can be found here: https://docs.ragas.io/en/stable/concepts/metrics/available_metrics/factual_correctness/#factual-correctness
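For context, the RAGAS factual correctness metric decomposes the response and the reference into atomic claims and scores their overlap: TP = response claims supported by the reference, FP = response claims not supported, FN = reference claims missing from the response. Below is a minimal sketch of just the scoring step (the claim decomposition/verification itself would be LLM-driven and is out of scope here); the `ClaimCounts` type and function name are illustrative, not part of any existing API.

```python
from dataclasses import dataclass


@dataclass
class ClaimCounts:
    tp: int  # response claims supported by the reference
    fp: int  # response claims not supported by the reference
    fn: int  # reference claims missing from the response


def factual_correctness_f1(counts: ClaimCounts) -> float:
    """Claim-level F1, as in the RAGAS factual correctness metric."""
    p_denom = counts.tp + counts.fp
    r_denom = counts.tp + counts.fn
    precision = counts.tp / p_denom if p_denom else 0.0
    recall = counts.tp / r_denom if r_denom else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, 3 supported claims, 1 unsupported claim, and 1 missed reference claim give precision = recall = 0.75, so F1 = 0.75.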
We will likely not be able to use RAGAS directly, but we would probably want to use Pydantic Evals: https://ai.pydantic.dev/evals/#pydantic-evals-package
It's likely easiest to set up the system with a CLI to begin with.
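A possible shape for that CLI, sketched with stdlib argparse; the flag names (`--workflow`, `--db-url`, `--limit`) are placeholders and would need to match whatever lex.db actually exposes:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical flags; the real schema depends on lex.db / lex.eval.
    parser = argparse.ArgumentParser(
        prog="lex-eval",
        description="Run factual correctness evaluation for a lex.llm workflow.",
    )
    parser.add_argument("--workflow", required=True,
                        help="Workflow identifier in lex.db")
    parser.add_argument("--db-url", default=None,
                        help="Connection string for lex.db")
    parser.add_argument("--limit", type=int, default=100,
                        help="Maximum number of records to evaluate")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Evaluating workflow {args.workflow} (limit={args.limit})")
```

Invocation would look like `lex-eval --workflow wf-123 --limit 50`, which keeps the evaluation runnable ad hoc before any scheduling or UI exists.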
Design:
Optional details on design for context.