Set up ESLint for the repo and fix lint issues #282

tyom · 2025-10-29T19:15:54Z

Originally proposed in #276. Resubmitting against v1 branch as requested.

- Include ESLint 9 as root dependency
- Set up ESLint to lint the whole repo
- Extend the root config and add a few package-specific plugins for Evalite UI
- Add a consistent `typecheck` npm script for type checking across the repo

You can now use `pnpm lint` in the root and the UI app, and `pnpm typecheck` anywhere in the repo.
Use `pnpm lint --fix` to attempt to fix the issues automatically.

Fix lint errors, mostly by removing unused imports and variables. I left some unused function parameters in place, prefixed with `_`, to serve as a reminder of the function's signature.
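For readers unfamiliar with the `_` prefix convention, here is a minimal sketch of how it is commonly wired up in an ESLint 9 flat config. This is not the config from this PR: the ignore globs and file globs are assumptions, and the sketch uses ESLint's core `no-unused-vars` rule on JavaScript files only. A TypeScript repo would normally add the typescript-eslint parser and use `@typescript-eslint/no-unused-vars`, which accepts the same ignore-pattern options.

```typescript
// Sketch of an ESLint flat config (hypothetical, not the file added in this PR).
// A flat config is an array of plain objects, so no imports are needed here.
type Severity = "off" | "warn" | "error";
type RuleEntry = Severity | [Severity, Record<string, unknown>];

interface FlatConfigItem {
  files?: string[];
  ignores?: string[];
  rules?: Record<string, RuleEntry>;
}

// Options of the core `no-unused-vars` rule: names matching these patterns
// are not reported even when they are never read.
const unusedVarsOptions = {
  argsIgnorePattern: "^_",
  varsIgnorePattern: "^_",
  caughtErrorsIgnorePattern: "^_",
};

const config: FlatConfigItem[] = [
  // Assumed build output and dependency folders.
  { ignores: ["**/dist/**", "**/node_modules/**"] },
  {
    // Assumed glob; core ESLint can only parse JavaScript without extra parsers.
    files: ["**/*.{js,mjs,cjs}"],
    rules: { "no-unused-vars": ["error", unusedVarsOptions] },
  },
];

// In a real eslint.config file this array would be the default export.
console.log(JSON.stringify(config, null, 2));
```

With options like these, a parameter named `_ctx` is accepted while an unused `ctx` is still reported.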

Also add an EditorConfig file to help maintain a consistent code style for those who use it.

- Add dotenv as a dependency
- Create env-setup-file module that imports dotenv/config
- Export env-setup-file as 'evalite/env-setup-file'
- Automatically prepend env-setup-file to setupFiles array
- Update documentation to reflect automatic .env loading
- Update example config to remove manual dotenv setup

Fixes mattpocock#234

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>

… precedence

- Add loadVitestSetupFiles() to load setupFiles from vitest.config.ts
- Merge setupFiles from both configs with evalite.config.ts taking precedence
- Add tests for vitest.config.ts setupFiles support and precedence
- setupFiles execution order: env-setup-file -> vitest -> evalite

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>
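The execution order described in this commit can be sketched as a simple concatenation. This is only an illustration under assumptions: `mergeSetupFiles` is a hypothetical helper, not the function in the codebase, and it reads "precedence" as "runs last, so its side effects win".

```typescript
// Module specifier taken from the export name mentioned in the commit message.
const ENV_SETUP_FILE = "evalite/env-setup-file";

// Hypothetical helper: builds the final setupFiles list in the order
// env-setup-file -> vitest.config.ts entries -> evalite.config.ts entries.
function mergeSetupFiles(
  vitestSetupFiles: string[] = [],
  evaliteSetupFiles: string[] = [],
): string[] {
  return [ENV_SETUP_FILE, ...vitestSetupFiles, ...evaliteSetupFiles];
}

// Example with made-up file names.
const merged = mergeSetupFiles(["./vitest.setup.ts"], ["./evalite.setup.ts"]);
console.log(merged);
```

Because the evalite entries come last, anything they set (for example an environment variable) would override values set by the earlier files.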

- Export new `evalite/scorers` module with factory functions
- Add `createLLMBasedScorer` for model-dependent scorers
- Add `createEmbeddingBasedScorer` for embedding-dependent scorers
- Introduce `EvaluationSample` type with query, contexts, and reference fields

Part of mattpocock#250

- Added a new `faithfulness` scorer to evaluate model responses against retrieved contexts.
- Introduced utility functions for scoring and context handling.
- Updated `package.json` to include `zod` version 4.1.12 as a dependency.
- Updated `pnpm-lock.yaml` to reflect changes in dependencies and versions.

Part of mattpocock#250

- Introduced a new `answerSimilarity` scorer to assess the semantic similarity between a ground truth answer and a generated answer.
- The scorer utilizes embedding models to compute cosine similarity and includes an optional threshold for binary output.
- Updated the `scorers` module to export the new `answerSimilarity` scorer.

Part of mattpocock#250
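As a rough illustration of the scoring step described here, the sketch below computes cosine similarity between two embedding vectors and optionally binarizes it with a threshold. The function names and the zero-vector handling are assumptions for illustration, not the scorer's actual implementation; obtaining the embeddings from a model is out of scope.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i]! * b[i]!;
    normA += a[i]! * a[i]!;
    normB += b[i]! * b[i]!;
  }
  // Assumption: treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical wrapper: returns the raw similarity, or 1/0 when a threshold is given.
function similarityScore(a: number[], b: number[], threshold?: number): number {
  const similarity = cosineSimilarity(a, b);
  if (threshold === undefined) return similarity;
  return similarity >= threshold ? 1 : 0;
}

// Toy vectors standing in for real embeddings.
console.log(similarityScore([1, 0], [1, 1]));
console.log(similarityScore([1, 0], [1, 1], 0.8));
```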

…splay issues

Fixes mattpocock#265 - durations were displaying with full floating point precision like '3.090249999999997ms' instead of rounding to '3ms'. Updated formatTime() to use Math.round() for millisecond values.

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>
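The fix described above might look roughly like the following. Only the `Math.round()` call for millisecond values comes from the commit message; the one-second cutoff and the formatting of the seconds branch are assumptions, so treat this as a sketch rather than the real `formatTime()`.

```typescript
// Sketch of a duration formatter that rounds millisecond values.
function formatTime(ms: number): string {
  const rounded = Math.round(ms);
  if (rounded < 1000) {
    // Avoids output such as "3.090249999999997ms".
    return `${rounded}ms`;
  }
  // Assumed behaviour for longer durations: seconds with one decimal place.
  return `${(ms / 1000).toFixed(1)}s`;
}

console.log(formatTime(3.090249999999997)); // "3ms"
console.log(formatTime(1500)); // "1.5s"
```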

* refactor: Simplify scorer factory API

- Remove `createBaseScorer`, consolidate to `createLLMScorer`/`createEmbeddingScorer`
- Add generic `TExpected` type for type-safe expected data
- Replace `singleTurn`/`multiTurn` with single `scorer` function
- Rename utils to `isSingleTurnInput`/`isMultiTurnInput`
- Update all scorers (faithfulness, answerSimilarity, contextRecall) to new API
- Fix example.eval.ts: textStream -> text

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* refactor: Move inline scorer types to Evalite.Scorers namespace

Move inline expected data types from answer-similarity, context-recall, and faithfulness scorers to the Evalite.Scorers namespace in types.ts for better type organization and discoverability.

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>

* Formatting

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>

* refactor: Simplify scorer factory API

- Remove `createBaseScorer`, consolidate to `createLLMScorer`/`createEmbeddingScorer`
- Add generic `TExpected` type for type-safe expected data
- Replace `singleTurn`/`multiTurn` with single `scorer` function
- Rename utils to `isSingleTurnInput`/`isMultiTurnInput`
- Update all scorers (faithfulness, answerSimilarity, contextRecall) to new API
- Fix example.eval.ts: textStream -> text

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* refactor: Move inline scorer types to Evalite.Scorers namespace

Move inline expected data types from answer-similarity, context-recall, and faithfulness scorers to the Evalite.Scorers namespace in types.ts for better type organization and discoverability.

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>

* refactor: Update scorer types and utility functions for output handling

- Renamed input types in Evalite.Scorers namespace to reflect output handling: SingleTurnInput to SingleTurnOutput, MultiTurnInput to MultiTurnOutput, and updated related types accordingly.
- Modified scorer implementations in context-recall and faithfulness to use new output types.
- Updated utility functions to check for output types instead of input types, enhancing clarity and consistency in the scoring logic.

* Formatting

* Trigger CI re-check

---------

Co-authored-by: Matt Pocock <mattpocockvoice@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>

* feat: integrate search functionality

- Implemented search functionality in the main application layout, allowing users to filter evaluations based on search queries.
- Updated routes to support search parameters using Zod for validation.

Closes: mattpocock#271

* Create wet-clocks-camp.md

---------

Co-authored-by: Matt Pocock <mattpocockvoice@gmail.com>

* refactor: Simplify scorer factory API

- Remove `createBaseScorer`, consolidate to `createLLMScorer`/`createEmbeddingScorer`
- Add generic `TExpected` type for type-safe expected data
- Replace `singleTurn`/`multiTurn` with single `scorer` function
- Rename utils to `isSingleTurnInput`/`isMultiTurnInput`
- Update all scorers (faithfulness, answerSimilarity, contextRecall) to new API
- Fix example.eval.ts: textStream -> text

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* refactor: Move inline scorer types to Evalite.Scorers namespace

Move inline expected data types from answer-similarity, context-recall, and faithfulness scorers to the Evalite.Scorers namespace in types.ts for better type organization and discoverability.

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>

* refactor: Update scorer types and utility functions for output handling

- Renamed input types in Evalite.Scorers namespace to reflect output handling: SingleTurnInput to SingleTurnOutput, MultiTurnInput to MultiTurnOutput, and updated related types accordingly.
- Modified scorer implementations in context-recall and faithfulness to use new output types.
- Updated utility functions to check for output types instead of input types, enhancing clarity and consistency in the scoring logic.

* feat: Implement Tool Call Accuracy scorer

Related mattpocock#250

---------

Co-authored-by: Matt Pocock <mattpocockvoice@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>

- Extended loadFixture() to return Vitest instance with Symbol.asyncDispose
- Added triggerWatchModeRerun() helper using vitest.waitForTestRunEnd()
- Added disableServer option to runEvalite() to prevent port conflicts
- runEvalite() now returns Vitest instance
- Fixed ai-sdk-traces.test.ts to use await using

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

changeset-bot · 2025-10-29T19:15:58Z

⚠️ No Changeset found

Latest commit: 205a215

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

vercel · 2025-10-29T19:16:00Z

@tyom is attempting to deploy a commit to the Skill Recordings Team on Vercel.

A member of the Team first needs to authorize it.

mattpocock and others added 30 commits October 19, 2025 12:44

Changed default storage to in-memory. SQLite still available via config.

da895ea

Remove problematic backend-only-constants imports

2efa48e

- Remove unused DB_LOCATION import from test-utils.ts
- Replace FILES_LOCATION import with local constant in files.test.ts

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>

Fixed CI properly

59677dd

Merge branch 'main' of https://github.com/mattpocock/evalite into v1

9514115

Huge move from evals -> suites, and results -> evals

48a5581

Added changeset

26073ea

Removed streaming text support from tasks.

54c9ccb

Fixes after cherrypick

4f8ec7a

Formatting

c624aca

Docs updates

d135997

Docs updates

15f3dc2

Merge pull request mattpocock#1 from cantemizyurek/faithfulness

b48f5a7

Add Faithfulness

feat: Add evaluation script for Answer Similarity

b6bac1c

Merge pull request mattpocock#2 from cantemizyurek/answer-similarity

eafa00c

Add Answer Similarity Scorer

feat: Add Context Recall Scorer

4f788c3

- Introduced a new `contextRecall` scorer to evaluate how much of a generated answer can be attributed to retrieved contexts.
- Updated the `scorers` module to export the new `contextRecall` scorer.

Part of mattpocock#250
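One common way to turn "how much of an answer can be attributed to the contexts" into a number is to classify each statement as attributed or not, then take the attributed fraction. The sketch below shows only that final arithmetic. The `StatementClassification` shape and the `attributionRatio` name are hypothetical, and whether the scorer in this commit computes its score this way is an assumption.

```typescript
// Hypothetical result of classifying one statement (for example by an LLM judge).
interface StatementClassification {
  statement: string;
  attributed: boolean; // true if the statement is supported by the retrieved contexts
}

// Fraction of statements that could be attributed to the contexts.
function attributionRatio(classifications: StatementClassification[]): number {
  // Assumption: with nothing to classify, report a score of 0.
  if (classifications.length === 0) return 0;
  const attributedCount = classifications.filter((c) => c.attributed).length;
  return attributedCount / classifications.length;
}

// Made-up classifications for illustration.
const ratio = attributionRatio([
  { statement: "Paris is the capital of France.", attributed: true },
  { statement: "It lies on the Seine.", attributed: true },
  { statement: "It has 40 million residents.", attributed: false },
]);
console.log(ratio);
```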

feat: Add evaluation script for RAG Context Recall

e6448d2

refactor: Update scorers to use 'expected' instead of 'input.reference'

652591e

refactor: Remove failedToScore utility and replace with error in scorers

156a06d

refactor: Update scoring schemas to use jsonSchema and remove zod dep…

a719796

…endency

refactor: Simplify answerSimilarity scorer by removing threshold logi…

efdffa4

…c and updating metadata format

refactor: rename embedding to embeddingModel clearer

189ef3e

refactor: update embedding property to embeddingModel for clarity

d5f243d

refactor: Introduce Scorers namespace with types for LLM and embeddin…

d031484

…g based scorers, and context recall and faithfulness classifications

refactor: Move SingleTurnSample and EvaluationSample types to Scorers…

68bcde5

… namespace for better organization

refactor: Update Evalite types to support userInput structure. And ad…

3535516

…d Multi Turn Sample Type

mattpocock and others added 23 commits October 23, 2025 10:32

Docs updates

0519544

Docs updates

43dbbd8

feat: Add sheet overlay backdrop for evaluation routes

53b2cd1

fix: Update layout for ResultComponent to ensure minimum height is ma…

563791d

…intained and grows as necessary

Create real-phones-join.md

df9484b

Merge branch 'v1' of https://github.com/mattpocock/evalite into scorers

a24889f

Merge pull request mattpocock#251 from cantemizyurek/scorers

f67f215

Add Scorers module

refactor: change codeblocks theme to dark+ and light+

ef66bc9

Merge pull request mattpocock#257 from cantemizyurek/swap-reactmarkdown

adfb2a6

feat: Swap from React Markdown to Streamdown

Merge pull request mattpocock#267 from mattpocock/claude/issue-265-20…

02fdd76

…251024-1549 fix: round millisecond durations to avoid floating point precision display issues

Enhance dark theme (mattpocock#274)

403481b

* feat: update tailwind.css with new dark theme colors

Solves: mattpocock#272

* fix: remove unnecessary class from active sidebar item styling

Merge branch 'main' of https://github.com/mattpocock/evalite into v1

dfdb619

Add .editorconfig file

0d16521

Return vitest instance when returning with !shouldKeepRunning

b42641a

This fixes the TS errors.

Add missing break in switch case

e99fc1c

Fix ESLint issues

205a215

tyom mentioned this pull request Oct 29, 2025

Set up ESLint for the repo and fix lint issues #276

Closed

mattpocock force-pushed the v1 branch 4 times, most recently from a5f098c to 8c4667c Compare November 8, 2025 13:51
