Add custom evaluators tutorial and extend evaluation docs #584
📝 Walkthrough
This PR adds comprehensive documentation for custom evaluators in AMP Console, extends evaluation concepts with tabbed interfaces and reorganized evaluator categorization, enhances the evaluation monitors tutorial with score breakdown visibility, and updates sidebar navigation and version constants to reflect the v0.9.x release.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
🚥 Pre-merge checks: ✅ 5 passed
333aab2 to 4fbdbd0
🧹 Nitpick comments (2)
website/versioned_docs/version-v0.9.x/concepts/evaluation.mdx (1)
140-148: Consider varying sentence structure for readability.
Three consecutive bullet points begin with "Was" (lines 142-144). While the parallel structure is intentional for a list, varying the phrasing slightly could improve flow.
📝 Suggested rewording
-- *Was this LLM response safe and free of harmful content?*
-- *Was the tone appropriate for the context?*
-- *Was the response coherent and well-structured?*
+- *Is this LLM response safe and free of harmful content?*
+- *Does the tone fit the context?*
+- *Is the response coherent and well-structured?*
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@website/versioned_docs/version-v0.9.x/concepts/evaluation.mdx` around lines 140 - 148, The three consecutive bullets beginning "Was this LLM response...", "Was the tone...", and "Was the response..." (in the "Evaluates **each individual LLM call** within the trace." section) should vary phrasing to improve readability; edit the three bullet lines to maintain the same evaluation meaning but change sentence starts (for example: "Is the LLM response safe and free of harmful content?", "Does the tone fit the context?", "Is the response coherent and well-structured?") while preserving parallelism and the final bullet about cost efficiency.
website/versioned_docs/version-v0.9.x/tutorials/custom-evaluators.mdx (1)
23-27: Tighten repeated imperative phrasing in the navigation steps.
On Line 24–Line 26, three consecutive steps start with “Click,” which reads a bit repetitive. Consider varying one or two verbs (e.g., “Open”, “Select”) for smoother flow.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@website/versioned_docs/version-v0.9.x/tutorials/custom-evaluators.mdx` around lines 23 - 27, Change the repetitive "Click" verbs in the navigation steps to improve flow: replace "Click the **Evaluation** tab" with something like "Open the **Evaluation** tab", change "Click the **Evaluators** sub-tab" to "Select the **Evaluators** sub-tab", and keep or rephrase "Click **Create Evaluator**" to "Create **Evaluator**" or "Click **Create Evaluator**" as preferred so the three consecutive steps no longer all start with "Click"; update the three lines containing those exact phrases in version-v0.9.x/tutorials/custom-evaluators.mdx accordingly.
ℹ️ Review info
⛔ Files ignored due to path filters (9)
- website/versioned_docs/version-v0.9.x/img/evaluation/custom-eval-basic-details.png is excluded by !**/*.png
- website/versioned_docs/version-v0.9.x/img/evaluation/custom-eval-code-details.png is excluded by !**/*.png
- website/versioned_docs/version-v0.9.x/img/evaluation/custom-eval-code-editor.png is excluded by !**/*.png
- website/versioned_docs/version-v0.9.x/img/evaluation/custom-eval-list.png is excluded by !**/*.png
- website/versioned_docs/version-v0.9.x/img/evaluation/custom-eval-llm-judge-editor.png is excluded by !**/*.png
- website/versioned_docs/version-v0.9.x/img/evaluation/monitor-dashboard.png is excluded by !**/*.png
- website/versioned_docs/version-v0.9.x/img/evaluation/run-logs.png is excluded by !**/*.png
- website/versioned_docs/version-v0.9.x/img/evaluation/span-scores-tab.png is excluded by !**/*.png
- website/versioned_docs/version-v0.9.x/img/evaluation/traces-table-scores.png is excluded by !**/*.png
📒 Files selected for processing (5)
- website/versioned_docs/version-v0.9.x/_constants.md
- website/versioned_docs/version-v0.9.x/concepts/evaluation.mdx
- website/versioned_docs/version-v0.9.x/tutorials/custom-evaluators.mdx
- website/versioned_docs/version-v0.9.x/tutorials/evaluation-monitors.mdx
- website/versioned_sidebars/version-v0.9.x-sidebars.json
✅ Files skipped from review due to trivial changes (3)
- website/versioned_sidebars/version-v0.9.x-sidebars.json
- website/versioned_docs/version-v0.9.x/_constants.md
- website/versioned_docs/version-v0.9.x/tutorials/evaluation-monitors.mdx
Closes #583
Summary
Summary by CodeRabbit