design: Add 0004-multimodal-i2t proposal by sangminwoo · Pull Request #674 · strands-agents/docs

sangminwoo · 2026-03-17T23:53:51Z

Description

Add design doc for multimodal image-to-text evaluation support in strands-evals SDK.

Introduces MultimodalOutputEvaluator extending OutputEvaluator to enable MLLM-as-a-Judge evaluation for multimodal tasks, starting with image/document-to-text. The evaluator constructs multimodal prompts using strands SDK ContentBlock format and supports both reference-free and reference-based evaluation with automatic rubric selection across four dimensions: Overall Quality (P0), Correctness (P0), Faithfulness (P1), and Instruction Following (P1).
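As a rough illustration of the multimodal prompt described above, the sketch below assembles a judge prompt as a list of content blocks. It is not taken from the design doc: `build_judge_prompt` and its parameters are hypothetical names, and the blocks are plain dicts that only assume a ContentBlock-like shape (`{"text": ...}` and `{"image": {"format": ..., "source": {"bytes": ...}}}`).

```python
from typing import Any, Optional


def build_judge_prompt(
    image_bytes: bytes,
    image_format: str,
    instruction: str,
    output: str,
    rubric: str,
    expected_output: Optional[str] = None,
) -> list[dict[str, Any]]:
    """Assemble a judge prompt as a list of ContentBlock-like dicts.

    Hypothetical helper: the dict shapes are an assumption about the
    ContentBlock format, not the evaluator's actual implementation.
    """
    blocks: list[dict[str, Any]] = [
        {"text": f"Rubric:\n{rubric}"},
        # The media under evaluation is passed alongside the text blocks.
        {"image": {"format": image_format, "source": {"bytes": image_bytes}}},
        {"text": f"Instruction:\n{instruction}"},
        {"text": f"Response to evaluate:\n{output}"},
    ]
    # Reference-based evaluation: add the expected output only when given.
    if expected_output is not None:
        blocks.append({"text": f"Reference answer:\n{expected_output}"})
    return blocks
```

In this sketch the reference-free and reference-based cases differ only in whether the final reference block is appended.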

Key design decisions:

Extends OutputEvaluator with same Agent.__call__ invocation pattern (accepts both str and list[ContentBlock])
Automatic reference-based rubric selection via _select_rubric() when expected_output is provided
InputT=MultimodalInput (TypedDict) carries {"media": ImageData/AnyMediaData, "instruction": str} (modality-generic naming for future extensibility)
ImageData supports file paths, base64, data URLs, HTTP URLs (auto-fetched via urllib.request), S3 URIs (auto-fetched via boto3), bytes, and PIL Images
Built-in rubric templates + convenience subclasses per dimension
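The decisions above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the proposed implementation: `select_rubric`, `classify_source`, and the `RUBRICS` table are hypothetical stand-ins (the PR names `_select_rubric()` but does not show its signature here), the rubric strings are placeholders, and `media` is typed as `Any` in place of `ImageData`/`AnyMediaData`.

```python
from typing import Any, Optional, TypedDict


class MultimodalInput(TypedDict):
    """Input shape named in the PR; `media` is typed loosely for this sketch."""
    media: Any  # stand-in for ImageData / AnyMediaData
    instruction: str


# Placeholder rubric text keyed by dimension and variant (hypothetical table).
RUBRICS = {
    "correctness": {
        "reference_free": "Judge correctness from the media and instruction alone.",
        "reference_based": "Judge correctness against the reference answer.",
    },
}


def select_rubric(dimension: str, expected_output: Optional[str]) -> str:
    """Pick the reference-based rubric only when an expected output exists."""
    variant = "reference_based" if expected_output is not None else "reference_free"
    return RUBRICS[dimension][variant]


def classify_source(media: Any) -> str:
    """Classify a media source by its form, before any fetching or decoding."""
    if isinstance(media, (bytes, bytearray)):
        return "bytes"
    if not isinstance(media, str):
        return "object"  # e.g. an in-memory image object such as a PIL Image
    if media.startswith("data:"):
        return "data_url"
    if media.startswith(("http://", "https://")):
        return "http_url"
    if media.startswith("s3://"):
        return "s3_uri"
    # A bare string may be a file path or raw base64; telling them apart
    # needs further checks that this sketch leaves out.
    return "path_or_base64"
```

For example, `classify_source("s3://bucket/key.png")` returns `"s3_uri"`, and `select_rubric("correctness", None)` returns the reference-free placeholder. A real loader would then dispatch on the classification to fetch or decode the media.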

Related Issues

strands-agents/evals Issue #128

Type of Change

New content

Checklist

I have read the CONTRIBUTING document
My changes follow the project's documentation style
I have tested the documentation locally using npm run dev
Links in the documentation are valid and working

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

designs/0004-multimodal-i2t-evaluation.md


sangminwoo · 2026-04-06T22:11:42Z

Hi @afarntrog, this PR is ready for the final review. I've updated the design doc to reflect your comments: 1/ added support for remote URIs/URLs and 2/ broadened the multimodality class to accommodate future media types. Would appreciate an approval when you get a chance so we can get this merged.

design: Add 0004-multimodal-i2t proposal

920914b

sangminwoo requested a deployment to manual-approval March 17, 2026 23:54 — with GitHub Actions Waiting

sangminwoo had a problem deploying to manual-approval March 17, 2026 23:54 — with GitHub Actions Error

sangminwoo marked this pull request as draft March 18, 2026 00:08

sangminwoo marked this pull request as ready for review March 18, 2026 00:08

sangminwoo requested a deployment to manual-approval March 18, 2026 00:12 — with GitHub Actions Waiting

afarntrog reviewed Mar 19, 2026

designs/0004-multimodal-i2t-evaluation.md (outdated, resolved)

support for remote image sources

a8b8d7d

sangminwoo requested a deployment to manual-approval March 21, 2026 00:30 — with GitHub Actions Waiting

sangminwoo had a problem deploying to manual-approval March 21, 2026 00:30 — with GitHub Actions Error

Add overall quality evaluator

1c10b8e

sangminwoo had a problem deploying to manual-approval March 23, 2026 22:45 — with GitHub Actions Error

sangminwoo requested a deployment to manual-approval March 23, 2026 22:45 — with GitHub Actions Waiting

Fix correctness/faithfulness scale to binary

0dd14b1

sangminwoo requested a deployment to manual-approval March 27, 2026 00:26 — with GitHub Actions Waiting

sangminwoo had a problem deploying to manual-approval March 27, 2026 00:26 — with GitHub Actions Error

afarntrog previously approved these changes Apr 1, 2026

designs/0004-multimodal-i2t-evaluation.md (outdated, resolved)

Update media key naming, Agent invocation pattern, and reference rubric selection

d1a2354

sangminwoo dismissed afarntrog’s stale review via d1a2354 April 6, 2026 22:00

sangminwoo requested a deployment to manual-approval April 6, 2026 22:00 — with GitHub Actions Waiting

sangminwoo mentioned this pull request Apr 7, 2026

feat: add multimodal evaluators and prompt templates for image-to-text evaluation strands-agents/evals#187

Open

7 tasks

sangminwoo wants to merge 5 commits into strands-agents:main from sangminwoo:main

2 participants
