Add image generation to chat UI #2708

Open
vibegui wants to merge 6 commits into main from vibegui/image-gen-chat

@vibegui
Contributor

@vibegui vibegui commented Mar 15, 2026

What is this contribution about?

Implements image generation as a first-class feature in the chat UI. Adds an "Image" toggle button near the model selector that switches to image generation mode. When active, the model selector filters to only image-capable models (like OpenRouter's Nano Banana 2), aspect ratio controls appear (1:1, 16:9, 9:16), and the user's text becomes an image prompt. Images are generated using the AI SDK's generateImage() with OpenRouter's image models and rendered inline in chat with rounded corners, a fade-in animation, and a download button on hover.

How to Test

  1. Start dev: bun run dev
  2. Click the image button (Image01 icon) in the chat input action row
  3. Verify the model selector filters to image-capable models only
  4. Select an aspect ratio from the inline picker
  5. Type a prompt like "a cat wearing a top hat" and send
  6. Verify image appears inline with rounded corners and fade-in animation
  7. Hover over the image and click the download button to save
  8. Toggle image mode off and verify controls return to normal
  9. Refresh the page and navigate back to the thread to verify images persist

Migration Notes

No database migrations required. Images are stored as base64 in message threads, same as existing file attachments.

Review Checklist

  • PR title is clear and descriptive
  • Changes are tested and working
  • bun run fmt and bun run check pass
  • No breaking changes
  • Image generation works with OpenRouter models including Nano Banana 2

🤖 Generated with Claude Code


Summary by cubic

Adds image generation to chat via a dedicated image model picker and a generate_image built-in tool. Pick a model and aspect ratio to generate images inline with a download button; the server streams images as file parts with metrics, observability, and safety checks.

  • New Features

    • Image model picker next to the model control; list filters to image-generation models with a new capability icon; works with OpenRouter models.
    • Aspect ratio chips (1:1, 16:9, 9:16); hides file upload while active; clear to return to text-only; selection resets on new/switch thread and isn’t persisted.
    • Assistant renders streamed image file parts inline with fade-in and a hover download button.
    • Server/agents: generate_image calls generateImage() and writes a file part; StreamRequestSchema adds imageModel { id, aspectRatio }; stream-core sends image config and adds a system hint when an image model is selected.
    • Providers/SDK: add image-generation capability to OpenRouter mappings and MODEL_CAPABILITIES for accurate filtering.
  • Bug Fixes

    • Server: guard for image support, validate aspectRatio enum (incl. 4:3, 3:4), allowlist mediaType, add monitorLlmCall, prevent double FINISH, and return a friendly error on failure.
    • UI: disable the image button when no image models, fix Firefox downloads/extension parsing, and trim parenthetical suffix from compact model names.
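The aspect ratio validation mentioned in the summary can be illustrated with a minimal sketch. The list of ratios comes from the summary above; the helper name `parseAspectRatio` is hypothetical, and the PR itself presumably enforces this in `StreamRequestSchema` rather than with a hand-written function. Plain TypeScript is used here only to keep the example dependency-free.

```typescript
// Ratios named in the summary above: 1:1, 16:9, 9:16, 4:3, 3:4.
const ASPECT_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4"] as const;
type AspectRatio = (typeof ASPECT_RATIOS)[number];

// Hypothetical helper: narrows an untrusted value to a supported ratio,
// returning undefined for anything outside the enum.
function parseAspectRatio(value: unknown): AspectRatio | undefined {
  if (typeof value !== "string") return undefined;
  const known = (ASPECT_RATIOS as readonly string[]).indexOf(value) !== -1;
  return known ? (value as AspectRatio) : undefined;
}
```

Rejecting unknown values at the request boundary means a free-form string never reaches the image provider.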

Written for commit 1b0bcbb. Summary will update on new commits.

@github-actions
Contributor

🧪 Benchmark

Should we run the Virtual MCP strategy benchmark for this PR?

React with 👍 to run the benchmark.

| Reaction | Action |
|---|---|
| 👍 | Run quick benchmark (10 & 128 tools) |

Benchmark will run on the next push after you react.

@github-actions
Contributor

github-actions bot commented Mar 15, 2026

Release Options

Should a new version be published when this PR is merged?

React with an emoji to vote on the release type:

| Reaction | Type | Next Version |
|---|---|---|
| 👍 | Prerelease | 2.181.4-alpha.1 |
| 🎉 | Patch | 2.181.4 |
| ❤️ | Minor | 2.182.0 |
| 🚀 | Major | 3.0.0 |

Current version: 2.181.3

Deployment

  • Deploy to production (triggers ArgoCD sync after Docker image is published)

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

5 issues found across 11 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/src/api/routes/decopilot/schemas.ts">

<violation number="1" location="apps/mesh/src/api/routes/decopilot/schemas.ts:91">
P2: Restrict `imageMode.aspectRatio` to the supported ratio values instead of accepting any string.</violation>
</file>

<file name="apps/mesh/src/web/components/chat/input.tsx">

<violation number="1" location="apps/mesh/src/web/components/chat/input.tsx:527">
P2: Hiding the upload button in image mode does not disable drag-and-drop uploads, because the editor still mounts `FileUploader`. That leaves image mode accepting files that the backend ignores or rejects.</violation>
</file>

<file name="apps/mesh/src/web/components/chat/select-model.tsx">

<violation number="1" location="apps/mesh/src/web/components/chat/select-model.tsx:783">
P1: Filtering the selector in image mode does not enforce an image-capable selected model, so image requests can still be sent with the previously selected text model.</violation>
</file>

<file name="apps/mesh/src/api/routes/decopilot/stream-core.ts">

<violation number="1" location="apps/mesh/src/api/routes/decopilot/stream-core.ts:241">
P1: Validate image-model support on the server before calling `imageModel()`. Otherwise invalid image-mode requests fail at runtime after the message has already been saved.</violation>

<violation number="2" location="apps/mesh/src/api/routes/decopilot/stream-core.ts:318">
P2: Handle aborted image requests before marking the run failed. As written, cancelling image generation is recorded as a failed thread.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
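The abort-handling issue flagged for `stream-core.ts` can be sketched as a small classifier that runs before a thread is marked failed. This is only an illustration: `classifyImageError`, `FailureKind`, and `AbortLike` are hypothetical names, and the check assumes cancellations surface either as an aborted signal or as an error named `AbortError`.

```typescript
type FailureKind = "cancelled" | "failed";

// Minimal stand-in for an AbortSignal; only the `aborted` flag is needed.
interface AbortLike {
  readonly aborted: boolean;
}

// Hypothetical helper: decide whether a thrown error is a user cancellation
// or a real failure of the image request.
function classifyImageError(err: unknown, signal: AbortLike): FailureKind {
  const isAbortError = err instanceof Error && err.name === "AbortError";
  return signal.aborted || isAbortError ? "cancelled" : "failed";
}
```

With a check like this in the catch path, a cancelled image request need not be recorded as a failed run.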

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

4 issues found across 9 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/src/web/components/chat/image-mode-toggle.tsx">

<violation number="1" location="apps/mesh/src/web/components/chat/image-mode-toggle.tsx:51">
P1: Guard enabling image mode until an image-capable model is available; otherwise image requests can be sent with the current text model and fail at runtime.</violation>
</file>

<file name="apps/mesh/src/web/components/chat/store/chat-store.ts">

<violation number="1" location="apps/mesh/src/web/components/chat/store/chat-store.ts:432">
P1: Preserve the current `credentialId` when auto-selecting an image model; otherwise the stored model can lose its connection and later be sent with the wrong key.

(Based on your team's feedback about treating the chat model and credential as an atomic pair.) [FEEDBACK_USED]</violation>

<violation number="2" location="apps/mesh/src/web/components/chat/store/chat-store.ts:438">
P2: Don’t enter image mode unless an image-capable model is available; right now the store can keep the old text model selected and send an invalid image request.</violation>
</file>

<file name="apps/mesh/src/web/components/chat/select-model.tsx">

<violation number="1" location="apps/mesh/src/web/components/chat/select-model.tsx:788">
P1: Guard the image-mode model list against connections that have no image-generation models. As written, image mode can be enabled with an empty selector while requests still use the previous non-image model and fail.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
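Two of the store issues above (keeping `credentialId` paired with the model, and refusing to enter image mode without an image-capable model) can be shown together in a rough sketch. The types and the `pickImageModel` helper are hypothetical and do not reflect the actual shape of `chat-store.ts`; the sketch also assumes `available` lists only models reachable through the current credential.

```typescript
interface ModelOption {
  id: string;
  capabilities: string[];
}

interface SelectedModel {
  id: string;
  credentialId: string;
}

// Hypothetical helper: returns the new selection, or null when no
// image-capable model exists and image mode must not be entered.
function pickImageModel(
  current: SelectedModel,
  available: ModelOption[],
): SelectedModel | null {
  for (const option of available) {
    if (option.capabilities.indexOf("image-generation") !== -1) {
      // Carry the credential over so model and credential stay an atomic pair.
      return { id: option.id, credentialId: current.credentialId };
    }
  }
  return null;
}
```

A caller would leave the current selection and mode untouched whenever the result is `null`.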

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

4 issues found across 7 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/src/web/components/chat/store/chat-store.ts">

<violation number="1" location="apps/mesh/src/web/components/chat/store/chat-store.ts:217">
P2: Switching threads while image mode is active loses the previously selected text model. `setActiveThread()` clears `_previousModel` and disables `imageMode` directly, so the temporary image model stays selected in normal chat instead of being restored.</violation>
</file>

<file name="apps/mesh/src/api/routes/decopilot/stream-core.ts">

<violation number="1" location="apps/mesh/src/api/routes/decopilot/stream-core.ts:272">
P2: Move the new `monitorLlmCall` success/error reporting so only the `generateImage()` result determines model success; otherwise write failures can produce contradictory monitoring events.</violation>

<violation number="2" location="apps/mesh/src/api/routes/decopilot/stream-core.ts:291">
P1: Do not relabel unsupported image bytes as `image/png`; reject the type or preserve the original MIME type.</violation>
</file>

<file name="apps/mesh/src/web/components/chat/image-mode-toggle.tsx">

<violation number="1" location="apps/mesh/src/web/components/chat/image-mode-toggle.tsx:50">
P3: The new availability check disables the button without applying the disabled styles, so it still looks clickable when no image-capable models are available.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
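The media type issue above asks for unsupported bytes to be rejected rather than relabeled as `image/png`. A minimal sketch of that behavior follows. The allowlist contents are an assumption, since the PR's actual list is not shown in this thread, and both helper names are hypothetical.

```typescript
// Assumed allowlist; the real list used by the PR is not visible here.
const ALLOWED_IMAGE_TYPES = ["image/png", "image/jpeg", "image/webp"] as const;
type AllowedImageType = (typeof ALLOWED_IMAGE_TYPES)[number];

// Hypothetical helper: throws on anything outside the allowlist instead of
// silently rewriting it to a supported type.
function assertAllowedImageType(mediaType: string): AllowedImageType {
  const normalized = mediaType.trim().toLowerCase();
  if ((ALLOWED_IMAGE_TYPES as readonly string[]).indexOf(normalized) === -1) {
    throw new Error(`Unsupported image media type: ${mediaType}`);
  }
  return normalized as AllowedImageType;
}

// Hypothetical helper: file extension for a download name.
function extensionFor(mediaType: AllowedImageType): string {
  return mediaType === "image/jpeg" ? "jpg" : mediaType.slice("image/".length);
}
```

Deriving the download extension from the validated type keeps the stored MIME type and the saved file name consistent.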

await saveMessagesToThread(requestMessage);

// ================================================================
// Image generation mode — skip MCP/tool setup, call generateImage
Collaborator

image generation should be an innate tool, not an if block that duplicates existing code

Contributor Author

Refactored — image generation is now a generate_image built-in tool (like subtask or sandbox). The ~195-line if-block is gone. The image model is selected via a separate picker and passed to the tool, while the language model handles the agentic loop.

@vibegui vibegui force-pushed the vibegui/image-gen-chat branch from 44b1872 to c2abc8c Compare March 19, 2026 21:15
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 12 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/src/api/routes/decopilot/stream-core.ts">

<violation number="1" location="apps/mesh/src/api/routes/decopilot/stream-core.ts:330">
P2: Guard the image-generation prompt the same way the tool registration is guarded; otherwise the model can be told to call `generate_image` when the tool is not available for the selected provider.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
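One way to address the issue above is to derive both the tool registration and the system hint from a single predicate, so the two cannot drift apart. The sketch below is illustrative only: `ProviderInfo`, `AgentSetup`, `imageToolEnabled`, `buildAgentSetup`, and the hint wording are hypothetical, and only the `generate_image` tool name and the `{ id, aspectRatio }` shape come from this PR.

```typescript
interface ImageModelConfig {
  id: string;
  aspectRatio: string;
}

// Hypothetical provider descriptor.
interface ProviderInfo {
  supportsImageGeneration: boolean;
}

interface AgentSetup {
  toolNames: string[];
  systemPrompt: string;
}

// Single source of truth for "is generate_image usable in this request?".
function imageToolEnabled(provider: ProviderInfo, imageModel?: ImageModelConfig): boolean {
  return provider.supportsImageGeneration && imageModel !== undefined;
}

function buildAgentSetup(
  provider: ProviderInfo,
  basePrompt: string,
  imageModel?: ImageModelConfig,
): AgentSetup {
  const enabled = imageToolEnabled(provider, imageModel);
  // The tool and the hint are added together or not at all.
  const toolNames: string[] = enabled ? ["generate_image"] : [];
  const systemPrompt = enabled
    ? `${basePrompt}\nAn image model is selected; call generate_image to create images.`
    : basePrompt;
  return { toolNames, systemPrompt };
}
```

With this shape, the model is never told about a tool that was not registered for the selected provider.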

@vibegui vibegui force-pushed the vibegui/image-gen-chat branch from d275e55 to 3430cc5 Compare March 19, 2026 22:41
vibegui and others added 6 commits March 19, 2026 21:43
Implement image generation in the chat UI using OpenRouter's image models through the AI SDK.
Add an "Image" toggle button that filters the model selector to image-capable models, shows an
aspect ratio picker, and generates images inline in chat. Generated images are stored
as base64 in message threads and render with download-on-hover functionality.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

Move Image button to right side near model picker for stable positioning,
add "image-generation" capability to distinguish output from input modalities,
auto-select Gemini model when entering image mode, and filter model picker
to only show image generation models.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add server-side capability guard before calling imageModel()
- Validate aspectRatio as enum instead of free-form string
- Allowlist mediaType from provider response (prevent injection)
- Add monitorLlmCall to image path for observability parity
- Add streamFinished guard to prevent double FINISH dispatch
- Reset imageMode on thread switch, clear _previousModel on reset
- Disable Image toggle when no image models available
- Fix Firefox download (append anchor to DOM before click)
- Clean up mediaType extension parsing for downloads
- Revert unrelated conductor.json change
- Add IMAGE-GEN-FOLLOWUPS.md tracking deferred items

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
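The `streamFinished` guard from the commit above can be pictured as a run-once wrapper around the FINISH dispatch. This is a sketch of the general pattern, not the code in the PR; `createFinishGuard` and `dispatchFinish` are hypothetical names.

```typescript
// Wraps a FINISH dispatcher so that it runs at most once, even if both the
// success path and an error path try to finish the stream.
function createFinishGuard(dispatchFinish: () => void): () => boolean {
  let streamFinished = false;
  return () => {
    if (streamFinished) return false; // already finished, skip the second call
    streamFinished = true;
    dispatchFinish();
    return true;
  };
}
```

The returned function reports whether the call actually dispatched, which can help when logging duplicate attempts.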
… chat

- Never persist image model to localStorage so refresh always restores text model
- Reset image mode and restore text model on createThread and setActiveThread
- Strip parenthetical suffix from model names in compact trigger display

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
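Stripping a parenthetical suffix for the compact trigger display could look roughly like the following. The function name `compactModelName` and the sample model names are made up for illustration, and the regular expression is one possible reading of "parenthetical suffix" (a single trailing group in parentheses).

```typescript
// Removes one trailing "(...)" group, e.g. "Example Model (Preview)" becomes
// "Example Model". Falls back to the original name if nothing else remains.
function compactModelName(displayName: string): string {
  const stripped = displayName.replace(/\s*\([^()]*\)\s*$/, "").trim();
  return stripped.length > 0 ? stripped : displayName;
}
```

Parentheses in the middle of a name are left alone, since only a suffix at the end of the string matches.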
Instead of raw "No image generated" error, show a clear message
explaining that image mode is for generating images.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Replace the ~195-line `if (input.imageMode)` block in stream-core.ts with a
`generate_image` built-in tool that runs inside the normal streamText agentic
loop. The image model is now selected via a dedicated picker (separate from
the language model selector), and the tool handles generateImage() calls,
metrics, and error handling internally.

Key changes:
- New `generate_image` built-in tool following subtask/sandbox pattern
- New `ImageModelSelector` component replaces `ImageModeToggle`
- Selecting an image model enables image mode; clearing exits it
- Language model stays selected for streamText; image model is separate
- Remove model save/restore (_previousModel) logic from chat store
- Remove imageMode filtering from text model selector

Addresses PR review feedback from @pedrofrxncx.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vibegui vibegui force-pushed the vibegui/image-gen-chat branch from 3430cc5 to 1b0bcbb Compare March 20, 2026 01:08
2 participants