Skip to content

Extract video URLs and thumbnails from proposal iframes#28

Merged
microstudi merged 10 commits intomainfrom
extract-video
Feb 25, 2026
Merged

Extract video URLs and thumbnails from proposal iframes#28
microstudi merged 10 commits intomainfrom
extract-video

Conversation

@microstudi
Copy link
Member

@microstudi microstudi commented Feb 25, 2026

Summary

This PR implements issue #16 - Use embedded video iframes in proposal details (WhatsApp interactive reply).

Changes

1. Video URL & Thumbnail Extraction

  • Created VideoEmbedExtractor service class to parse iframe URLs from proposal bodies
  • Supports YouTube and Vimeo video embeds
  • Extracts video watch URLs and thumbnail images
  • YouTube thumbnails use the public image CDN (no API required)
  • Vimeo thumbnail extraction noted for future enhancement (requires API call)

2. Updated ProposalsWorkflow

  • Detects video iframes in proposal bodies
  • Displays video URL as clickable link with 🎥 emoji in message body
  • Uses video thumbnail as header image (when available)
  • Falls back to proposal photo if no thumbnail available
  • Dynamic body length calculation to ensure messages stay within WhatsApp's 1024 character limit

3. Comprehensive Tests

  • Unit tests for VideoEmbedExtractor covering YouTube, Vimeo, and edge cases
  • Workflow tests for proposals with/without video iframes
  • Tests verify thumbnail URLs and body text formatting
  • Tests confirm 1024 character limit is respected

Technical Notes

  • Video URLs are displayed as clickable links in the message body (WhatsApp will auto-link them)
  • Cannot send YouTube/Vimeo URLs directly as video headers (WhatsApp requires .mp4/.3gpp files)
  • Thumbnail extraction is legal and uses public APIs
  • Body length calculation accounts for title, video URL, and proposal URL lengths

Testing

All tests pass and cover:

  • ✅ YouTube iframe extraction and thumbnail URLs
  • ✅ Vimeo iframe extraction (thumbnail fallback)
  • ✅ Multiple video formats (standard embed, short URLs, with query params)
  • ✅ Proposals without videos maintain existing behavior
  • ✅ 1024 character body limit enforcement

Closes #16

Summary by CodeRabbit

  • New Features

    • Proposals now detect embedded YouTube/Vimeo videos and show video URLs and thumbnails in carousel cards and details; body text is dynamically truncated to fit limits.
    • WhatsApp messages now allow video headers (falling back to image or text) and use a standardized footer text format.
  • Tests

    • Added comprehensive tests for video extraction and message formatting covering YouTube, Vimeo, and mixed content.

Copilot AI review requested due to automatic review settings February 25, 2026 18:26
@coderabbitai
Copy link

coderabbitai bot commented Feb 25, 2026

📝 Walkthrough

Walkthrough

Adds a VideoEmbedExtractor service to parse iframe HTML for YouTube and Vimeo, integrates video detection into the proposals workflow to include video URLs and thumbnails in WhatsApp messages, updates WhatsApp interactive envelope header handling to support video, and adds tests for extractor and workflow behaviors.

Changes

Cohort / File(s) Summary
Video Extraction Infrastructure
app/services/decidim/chatbot/media/video_embed_extractor.rb
New service that parses iframe HTML to extract canonical YouTube and Vimeo URLs, exposes url, valid?, and thumbnail_url; YouTube thumbnail generation implemented, Vimeo thumbnail left nil (API required).
WhatsApp Envelope Enhancement
app/services/decidim/chatbot/providers/whatsapp/envelopes/interactive_buttons.rb
Header construction now prioritizes video → image → text and sets header[:video] = { link: ... } when present; footer wrapped as { text: ... }; cleanup deletes interactive[:header] when header content is blank.
Proposals Workflow Integration
app/services/decidim/chatbot/workflows/proposals_workflow.rb
Uses VideoEmbedExtractor in carousel cards and proposal details: extracts video thumbnail for card images, includes video URL in detail body (prefixed), computes dynamic max body length via new calculate_max_body_length(video), and adjusts header_image selection accordingly.
Video Extractor Tests
spec/services/media/video_embed_extractor_spec.rb
Adds comprehensive RSpec coverage for blank/missing iframes, various YouTube and Vimeo embed patterns (including short URLs and query params), multiple iframes, mixed content, URL normalization, and thumbnail expectations.
Proposals Workflow Tests
spec/services/workflows/proposals_workflow_spec.rb
Adds contexts verifying proposal detail behavior for YouTube iframe (video URL and thumbnail header), Vimeo iframe (video URL and image fallback), and no iframe (existing interactive_buttons behavior), and asserts 1024-character body limits.

Sequence Diagram

sequenceDiagram
    participant ProposalsWorkflow as ProposalsWorkflow
    participant VideoExtractor as VideoEmbedExtractor
    participant Envelope as WhatsAppEnvelope
    participant Adapter as Adapter

    ProposalsWorkflow->>VideoExtractor: initialize(proposal.body)
    VideoExtractor->>VideoExtractor: parse HTML, find first iframe
    VideoExtractor->>VideoExtractor: detect provider (YouTube/Vimeo), extract ID
    VideoExtractor-->>ProposalsWorkflow: url, valid?, thumbnail_url

    alt Video Found
        ProposalsWorkflow->>ProposalsWorkflow: calculate_max_body_length(video)
        ProposalsWorkflow->>Envelope: build interactive_reply (header.video, body with URL, footer)
        Envelope->>Envelope: prioritize video > image > text
        Envelope-->>Adapter: send interactive_reply payload
    else No Video
        ProposalsWorkflow->>Envelope: build interactive_buttons (header.image/text, body, footer)
        Envelope-->>Adapter: send interactive_buttons payload
    end

    Adapter-->>User: WhatsApp message (video or standard)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Poem

🐰 I chew through iframes, tail a-flutter bright,
I pull video links into WhatsApp light,
Thumbnails bloom where images once stood,
Short URLs and embeds parsed as they should,
Hooray — a rabbit hops, sharing videos tonight!

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning The PR includes out-of-scope changes to interactive_buttons.rb envelope handling that were not required by issue #16. Remove or justify changes to app/services/decidim/chatbot/providers/whatsapp/envelopes/interactive_buttons.rb as they fall outside the scope of issue #16.
Docstring Coverage ⚠️ Warning Docstring coverage is 29.41% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately captures the main changes: extracting video URLs and thumbnails from proposal iframes.
Linked Issues check ✅ Passed The PR implements all objectives from issue #16: VideoEmbedExtractor class created, integrated into ProposalsWorkflow, sanitize_text unchanged, and comprehensive tests added.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch extract-video

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for extracting and displaying video URLs from YouTube and Vimeo iframes embedded in proposal bodies. When a proposal contains a video, the chatbot displays a clickable video link with emoji and uses the video thumbnail as the header image in WhatsApp messages.

Changes:

  • Added VideoEmbedExtractor service to parse YouTube and Vimeo iframe embed URLs from HTML
  • Updated ProposalsWorkflow to detect videos, display URLs in message body, and use video thumbnails
  • Enhanced InteractiveButtons envelope to support video headers (though not actively used)
  • Added comprehensive test coverage for video extraction and workflow integration

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
app/services/decidim/chatbot/media/video_embed_extractor.rb New service class that extracts video URLs and thumbnails from YouTube/Vimeo iframes using regex patterns
app/services/decidim/chatbot/workflows/proposals_workflow.rb Integrates video detection in carousel cards and proposal details; implements dynamic body length calculation to respect 1024 char limit
app/services/decidim/chatbot/providers/whatsapp/envelopes/interactive_buttons.rb Adds header_video support and refactors header/footer conditional logic
spec/services/media/video_embed_extractor_spec.rb Comprehensive unit tests for VideoEmbedExtractor covering YouTube, Vimeo, and edge cases
spec/services/workflows/proposals_workflow_spec.rb Integration tests verifying video URL display, thumbnail usage, and character limit compliance

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +98 to +99
# Return available space for body text, with minimum of 100 chars
[total_limit - reserved_space, 100].max
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The minimum body length of 100 characters can cause the total message to exceed the 1024 character limit. If reserved_space is greater than 924 (1024 - 100), forcing a minimum of 100 characters for body text will cause the total to exceed 1024. For example, if reserved_space is 950, the body gets 100 characters, totaling 1050 characters. The max should either be removed, or the logic should be: if reserved_space >= total_limit, return 0 to prevent exceeding the limit.

Suggested change
# Return available space for body text, with minimum of 100 chars
[total_limit - reserved_space, 100].max
# Return available space for body text, ensuring we never exceed the total limit
return 0 if reserved_space >= total_limit
total_limit - reserved_space

Copilot uses AI. Check for mistakes.
Comment on lines 60 to 91
@@ -67,6 +82,23 @@ def send_proposal_details
)
end

# Calculate maximum body length dynamically to stay within 1024 char limit
def calculate_max_body_length(video)
# WhatsApp body text limit is 1024 characters
total_limit = 1024

# Calculate fixed overhead
title_overhead = sanitize_text(proposal.title, 100).length + 6 # "*title*\n\n"
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method calls sanitize_text(proposal.title, 100) twice: once in calculate_max_body_length (line 91) to calculate overhead, and once in send_proposal_details (line 60) for the actual title_text. This is inefficient. Consider passing title_text as a parameter to calculate_max_body_length, or memoizing the result.

Copilot uses AI. Check for mistakes.
Comment on lines 86 to 100
def calculate_max_body_length(video)
# WhatsApp body text limit is 1024 characters
total_limit = 1024

# Calculate fixed overhead
title_overhead = sanitize_text(proposal.title, 100).length + 6 # "*title*\n\n"
video_overhead = video.valid? ? video.url.length + 6 : 0 # "🎥 url\n\n" (emoji is 2 bytes)
proposal_url_overhead = resource_url(proposal).length + 2 # "\n\nurl"

# Reserve space for newlines and formatting
reserved_space = title_overhead + video_overhead + proposal_url_overhead

# Return available space for body text, with minimum of 100 chars
[total_limit - reserved_space, 100].max
end
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The calculate_max_body_length method accesses the proposal instance variable directly (lines 91, 93), making it harder to test in isolation and less clear about its dependencies. Consider refactoring to accept all needed values as parameters: calculate_max_body_length(video, title_text, proposal_url) for better encapsulation and testability.

Copilot uses AI. Check for mistakes.
microstudi and others added 5 commits February 25, 2026 19:37
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ractive_buttons.rb

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (5)
spec/services/media/video_embed_extractor_spec.rb (2)

150-160: www.player.vimeo.com is not a real Vimeo embed hostname — this test covers an impossible URL.

Vimeo's embed URL scheme is always https://player.vimeo.com/video/<id> without a www. prefix. www.player.vimeo.com does not resolve in practice. This test validates pattern matching for a URL that Decidim's rich text editor will never produce, giving false confidence in the extractor's real-world coverage.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@spec/services/media/video_embed_extractor_spec.rb` around lines 150 - 160,
Test uses an invalid hostname (www.player.vimeo.com) that will never occur;
change the iframe src in the "with www subdomain" context to a realistic Vimeo
embed or remove the context. Specifically, update the html string in that
context to use "https://player.vimeo.com/video/987654321" (no "www.") so the
described_class.new(html) / extractor.url assertions exercise a real-world embed
pattern (or delete the whole context if you intended to test a non-existent
variant).

81-95: youtu.be iframe src test covers a URL pattern that is never generated by embedding tools.

YouTube iframes always use https://www.youtube.com/embed/<id> as src. The youtu.be/<id> shortlink is only ever used as an <a href> or clipboard URL, not as an iframe src. Neither Decidim's CKEditor integration nor any other standard embed mechanism produces this format. Keeping the test is harmless but creates a misleading impression of real-world coverage.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@spec/services/media/video_embed_extractor_spec.rb` around lines 81 - 95, The
spec adds a test case for an iframe src pattern ("https://youtu.be/...") that
never occurs in real embeds; update the test in
spec/services/media/video_embed_extractor_spec.rb to use the realistic iframe
src "https://www.youtube.com/embed/<id>" (or convert the example to an
anchor/href test if you want to cover shortlink handling) so the described_class
extractor is exercised with a real-world embed pattern; ensure the examples that
call extractor.url, extractor.valid?, and extractor.thumbnail_url still assert
the same normalized watch URL and thumbnail but based on the
"/embed/dQw4w9WgXcQ" src rather than the youtu.be shortlink.
app/services/decidim/chatbot/media/video_embed_extractor.rb (2)

10-10: youtu.be is a sharing shortlink, not a valid iframe src — this branch of the pattern will never match real content.

YouTube iframes always use https://www.youtube.com/embed/<id> as the src. The youtu.be/<id> format is only used in hyperlinks; no browser or embed generator produces an iframe with a youtu.be src. Keeping the branch is harmless but the corresponding test (spec lines 81-95) exercises a scenario that cannot occur from real Decidim proposal content.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/services/decidim/chatbot/media/video_embed_extractor.rb` at line 10, The
YOUTUBE_PATTERN regex includes a branch matching "youtu.be" in iframe src which
is invalid for actual YouTube embeds; update the pattern defined by
YOUTUBE_PATTERN to remove the youtu.be alternative and only match legitimate
iframe src hosts (e.g., youtube.com/embed and youtube-nocookie.com/embed), and
then update or replace the spec that uses a youtu.be iframe scenario (the spec
exercising lines ~81-95) to use a real embed iframe URL so tests reflect
realistic Decidim content.

20-20: attr_reader :html unnecessarily widens the public API.

html is only read inside private methods. Making it public exposes internal state to callers without any benefit.

♻️ Move to private
-        attr_reader :html
-
         private
 
+        attr_reader :html
+
         def video_id
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/services/decidim/chatbot/media/video_embed_extractor.rb` at line 20, The
attr_reader :html is exposing internal state unnecessarily; move or restrict it
to private by placing attr_reader :html under the class's private section (or
replace usages with direct `@html` references inside private methods) so that only
internal methods (e.g., methods referencing html in VideoEmbedExtractor) can
access it; update the class definition to ensure html is not part of the public
API.
spec/services/workflows/proposals_workflow_spec.rb (1)

349-354: Body-length tests don't filter by message type — they'll match the first send_message! call regardless of whether it's the proposal details message.

Both body-length assertions use an unscoped expect(adapter).to receive(:send_message!) block with no hash_including(type: :interactive_buttons) guard. In the current flow send_proposal_details is the only call so this passes, but it is fragile: any future refactor that adds a preceding send_message! call would silently make these tests verify the wrong message.

🛡️ Recommended approach
-              it "respects the 1024 character body limit" do
-                expect(adapter).to receive(:send_message!) do |args|
-                  expect(args[:body_text].length).to be <= 1024
-                end
-                subject.start
-              end
+              it "respects the 1024 character body limit" do
+                expect(adapter).to receive(:send_message!).with(
+                  hash_including(type: :interactive_buttons, buttons: [hash_including(id: "comment-#{clicked_proposal.id}")])
+                ) do |args|
+                  expect(args[:body_text].length).to be <= 1024
+                end
+                subject.start
+              end

Also applies to: 433-438

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@spec/services/workflows/proposals_workflow_spec.rb` around lines 349 - 354,
The test's body-length expectations are matching the first send_message! call
regardless of message type; update both occurrences to scope the mock by
expecting adapter to receive(:send_message!).with(hash_including(type:
:interactive_buttons)) (or equivalent matcher) so the length assertion runs only
for the interactive_buttons/proposal-details message (reference send_message!
and the type: :interactive_buttons guard), and keep the inner block that asserts
args[:body_text].length <= 1024; apply the same change to the second instance
around lines 433-438.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/services/decidim/chatbot/media/video_embed_extractor.rb`:
- Around line 78-82: The youtube_thumbnail_url method currently returns the
maxresdefault URL even though that returns 404 for non‑HD videos; update
VideoEmbedExtractor#youtube_thumbnail_url to attempt the maxresdefault URL first
and, if it returns a non‑200 status, fall back to the hqdefault URL (use an HTTP
HEAD or lightweight request to check the response for the URL built from
video_id). Ensure the method returns the working thumbnail URL (prefer
maxresdefault, else hqdefault) and update the docstring to reflect the fallback
behavior.

In
`@app/services/decidim/chatbot/providers/whatsapp/envelopes/interactive_buttons.rb`:
- Around line 29-39: The footer hash is being created with footer[:text] =
data[:footer_text] inside the .tap, which yields { text: nil } when
data[:footer_text] is nil so interactive.delete(:footer) never runs; update the
interactive construction in interactive_buttons.rb to only add the :footer key
when data[:footer_text] is present (e.g., conditionally assign footer[:text] or
skip creating footer entirely), ensuring that interactive.delete(:footer) is no
longer responsible for removing a one-key hash with a nil value and that the
final payload omits :footer when data[:footer_text] is nil.

In `@app/services/decidim/chatbot/workflows/proposals_workflow.rb`:
- Line 69: In proposals_workflow (proposals_workflow.rb) the header_image
assignment uses video.thumbnail_url.presence || resource_url(proposal.photo)
which can return nil for Vimeo thumbnails or missing photos; update this to pass
fallback_image: true to the resource_url call (i.e.,
resource_url(proposal.photo, fallback_image: true)) so header_image never
becomes nil when video.thumbnail_url is absent and no photo is attached.
- Around line 86-100: The overhead calculations in calculate_max_body_length are
overstated: adjust the title_overhead and video_overhead math to reflect the
actual constructed strings ("*title*\n\n" is title.length + 4, and "🎥
#{video.url}\n\n" is url.length + 4) instead of using +6; update the expressions
that compute title_overhead (which currently uses sanitize_text(proposal.title,
100).length + 6) and video_overhead (which currently uses video.url.length + 6)
to use +4 so reserved_space correctly sums title_overhead, video_overhead, and
proposal_url_overhead when returning [total_limit - reserved_space, 100].max.

In `@spec/services/workflows/proposals_workflow_spec.rb`:
- Around line 339-347: The test in proposals_workflow_spec (the example
expecting header_image "maxresdefault.jpg") hard-codes the YouTube thumbnail
variant and will fail when youtube_thumbnail_url is changed; update the spec to
derive the expected thumbnail via the same helper/method used by the
implementation (e.g., call the model/service method youtube_thumbnail_url or use
the video fixture's thumbnail_url) instead of hard-coding "maxresdefault.jpg",
or assert on a more stable pattern (like matching the video ID and a default
.jpg) so the adapter.send_message! expectation (type: :interactive_buttons,
header_image: ...) stays in sync with the implementation when subject.start is
run.
- Around line 385-391: The call to resource_url(proposal.photo) in the test
should include fallback_image: true so it behaves the same as the resource_url
call inside send_cards; update the resource_url(proposal.photo) invocation (the
one used to set header_image in the test that exercises subject.start and
adapter.send_message!) to pass fallback_image: true so a placeholder URL is
returned when the photo attachment is missing, ensuring the
adapter.send_message! expectation for header_image being present remains valid.

---

Nitpick comments:
In `@app/services/decidim/chatbot/media/video_embed_extractor.rb`:
- Line 10: The YOUTUBE_PATTERN regex includes a branch matching "youtu.be" in
iframe src which is invalid for actual YouTube embeds; update the pattern
defined by YOUTUBE_PATTERN to remove the youtu.be alternative and only match
legitimate iframe src hosts (e.g., youtube.com/embed and
youtube-nocookie.com/embed), and then update or replace the spec that uses a
youtu.be iframe scenario (the spec exercising lines ~81-95) to use a real embed
iframe URL so tests reflect realistic Decidim content.
- Line 20: The attr_reader :html is exposing internal state unnecessarily; move
or restrict it to private by placing attr_reader :html under the class's private
section (or replace usages with direct `@html` references inside private methods)
so that only internal methods (e.g., methods referencing html in
VideoEmbedExtractor) can access it; update the class definition to ensure html
is not part of the public API.

In `@spec/services/media/video_embed_extractor_spec.rb`:
- Around line 150-160: Test uses an invalid hostname (www.player.vimeo.com) that
will never occur; change the iframe src in the "with www subdomain" context to a
realistic Vimeo embed or remove the context. Specifically, update the html
string in that context to use "https://player.vimeo.com/video/987654321" (no
"www.") so the described_class.new(html) / extractor.url assertions exercise a
real-world embed pattern (or delete the whole context if you intended to test a
non-existent variant).
- Around line 81-95: The spec adds a test case for an iframe src pattern
("https://youtu.be/...") that never occurs in real embeds; update the test in
spec/services/media/video_embed_extractor_spec.rb to use the realistic iframe
src "https://www.youtube.com/embed/<id>" (or convert the example to an
anchor/href test if you want to cover shortlink handling) so the described_class
extractor is exercised with a real-world embed pattern; ensure the examples that
call extractor.url, extractor.valid?, and extractor.thumbnail_url still assert
the same normalized watch URL and thumbnail but based on the
"/embed/dQw4w9WgXcQ" src rather than the youtu.be shortlink.

In `@spec/services/workflows/proposals_workflow_spec.rb`:
- Around line 349-354: The test's body-length expectations are matching the
first send_message! call regardless of message type; update both occurrences to
scope the mock by expecting adapter to
receive(:send_message!).with(hash_including(type: :interactive_buttons)) (or
equivalent matcher) so the length assertion runs only for the
interactive_buttons/proposal-details message (reference send_message! and the
type: :interactive_buttons guard), and keep the inner block that asserts
args[:body_text].length <= 1024; apply the same change to the second instance
around lines 433-438.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2f716e7 and 0021c50.

📒 Files selected for processing (5)
  • app/services/decidim/chatbot/media/video_embed_extractor.rb
  • app/services/decidim/chatbot/providers/whatsapp/envelopes/interactive_buttons.rb
  • app/services/decidim/chatbot/workflows/proposals_workflow.rb
  • spec/services/media/video_embed_extractor_spec.rb
  • spec/services/workflows/proposals_workflow_spec.rb

microstudi and others added 2 commits February 25, 2026 19:40
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…d calls

Pass pre-calculated title_text and proposal_url as parameters to
calculate_max_body_length instead of recalculating them inside the method.
This eliminates redundant calls to sanitize_text() and resource_url().
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +105 to +106
# Return available space for body text, with minimum of 100 chars
[total_limit - reserved_space, 100].max
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the combined length of title, video URL, and proposal URL exceeds 924 characters (1024 - 100), the body text will always be truncated to 100 characters regardless of actual available space. This could result in the final message exceeding 1024 characters. For example, if title + video_url + proposal_url = 950 chars, the method returns 100 (not 74), and the final body would be 950 + 100 = 1050 chars. Consider using [total_limit - reserved_space, 0].max instead, or adding validation that the total message length respects the 1024 limit.

Suggested change
# Return available space for body text, with minimum of 100 chars
[total_limit - reserved_space, 100].max
# Return available space for body text, ensuring total length does not exceed the limit
[total_limit - reserved_space, 0].max

Copilot uses AI. Check for mistakes.
…n body length to 100

- Fixes failing tests that expected header_image to always be present
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
app/services/decidim/chatbot/media/video_embed_extractor.rb (1)

45-63: Redundant regex matching — video ID extraction re-evaluates patterns multiple times.

A single url call triggers youtube? (match 1), then extract_video_id calls youtube? again (match 2) plus html.match(YOUTUBE_PATTERN) (match 3). Similarly, thumbnail_url re-evaluates youtube?/vimeo?. Consider extracting the match result once and deriving both video_id and provider from it.

♻️ Suggested refactor — single-pass extraction
-        def video_id
-          `@video_id` ||= extract_video_id
-        end
-
-        def youtube?
-          html.match?(YOUTUBE_PATTERN)
-        end
-
-        def vimeo?
-          html.match?(VIMEO_PATTERN)
-        end
-
-        def extract_video_id
-          if youtube?
-            html.match(YOUTUBE_PATTERN)&.[](1)
-          elsif vimeo?
-            html.match(VIMEO_PATTERN)&.[](1)
-          end
-        end
+        def matched_data
+          `@matched_data` ||= begin
+            if (m = html&.match(YOUTUBE_PATTERN))
+              { provider: :youtube, id: m[1] }
+            elsif (m = html&.match(VIMEO_PATTERN))
+              { provider: :vimeo, id: m[1] }
+            end
+          end
+        end
+
+        def video_id
+          matched_data&.fetch(:id, nil)
+        end
+
+        def youtube?
+          matched_data&.fetch(:provider, nil) == :youtube
+        end
+
+        def vimeo?
+          matched_data&.fetch(:provider, nil) == :vimeo
+        end
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/services/decidim/chatbot/media/video_embed_extractor.rb` around lines 45
- 63, The current flow repeatedly runs regex matches via youtube?, vimeo?, and
extract_video_id; refactor to perform a single-pass extraction that caches both
the provider and the match result: add a private method (or memoized attribute)
like parsed_match or provider_match that runs html.match once against
YOUTUBE_PATTERN and VIMEO_PATTERN (in that order or via combined logic), set a
cached provider flag (e.g., :youtube or :vimeo) and extract the id from the
match; then update video_id to return the cached id and update youtube? and
vimeo? (and thumbnail_url if present) to consult the cached provider instead of
re-matching. Ensure unique symbols referenced: video_id, youtube?, vimeo?,
extract_video_id, and thumbnail_url.
spec/services/media/video_embed_extractor_spec.rb (1)

42-131: Consider adding an HTTP (non-HTTPS) iframe test.

The regex pattern uses https?:// to match both HTTP and HTTPS, but no test exercises the HTTP path. A quick test with http://www.youtube.com/embed/... would confirm this works as expected.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@spec/services/media/video_embed_extractor_spec.rb` around lines 42 - 131, Add
a spec that covers an HTTP iframe to exercise the regex's http path: create a
new context in spec/services/media/video_embed_extractor_spec.rb under the
YouTube iframe contexts that provides HTML like an iframe with
src="http://www.youtube.com/embed/dQw4w9WgXcQ" (also test query param and short
youtu.be variants if desired), then instantiate described_class (the
VideoEmbedExtractor used as extractor in tests) and assert extractor.url equals
"https://www.youtube.com/watch?v=dQw4w9WgXcQ", extractor.valid? is true, and
extractor.thumbnail_url equals
"https://img.youtube.com/vi/dQw4w9WgXcQ/hqdefault.jpg" so the HTTP scheme branch
of the pattern is verified.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/services/decidim/chatbot/workflows/proposals_workflow.rb`:
- Around line 93-107: The current calculate_max_body_length uses String#length
which counts characters, but WhatsApp enforces a UTF-16 code unit limit (emoji
like 🎥 are 2 units); update calculate_max_body_length to compute lengths in
UTF-16 code units instead of .length for the parts that contribute to the
overhead (title_text, video.url when video.valid?, and proposal_url) so
video_overhead correctly accounts for surrogate pairs; implement this by
replacing uses of .length with a UTF-16 code unit count (e.g., encode to
'UTF-16BE' and divide bytesize by 2) when computing title_overhead,
video_overhead, and proposal_url_overhead inside the calculate_max_body_length
method.

---

Nitpick comments:
In `@app/services/decidim/chatbot/media/video_embed_extractor.rb`:
- Around line 45-63: The current flow repeatedly runs regex matches via
youtube?, vimeo?, and extract_video_id; refactor to perform a single-pass
extraction that caches both the provider and the match result: add a private
method (or memoized attribute) like parsed_match or provider_match that runs
html.match once against YOUTUBE_PATTERN and VIMEO_PATTERN (in that order or via
combined logic), set a cached provider flag (e.g., :youtube or :vimeo) and
extract the id from the match; then update video_id to return the cached id and
update youtube? and vimeo? (and thumbnail_url if present) to consult the cached
provider instead of re-matching. Ensure unique symbols referenced: video_id,
youtube?, vimeo?, extract_video_id, and thumbnail_url.

In `@spec/services/media/video_embed_extractor_spec.rb`:
- Around line 42-131: Add a spec that covers an HTTP iframe to exercise the
regex's http path: create a new context in
spec/services/media/video_embed_extractor_spec.rb under the YouTube iframe
contexts that provides HTML like an iframe with
src="http://www.youtube.com/embed/dQw4w9WgXcQ" (also test query param and short
youtu.be variants if desired), then instantiate described_class (the
VideoEmbedExtractor used as extractor in tests) and assert extractor.url equals
"https://www.youtube.com/watch?v=dQw4w9WgXcQ", extractor.valid? is true, and
extractor.thumbnail_url equals
"https://img.youtube.com/vi/dQw4w9WgXcQ/hqdefault.jpg" so the HTTP scheme branch
of the pattern is verified.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0021c50 and 43aba7b.

📒 Files selected for processing (5)
  • app/services/decidim/chatbot/media/video_embed_extractor.rb
  • app/services/decidim/chatbot/providers/whatsapp/envelopes/interactive_buttons.rb
  • app/services/decidim/chatbot/workflows/proposals_workflow.rb
  • spec/services/media/video_embed_extractor_spec.rb
  • spec/services/workflows/proposals_workflow_spec.rb
🚧 Files skipped from review as they are similar to previous changes (1)
  • spec/services/workflows/proposals_workflow_spec.rb

Comment on lines +93 to +107
def calculate_max_body_length(video, title_text, proposal_url)
# WhatsApp body text limit is 1024 characters
total_limit = 1024

# Calculate fixed overhead using pre-calculated values
title_overhead = title_text.length + 4 # "*title*\n\n"
video_overhead = video.valid? ? video.url.length + 4 : 0 # "🎥 url\n\n"
proposal_url_overhead = proposal_url.length + 2 # "\n\nurl"

# Reserve space for newlines and formatting
reserved_space = title_overhead + video_overhead + proposal_url_overhead

# Return available space for body text, with minimum of 100 chars
[total_limit - reserved_space, 100].max
end
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Emoji 🎥 may count as 2 characters in WhatsApp's UTF-16 encoding.

Ruby's String#length counts 🎥 as 1 character, but WhatsApp uses UTF-16 encoding where this emoji (U+1F3A5, outside BMP) occupies 2 code units. The video_overhead calculation at line 99 would undercount by 1, potentially pushing the total body to 1025 characters in edge cases.

A safe fix:

🛡️ Proposed fix
-          video_overhead = video.valid? ? video.url.length + 4 : 0 # "🎥 url\n\n"
+          video_overhead = video.valid? ? video.url.length + 5 : 0 # "🎥 url\n\n" (emoji is 2 UTF-16 code units)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/services/decidim/chatbot/workflows/proposals_workflow.rb` around lines 93
- 107, The current calculate_max_body_length uses String#length which counts
characters, but WhatsApp enforces a UTF-16 code unit limit (emoji like 🎥 are 2
units); update calculate_max_body_length to compute lengths in UTF-16 code units
instead of .length for the parts that contribute to the overhead (title_text,
video.url when video.valid?, and proposal_url) so video_overhead correctly
accounts for surrogate pairs; implement this by replacing uses of .length with a
UTF-16 code unit count (e.g., encode to 'UTF-16BE' and divide bytesize by 2)
when computing title_overhead, video_overhead, and proposal_url_overhead inside
the calculate_max_body_length method.

@microstudi microstudi merged commit ce90378 into main Feb 25, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Use embedded video iframes in proposal details (WhatsApp interactive reply)

2 participants