Extract video URLs and thumbnails from proposal iframes#28
Conversation
📝 WalkthroughWalkthroughAdds a VideoEmbedExtractor service to parse iframe HTML for YouTube and Vimeo, integrates video detection into the proposals workflow to include video URLs and thumbnails in WhatsApp messages, updates WhatsApp interactive envelope header handling to support video, and adds tests for extractor and workflow behaviors. Changes
Sequence DiagramsequenceDiagram
participant ProposalsWorkflow as ProposalsWorkflow
participant VideoExtractor as VideoEmbedExtractor
participant Envelope as WhatsAppEnvelope
participant Adapter as Adapter
ProposalsWorkflow->>VideoExtractor: initialize(proposal.body)
VideoExtractor->>VideoExtractor: parse HTML, find first iframe
VideoExtractor->>VideoExtractor: detect provider (YouTube/Vimeo), extract ID
VideoExtractor-->>ProposalsWorkflow: url, valid?, thumbnail_url
alt Video Found
ProposalsWorkflow->>ProposalsWorkflow: calculate_max_body_length(video)
ProposalsWorkflow->>Envelope: build interactive_reply (header.video, body with URL, footer)
Envelope->>Envelope: prioritize video > image > text
Envelope-->>Adapter: send interactive_reply payload
else No Video
ProposalsWorkflow->>Envelope: build interactive_buttons (header.image/text, body, footer)
Envelope-->>Adapter: send interactive_buttons payload
end
Adapter-->>User: WhatsApp message (video or standard)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Pull request overview
This PR adds support for extracting and displaying video URLs from YouTube and Vimeo iframes embedded in proposal bodies. When a proposal contains a video, the chatbot displays a clickable video link with emoji and uses the video thumbnail as the header image in WhatsApp messages.
Changes:
- Added
VideoEmbedExtractorservice to parse YouTube and Vimeo iframe embed URLs from HTML - Updated
ProposalsWorkflowto detect videos, display URLs in message body, and use video thumbnails - Enhanced
InteractiveButtonsenvelope to support video headers (though not actively used) - Added comprehensive test coverage for video extraction and workflow integration
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| app/services/decidim/chatbot/media/video_embed_extractor.rb | New service class that extracts video URLs and thumbnails from YouTube/Vimeo iframes using regex patterns |
| app/services/decidim/chatbot/workflows/proposals_workflow.rb | Integrates video detection in carousel cards and proposal details; implements dynamic body length calculation to respect 1024 char limit |
| app/services/decidim/chatbot/providers/whatsapp/envelopes/interactive_buttons.rb | Adds header_video support and refactors header/footer conditional logic |
| spec/services/media/video_embed_extractor_spec.rb | Comprehensive unit tests for VideoEmbedExtractor covering YouTube, Vimeo, and edge cases |
| spec/services/workflows/proposals_workflow_spec.rb | Integration tests verifying video URL display, thumbnail usage, and character limit compliance |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
app/services/decidim/chatbot/providers/whatsapp/envelopes/interactive_buttons.rb
Outdated
Show resolved
Hide resolved
| # Return available space for body text, with minimum of 100 chars | ||
| [total_limit - reserved_space, 100].max |
There was a problem hiding this comment.
The minimum body length of 100 characters can cause the total message to exceed the 1024 character limit. If reserved_space is greater than 924 (1024 - 100), forcing a minimum of 100 characters for body text will cause the total to exceed 1024. For example, if reserved_space is 950, the body gets 100 characters, totaling 1050 characters. The max should either be removed, or the logic should be: if reserved_space >= total_limit, return 0 to prevent exceeding the limit.
| # Return available space for body text, with minimum of 100 chars | |
| [total_limit - reserved_space, 100].max | |
| # Return available space for body text, ensuring we never exceed the total limit | |
| return 0 if reserved_space >= total_limit | |
| total_limit - reserved_space |
| @@ -67,6 +82,23 @@ def send_proposal_details | |||
| ) | |||
| end | |||
|
|
|||
| # Calculate maximum body length dynamically to stay within 1024 char limit | |||
| def calculate_max_body_length(video) | |||
| # WhatsApp body text limit is 1024 characters | |||
| total_limit = 1024 | |||
|
|
|||
| # Calculate fixed overhead | |||
| title_overhead = sanitize_text(proposal.title, 100).length + 6 # "*title*\n\n" | |||
There was a problem hiding this comment.
The method calls sanitize_text(proposal.title, 100) twice: once in calculate_max_body_length (line 91) to calculate overhead, and once in send_proposal_details (line 60) for the actual title_text. This is inefficient. Consider passing title_text as a parameter to calculate_max_body_length, or memoizing the result.
| def calculate_max_body_length(video) | ||
| # WhatsApp body text limit is 1024 characters | ||
| total_limit = 1024 | ||
|
|
||
| # Calculate fixed overhead | ||
| title_overhead = sanitize_text(proposal.title, 100).length + 6 # "*title*\n\n" | ||
| video_overhead = video.valid? ? video.url.length + 6 : 0 # "🎥 url\n\n" (emoji is 2 bytes) | ||
| proposal_url_overhead = resource_url(proposal).length + 2 # "\n\nurl" | ||
|
|
||
| # Reserve space for newlines and formatting | ||
| reserved_space = title_overhead + video_overhead + proposal_url_overhead | ||
|
|
||
| # Return available space for body text, with minimum of 100 chars | ||
| [total_limit - reserved_space, 100].max | ||
| end |
There was a problem hiding this comment.
The calculate_max_body_length method accesses the proposal instance variable directly (lines 91, 93), making it harder to test in isolation and less clear about its dependencies. Consider refactoring to accept all needed values as parameters: calculate_max_body_length(video, title_text, proposal_url) for better encapsulation and testability.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ractive_buttons.rb Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (5)
spec/services/media/video_embed_extractor_spec.rb (2)
150-160:www.player.vimeo.comis not a real Vimeo embed hostname — this test covers an impossible URL.Vimeo's embed URL scheme is always
https://player.vimeo.com/video/<id>without awww.prefix.www.player.vimeo.comdoes not resolve in practice. This test validates pattern matching for a URL that Decidim's rich text editor will never produce, giving false confidence in the extractor's real-world coverage.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@spec/services/media/video_embed_extractor_spec.rb` around lines 150 - 160, Test uses an invalid hostname (www.player.vimeo.com) that will never occur; change the iframe src in the "with www subdomain" context to a realistic Vimeo embed or remove the context. Specifically, update the html string in that context to use "https://player.vimeo.com/video/987654321" (no "www.") so the described_class.new(html) / extractor.url assertions exercise a real-world embed pattern (or delete the whole context if you intended to test a non-existent variant).
81-95:youtu.beiframesrctest covers a URL pattern that is never generated by embedding tools.YouTube iframes always use
https://www.youtube.com/embed/<id>assrc. Theyoutu.be/<id>shortlink is only ever used as an<a href>or clipboard URL, not as an iframesrc. Neither Decidim's CKEditor integration nor any other standard embed mechanism produces this format. Keeping the test is harmless but creates a misleading impression of real-world coverage.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@spec/services/media/video_embed_extractor_spec.rb` around lines 81 - 95, The spec adds a test case for an iframe src pattern ("https://youtu.be/...") that never occurs in real embeds; update the test in spec/services/media/video_embed_extractor_spec.rb to use the realistic iframe src "https://www.youtube.com/embed/<id>" (or convert the example to an anchor/href test if you want to cover shortlink handling) so the described_class extractor is exercised with a real-world embed pattern; ensure the examples that call extractor.url, extractor.valid?, and extractor.thumbnail_url still assert the same normalized watch URL and thumbnail but based on the "/embed/dQw4w9WgXcQ" src rather than the youtu.be shortlink.app/services/decidim/chatbot/media/video_embed_extractor.rb (2)
10-10:youtu.beis a sharing shortlink, not a valid iframesrc— this branch of the pattern will never match real content.YouTube iframes always use
https://www.youtube.com/embed/<id>as thesrc. Theyoutu.be/<id>format is only used in hyperlinks; no browser or embed generator produces an iframe with ayoutu.besrc. Keeping the branch is harmless but the corresponding test (spec lines 81-95) exercises a scenario that cannot occur from real Decidim proposal content.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/services/decidim/chatbot/media/video_embed_extractor.rb` at line 10, The YOUTUBE_PATTERN regex includes a branch matching "youtu.be" in iframe src which is invalid for actual YouTube embeds; update the pattern defined by YOUTUBE_PATTERN to remove the youtu.be alternative and only match legitimate iframe src hosts (e.g., youtube.com/embed and youtube-nocookie.com/embed), and then update or replace the spec that uses a youtu.be iframe scenario (the spec exercising lines ~81-95) to use a real embed iframe URL so tests reflect realistic Decidim content.
20-20:attr_reader :htmlunnecessarily widens the public API.
htmlis only read inside private methods. Making it public exposes internal state to callers without any benefit.♻️ Move to private
- attr_reader :html - private + attr_reader :html + def video_id🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/services/decidim/chatbot/media/video_embed_extractor.rb` at line 20, The attr_reader :html is exposing internal state unnecessarily; move or restrict it to private by placing attr_reader :html under the class's private section (or replace usages with direct `@html` references inside private methods) so that only internal methods (e.g., methods referencing html in VideoEmbedExtractor) can access it; update the class definition to ensure html is not part of the public API.spec/services/workflows/proposals_workflow_spec.rb (1)
349-354: Body-length tests don't filter by message type — they'll match the firstsend_message!call regardless of whether it's the proposal details message.Both body-length assertions use an unscoped
expect(adapter).to receive(:send_message!)block with nohash_including(type: :interactive_buttons)guard. In the current flowsend_proposal_detailsis the only call so this passes, but it is fragile: any future refactor that adds a precedingsend_message!call would silently make these tests verify the wrong message.🛡️ Recommended approach
- it "respects the 1024 character body limit" do - expect(adapter).to receive(:send_message!) do |args| - expect(args[:body_text].length).to be <= 1024 - end - subject.start - end + it "respects the 1024 character body limit" do + expect(adapter).to receive(:send_message!).with( + hash_including(type: :interactive_buttons, buttons: [hash_including(id: "comment-#{clicked_proposal.id}")]) + ) do |args| + expect(args[:body_text].length).to be <= 1024 + end + subject.start + endAlso applies to: 433-438
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@spec/services/workflows/proposals_workflow_spec.rb` around lines 349 - 354, The test's body-length expectations are matching the first send_message! call regardless of message type; update both occurrences to scope the mock by expecting adapter to receive(:send_message!).with(hash_including(type: :interactive_buttons)) (or equivalent matcher) so the length assertion runs only for the interactive_buttons/proposal-details message (reference send_message! and the type: :interactive_buttons guard), and keep the inner block that asserts args[:body_text].length <= 1024; apply the same change to the second instance around lines 433-438.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/services/decidim/chatbot/media/video_embed_extractor.rb`:
- Around line 78-82: The youtube_thumbnail_url method currently returns the
maxresdefault URL even though that returns 404 for non‑HD videos; update
VideoEmbedExtractor#youtube_thumbnail_url to attempt the maxresdefault URL first
and, if it returns a non‑200 status, fall back to the hqdefault URL (use an HTTP
HEAD or lightweight request to check the response for the URL built from
video_id). Ensure the method returns the working thumbnail URL (prefer
maxresdefault, else hqdefault) and update the docstring to reflect the fallback
behavior.
In
`@app/services/decidim/chatbot/providers/whatsapp/envelopes/interactive_buttons.rb`:
- Around line 29-39: The footer hash is being created with footer[:text] =
data[:footer_text] inside the .tap, which yields { text: nil } when
data[:footer_text] is nil so interactive.delete(:footer) never runs; update the
interactive construction in interactive_buttons.rb to only add the :footer key
when data[:footer_text] is present (e.g., conditionally assign footer[:text] or
skip creating footer entirely), ensuring that interactive.delete(:footer) is no
longer responsible for removing a one-key hash with a nil value and that the
final payload omits :footer when data[:footer_text] is nil.
In `@app/services/decidim/chatbot/workflows/proposals_workflow.rb`:
- Line 69: In proposals_workflow (proposals_workflow.rb) the header_image
assignment uses video.thumbnail_url.presence || resource_url(proposal.photo)
which can return nil for Vimeo thumbnails or missing photos; update this to pass
fallback_image: true to the resource_url call (i.e.,
resource_url(proposal.photo, fallback_image: true)) so header_image never
becomes nil when video.thumbnail_url is absent and no photo is attached.
- Around line 86-100: The overhead calculations in calculate_max_body_length are
overstated: adjust the title_overhead and video_overhead math to reflect the
actual constructed strings ("*title*\n\n" is title.length + 4, and "🎥
#{video.url}\n\n" is url.length + 4) instead of using +6; update the expressions
that compute title_overhead (which currently uses sanitize_text(proposal.title,
100).length + 6) and video_overhead (which currently uses video.url.length + 6)
to use +4 so reserved_space correctly sums title_overhead, video_overhead, and
proposal_url_overhead when returning [total_limit - reserved_space, 100].max.
In `@spec/services/workflows/proposals_workflow_spec.rb`:
- Around line 339-347: The test in proposals_workflow_spec (the example
expecting header_image "maxresdefault.jpg") hard-codes the YouTube thumbnail
variant and will fail when youtube_thumbnail_url is changed; update the spec to
derive the expected thumbnail via the same helper/method used by the
implementation (e.g., call the model/service method youtube_thumbnail_url or use
the video fixture's thumbnail_url) instead of hard-coding "maxresdefault.jpg",
or assert on a more stable pattern (like matching the video ID and a default
.jpg) so the adapter.send_message! expectation (type: :interactive_buttons,
header_image: ...) stays in sync with the implementation when subject.start is
run.
- Around line 385-391: The call to resource_url(proposal.photo) in the test
should include fallback_image: true so it behaves the same as the resource_url
call inside send_cards; update the resource_url(proposal.photo) invocation (the
one used to set header_image in the test that exercises subject.start and
adapter.send_message!) to pass fallback_image: true so a placeholder URL is
returned when the photo attachment is missing, ensuring the
adapter.send_message! expectation for header_image being present remains valid.
---
Nitpick comments:
In `@app/services/decidim/chatbot/media/video_embed_extractor.rb`:
- Line 10: The YOUTUBE_PATTERN regex includes a branch matching "youtu.be" in
iframe src which is invalid for actual YouTube embeds; update the pattern
defined by YOUTUBE_PATTERN to remove the youtu.be alternative and only match
legitimate iframe src hosts (e.g., youtube.com/embed and
youtube-nocookie.com/embed), and then update or replace the spec that uses a
youtu.be iframe scenario (the spec exercising lines ~81-95) to use a real embed
iframe URL so tests reflect realistic Decidim content.
- Line 20: The attr_reader :html is exposing internal state unnecessarily; move
or restrict it to private by placing attr_reader :html under the class's private
section (or replace usages with direct `@html` references inside private methods)
so that only internal methods (e.g., methods referencing html in
VideoEmbedExtractor) can access it; update the class definition to ensure html
is not part of the public API.
In `@spec/services/media/video_embed_extractor_spec.rb`:
- Around line 150-160: Test uses an invalid hostname (www.player.vimeo.com) that
will never occur; change the iframe src in the "with www subdomain" context to a
realistic Vimeo embed or remove the context. Specifically, update the html
string in that context to use "https://player.vimeo.com/video/987654321" (no
"www.") so the described_class.new(html) / extractor.url assertions exercise a
real-world embed pattern (or delete the whole context if you intended to test a
non-existent variant).
- Around line 81-95: The spec adds a test case for an iframe src pattern
("https://youtu.be/...") that never occurs in real embeds; update the test in
spec/services/media/video_embed_extractor_spec.rb to use the realistic iframe
src "https://www.youtube.com/embed/<id>" (or convert the example to an
anchor/href test if you want to cover shortlink handling) so the described_class
extractor is exercised with a real-world embed pattern; ensure the examples that
call extractor.url, extractor.valid?, and extractor.thumbnail_url still assert
the same normalized watch URL and thumbnail but based on the
"/embed/dQw4w9WgXcQ" src rather than the youtu.be shortlink.
In `@spec/services/workflows/proposals_workflow_spec.rb`:
- Around line 349-354: The test's body-length expectations are matching the
first send_message! call regardless of message type; update both occurrences to
scope the mock by expecting adapter to
receive(:send_message!).with(hash_including(type: :interactive_buttons)) (or
equivalent matcher) so the length assertion runs only for the
interactive_buttons/proposal-details message (reference send_message! and the
type: :interactive_buttons guard), and keep the inner block that asserts
args[:body_text].length <= 1024; apply the same change to the second instance
around lines 433-438.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
app/services/decidim/chatbot/media/video_embed_extractor.rbapp/services/decidim/chatbot/providers/whatsapp/envelopes/interactive_buttons.rbapp/services/decidim/chatbot/workflows/proposals_workflow.rbspec/services/media/video_embed_extractor_spec.rbspec/services/workflows/proposals_workflow_spec.rb
app/services/decidim/chatbot/providers/whatsapp/envelopes/interactive_buttons.rb
Outdated
Show resolved
Hide resolved
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
5294f37 to
5956861
Compare
…d calls Pass pre-calculated title_text and proposal_url as parameters to calculate_max_body_length instead of recalculating them inside the method. This eliminates redundant calls to sanitize_text() and resource_url().
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
app/services/decidim/chatbot/providers/whatsapp/envelopes/interactive_buttons.rb
Outdated
Show resolved
Hide resolved
| # Return available space for body text, with minimum of 100 chars | ||
| [total_limit - reserved_space, 100].max |
There was a problem hiding this comment.
If the combined length of title, video URL, and proposal URL exceeds 924 characters (1024 - 100), the body text will always be truncated to 100 characters regardless of actual available space. This could result in the final message exceeding 1024 characters. For example, if title + video_url + proposal_url = 950 chars, the method returns 100 (not 74), and the final body would be 950 + 100 = 1050 chars. Consider using [total_limit - reserved_space, 0].max instead, or adding validation that the total message length respects the 1024 limit.
| # Return available space for body text, with minimum of 100 chars | |
| [total_limit - reserved_space, 100].max | |
| # Return available space for body text, ensuring total length does not exceed the limit | |
| [total_limit - reserved_space, 0].max |
…n body length to 100 - Fixes failing tests that expected header_image to always be present
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
app/services/decidim/chatbot/media/video_embed_extractor.rb (1)
45-63: Redundant regex matching — video ID extraction re-evaluates patterns multiple times.A single
urlcall triggersyoutube?(match 1), thenextract_video_idcallsyoutube?again (match 2) plushtml.match(YOUTUBE_PATTERN)(match 3). Similarly,thumbnail_urlre-evaluatesyoutube?/vimeo?. Consider extracting the match result once and deriving bothvideo_idand provider from it.♻️ Suggested refactor — single-pass extraction
- def video_id - `@video_id` ||= extract_video_id - end - - def youtube? - html.match?(YOUTUBE_PATTERN) - end - - def vimeo? - html.match?(VIMEO_PATTERN) - end - - def extract_video_id - if youtube? - html.match(YOUTUBE_PATTERN)&.[](1) - elsif vimeo? - html.match(VIMEO_PATTERN)&.[](1) - end - end + def matched_data + `@matched_data` ||= begin + if (m = html&.match(YOUTUBE_PATTERN)) + { provider: :youtube, id: m[1] } + elsif (m = html&.match(VIMEO_PATTERN)) + { provider: :vimeo, id: m[1] } + end + end + end + + def video_id + matched_data&.fetch(:id, nil) + end + + def youtube? + matched_data&.fetch(:provider, nil) == :youtube + end + + def vimeo? + matched_data&.fetch(:provider, nil) == :vimeo + end🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/services/decidim/chatbot/media/video_embed_extractor.rb` around lines 45 - 63, The current flow repeatedly runs regex matches via youtube?, vimeo?, and extract_video_id; refactor to perform a single-pass extraction that caches both the provider and the match result: add a private method (or memoized attribute) like parsed_match or provider_match that runs html.match once against YOUTUBE_PATTERN and VIMEO_PATTERN (in that order or via combined logic), set a cached provider flag (e.g., :youtube or :vimeo) and extract the id from the match; then update video_id to return the cached id and update youtube? and vimeo? (and thumbnail_url if present) to consult the cached provider instead of re-matching. Ensure unique symbols referenced: video_id, youtube?, vimeo?, extract_video_id, and thumbnail_url.spec/services/media/video_embed_extractor_spec.rb (1)
42-131: Consider adding an HTTP (non-HTTPS) iframe test.The regex pattern uses
https?://to match both HTTP and HTTPS, but no test exercises the HTTP path. A quick test withhttp://www.youtube.com/embed/...would confirm this works as expected.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@spec/services/media/video_embed_extractor_spec.rb` around lines 42 - 131, Add a spec that covers an HTTP iframe to exercise the regex's http path: create a new context in spec/services/media/video_embed_extractor_spec.rb under the YouTube iframe contexts that provides HTML like an iframe with src="http://www.youtube.com/embed/dQw4w9WgXcQ" (also test query param and short youtu.be variants if desired), then instantiate described_class (the VideoEmbedExtractor used as extractor in tests) and assert extractor.url equals "https://www.youtube.com/watch?v=dQw4w9WgXcQ", extractor.valid? is true, and extractor.thumbnail_url equals "https://img.youtube.com/vi/dQw4w9WgXcQ/hqdefault.jpg" so the HTTP scheme branch of the pattern is verified.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/services/decidim/chatbot/workflows/proposals_workflow.rb`:
- Around line 93-107: The current calculate_max_body_length uses String#length
which counts characters, but WhatsApp enforces a UTF-16 code unit limit (emoji
like 🎥 are 2 units); update calculate_max_body_length to compute lengths in
UTF-16 code units instead of .length for the parts that contribute to the
overhead (title_text, video.url when video.valid?, and proposal_url) so
video_overhead correctly accounts for surrogate pairs; implement this by
replacing uses of .length with a UTF-16 code unit count (e.g., encode to
'UTF-16BE' and divide bytesize by 2) when computing title_overhead,
video_overhead, and proposal_url_overhead inside the calculate_max_body_length
method.
---
Nitpick comments:
In `@app/services/decidim/chatbot/media/video_embed_extractor.rb`:
- Around line 45-63: The current flow repeatedly runs regex matches via
youtube?, vimeo?, and extract_video_id; refactor to perform a single-pass
extraction that caches both the provider and the match result: add a private
method (or memoized attribute) like parsed_match or provider_match that runs
html.match once against YOUTUBE_PATTERN and VIMEO_PATTERN (in that order or via
combined logic), set a cached provider flag (e.g., :youtube or :vimeo) and
extract the id from the match; then update video_id to return the cached id and
update youtube? and vimeo? (and thumbnail_url if present) to consult the cached
provider instead of re-matching. Ensure unique symbols referenced: video_id,
youtube?, vimeo?, extract_video_id, and thumbnail_url.
In `@spec/services/media/video_embed_extractor_spec.rb`:
- Around line 42-131: Add a spec that covers an HTTP iframe to exercise the
regex's http path: create a new context in
spec/services/media/video_embed_extractor_spec.rb under the YouTube iframe
contexts that provides HTML like an iframe with
src="http://www.youtube.com/embed/dQw4w9WgXcQ" (also test query param and short
youtu.be variants if desired), then instantiate described_class (the
VideoEmbedExtractor used as extractor in tests) and assert extractor.url equals
"https://www.youtube.com/watch?v=dQw4w9WgXcQ", extractor.valid? is true, and
extractor.thumbnail_url equals
"https://img.youtube.com/vi/dQw4w9WgXcQ/hqdefault.jpg" so the HTTP scheme branch
of the pattern is verified.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
app/services/decidim/chatbot/media/video_embed_extractor.rbapp/services/decidim/chatbot/providers/whatsapp/envelopes/interactive_buttons.rbapp/services/decidim/chatbot/workflows/proposals_workflow.rbspec/services/media/video_embed_extractor_spec.rbspec/services/workflows/proposals_workflow_spec.rb
🚧 Files skipped from review as they are similar to previous changes (1)
- spec/services/workflows/proposals_workflow_spec.rb
| def calculate_max_body_length(video, title_text, proposal_url) | ||
| # WhatsApp body text limit is 1024 characters | ||
| total_limit = 1024 | ||
|
|
||
| # Calculate fixed overhead using pre-calculated values | ||
| title_overhead = title_text.length + 4 # "*title*\n\n" | ||
| video_overhead = video.valid? ? video.url.length + 4 : 0 # "🎥 url\n\n" | ||
| proposal_url_overhead = proposal_url.length + 2 # "\n\nurl" | ||
|
|
||
| # Reserve space for newlines and formatting | ||
| reserved_space = title_overhead + video_overhead + proposal_url_overhead | ||
|
|
||
| # Return available space for body text, with minimum of 100 chars | ||
| [total_limit - reserved_space, 100].max | ||
| end |
There was a problem hiding this comment.
Emoji 🎥 may count as 2 characters in WhatsApp's UTF-16 encoding.
Ruby's String#length counts 🎥 as 1 character, but WhatsApp uses UTF-16 encoding where this emoji (U+1F3A5, outside BMP) occupies 2 code units. The video_overhead calculation at line 99 would undercount by 1, potentially pushing the total body to 1025 characters in edge cases.
A safe fix:
🛡️ Proposed fix
- video_overhead = video.valid? ? video.url.length + 4 : 0 # "🎥 url\n\n"
+ video_overhead = video.valid? ? video.url.length + 5 : 0 # "🎥 url\n\n" (emoji is 2 UTF-16 code units)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/services/decidim/chatbot/workflows/proposals_workflow.rb` around lines 93
- 107, The current calculate_max_body_length uses String#length which counts
characters, but WhatsApp enforces a UTF-16 code unit limit (emoji like 🎥 are 2
units); update calculate_max_body_length to compute lengths in UTF-16 code units
instead of .length for the parts that contribute to the overhead (title_text,
video.url when video.valid?, and proposal_url) so video_overhead correctly
accounts for surrogate pairs; implement this by replacing uses of .length with a
UTF-16 code unit count (e.g., encode to 'UTF-16BE' and divide bytesize by 2)
when computing title_overhead, video_overhead, and proposal_url_overhead inside
the calculate_max_body_length method.
Summary
This PR implements issue #16 - Use embedded video iframes in proposal details (WhatsApp interactive reply).
Changes
1. Video URL & Thumbnail Extraction
VideoEmbedExtractorservice class to parse iframe URLs from proposal bodies2. Updated ProposalsWorkflow
3. Comprehensive Tests
VideoEmbedExtractorcovering YouTube, Vimeo, and edge casesTechnical Notes
.mp4/.3gppfiles)Testing
All tests pass and cover:
Closes #16
Summary by CodeRabbit
New Features
Tests