Extract video URLs and thumbnails from proposal iframes #28
Changes from all commits
0021c50
cbdd6fd
ef7b83d
ec8e7db
c7e1649
0eebe12
b777090
5956861
3f480fb
43aba7b
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,95 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| module Decidim | ||
| module Chatbot | ||
| module Media | ||
| # Service class to extract video URLs from HTML iframes | ||
| # Supports YouTube and Vimeo embeds | ||
| class VideoEmbedExtractor | ||
| # Regex patterns to match iframe src attributes for supported video platforms | ||
| YOUTUBE_PATTERN = %r{<iframe[^>]+src=["\']https?://(?:www\.)?(?:youtube\.com/embed/|youtu\.be/|youtube-nocookie\.com/embed/)([a-zA-Z0-9_-]+)(?:[?&][^"\']*)?["\'][^>]*>}i | ||
| VIMEO_PATTERN = %r{<iframe[^>]+src=["\']https?://(?:www\.)?player\.vimeo\.com/video/(\d+)(?:[?&][^"\']*)?["\'][^>]*>}i | ||
|
|
||
| # Initializes a new extractor with the given HTML content | ||
| # @param html [String] The HTML string to parse for embedded videos | ||
| def initialize(html) | ||
| @html = html | ||
| end | ||
|
|
||
| attr_reader :html | ||
|
|
||
| def url | ||
| return nil if html.blank? | ||
|
|
||
| @url ||= extract_youtube || extract_vimeo | ||
| end | ||
|
|
||
| def valid? | ||
| url.present? | ||
| end | ||
|
|
||
| # Returns the video thumbnail URL | ||
| # @return [String, nil] The thumbnail URL or nil if no video found | ||
| def thumbnail_url | ||
| return nil unless valid? | ||
|
|
||
| @thumbnail_url ||= if youtube? | ||
| youtube_thumbnail_url | ||
| elsif vimeo? | ||
| vimeo_thumbnail_url | ||
| end | ||
| end | ||
|
|
||
| private | ||
|
|
||
| def video_id | ||
| @video_id ||= extract_video_id | ||
| end | ||
|
|
||
| def youtube? | ||
| html.match?(YOUTUBE_PATTERN) | ||
| end | ||
|
|
||
| def vimeo? | ||
| html.match?(VIMEO_PATTERN) | ||
| end | ||
|
|
||
| def extract_video_id | ||
| if youtube? | ||
| html.match(YOUTUBE_PATTERN)&.[](1) | ||
| elsif vimeo? | ||
| html.match(VIMEO_PATTERN)&.[](1) | ||
| end | ||
| end | ||
|
|
||
| def extract_youtube | ||
| return nil unless youtube? | ||
|
|
||
| "https://www.youtube.com/watch?v=#{video_id}" | ||
| end | ||
|
|
||
| def extract_vimeo | ||
| return nil unless vimeo? | ||
|
|
||
| "https://vimeo.com/#{video_id}" | ||
| end | ||
|
|
||
| # Returns YouTube thumbnail URL | ||
| def youtube_thumbnail_url | ||
| # hqdefault (480×360) is guaranteed to exist for every published video. | ||
| # maxresdefault (1280×720) is only generated for HD uploads and 404s otherwise. | ||
| "https://img.youtube.com/vi/#{video_id}/hqdefault.jpg" | ||
| end | ||
microstudi marked this conversation as resolved.
|
||
|
|
||
| # Returns Vimeo thumbnail URL using oEmbed API pattern | ||
| # Note: This could be enhanced to fetch actual thumbnail via HTTP request | ||
| def vimeo_thumbnail_url | ||
| # Vimeo doesn't have a predictable thumbnail URL pattern like YouTube | ||
| # We'd need to make an API call to get it, but for now return nil | ||
| # or fetch it via: https://vimeo.com/api/oembed.json?url=https://vimeo.com/{video_id} | ||
| nil | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end | ||
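The `vimeo_thumbnail_url` method above returns `nil` and points at Vimeo's oEmbed endpoint as a possible enhancement. The following is a minimal sketch of what that lookup might look like, not code from this PR. The module name `VimeoThumbnailSketch`, its method names, and the 2-second timeout are hypothetical, and the sketch assumes the oEmbed JSON response carries a `thumbnail_url` field.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper (not part of the PR): resolves a Vimeo thumbnail via oEmbed.
module VimeoThumbnailSketch
  # Endpoint taken from the comment in vimeo_thumbnail_url.
  OEMBED_ENDPOINT = "https://vimeo.com/api/oembed.json"

  module_function

  # Builds the oEmbed request URI for a Vimeo video id.
  def oembed_uri(video_id)
    uri = URI(OEMBED_ENDPOINT)
    uri.query = URI.encode_www_form(url: "https://vimeo.com/#{video_id}")
    uri
  end

  # Extracts thumbnail_url from an oEmbed JSON body.
  # Assumption: the response is a JSON object with a "thumbnail_url" string.
  # Returns nil when the body is malformed or the field is missing.
  def parse_thumbnail(json_body)
    data = JSON.parse(json_body)
    return nil unless data.is_a?(Hash)

    value = data["thumbnail_url"]
    value.is_a?(String) && !value.empty? ? value : nil
  rescue JSON::ParserError
    nil
  end

  # Performs the HTTP request. Any network or HTTP failure yields nil so the
  # caller can fall back to the proposal photo, as the workflow already does.
  def fetch(video_id, timeout: 2)
    uri = oembed_uri(video_id)
    response = Net::HTTP.start(uri.host, uri.port, use_ssl: true,
                               open_timeout: timeout, read_timeout: timeout) do |http|
      http.get(uri.request_uri)
    end
    response.is_a?(Net::HTTPSuccess) ? parse_thumbnail(response.body) : nil
  rescue StandardError
    nil
  end
end
```

Because every failure path returns `nil`, a caller written as `video.thumbnail_url.presence || resource_url(...)` would keep its current fallback behaviour. A real integration would probably also want caching, since this adds one HTTP round trip per proposal card.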
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -39,11 +39,12 @@ def send_cards | |||||||||||||||||||||
| type: :interactive_carousel, | ||||||||||||||||||||||
| body_text: body, | ||||||||||||||||||||||
| cards: current_proposals.map do |proposal| | ||||||||||||||||||||||
| video = Decidim::Chatbot::Media::VideoEmbedExtractor.new(translated_attribute(proposal.body)) | ||||||||||||||||||||||
| { | ||||||||||||||||||||||
| id: proposal.id, | ||||||||||||||||||||||
| title: I18n.t("decidim.chatbot.workflows.proposals.buttons.view_proposal"), | ||||||||||||||||||||||
| body_text: sanitize_text(proposal.title, 60).presence || I18n.t("decidim.chatbot.workflows.proposals.buttons.view_proposal"), | ||||||||||||||||||||||
| image_url: resource_url(proposal.photo, fallback_image: true) | ||||||||||||||||||||||
| image_url: video.thumbnail_url.presence || resource_url(proposal.photo, fallback_image: true) | ||||||||||||||||||||||
| } | ||||||||||||||||||||||
| end | ||||||||||||||||||||||
| ) | ||||||||||||||||||||||
|
|
@@ -52,11 +53,28 @@ def send_cards | |||||||||||||||||||||
| def send_proposal_details | ||||||||||||||||||||||
| return process_unprocessable_input unless proposal | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| body = "*#{sanitize_text(proposal.title, 100)}*\n\n#{sanitize_text(proposal.body, 800)}\n\n#{resource_url(proposal)}" | ||||||||||||||||||||||
| # Check if proposal body contains a video iframe | ||||||||||||||||||||||
| video = Decidim::Chatbot::Media::VideoEmbedExtractor.new(translated_attribute(proposal.body)) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Pre-calculate title and URL to avoid redundant method calls | ||||||||||||||||||||||
| title_text = sanitize_text(proposal.title, 100) | ||||||||||||||||||||||
| proposal_url = resource_url(proposal) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Calculate available space for body text and sanitize accordingly | ||||||||||||||||||||||
| body_text = sanitize_text(proposal.body, calculate_max_body_length(video, title_text, proposal_url)) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Build body text with video URL if present | ||||||||||||||||||||||
| body = "*#{title_text}*\n\n" | ||||||||||||||||||||||
| body += "🎥 #{video.url}\n\n" if video.valid? | ||||||||||||||||||||||
| body += "#{body_text}\n\n#{proposal_url}" | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Use video thumbnail as header image if available, otherwise use proposal photo with fallback | ||||||||||||||||||||||
| header_image = video.thumbnail_url.presence || resource_url(proposal.photo, fallback_image: true) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| send_message!( | ||||||||||||||||||||||
| type: :interactive_buttons, | ||||||||||||||||||||||
| body_text: body, | ||||||||||||||||||||||
| header_image: resource_url(proposal.photo), | ||||||||||||||||||||||
| header_image:, | ||||||||||||||||||||||
| footer_text: sanitize_text(proposal.creator_author&.presenter&.name, 60), | ||||||||||||||||||||||
| buttons: [ | ||||||||||||||||||||||
| { | ||||||||||||||||||||||
|
|
@@ -67,6 +85,27 @@ def send_proposal_details | |||||||||||||||||||||
| ) | ||||||||||||||||||||||
| end | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Calculate maximum body length dynamically to stay within 1024 char limit | ||||||||||||||||||||||
| # @param video [VideoEmbedExtractor] The video extractor instance | ||||||||||||||||||||||
| # @param title_text [String] The sanitized title text | ||||||||||||||||||||||
| # @param proposal_url [String] The proposal URL | ||||||||||||||||||||||
| # @return [Integer] Maximum allowed length for body text | ||||||||||||||||||||||
| def calculate_max_body_length(video, title_text, proposal_url) | ||||||||||||||||||||||
| # WhatsApp body text limit is 1024 characters | ||||||||||||||||||||||
| total_limit = 1024 | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Calculate fixed overhead using pre-calculated values | ||||||||||||||||||||||
| title_overhead = title_text.length + 4 # "*title*\n\n" | ||||||||||||||||||||||
| video_overhead = video.valid? ? video.url.length + 4 : 0 # "🎥 url\n\n" | ||||||||||||||||||||||
| proposal_url_overhead = proposal_url.length + 2 # "\n\nurl" | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Reserve space for newlines and formatting | ||||||||||||||||||||||
| reserved_space = title_overhead + video_overhead + proposal_url_overhead | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| # Return available space for body text, with minimum of 100 chars | ||||||||||||||||||||||
| [total_limit - reserved_space, 100].max | ||||||||||||||||||||||
|
Comment on lines +105 to +106
|
||||||||||||||||||||||
| # Return available space for body text, with minimum of 100 chars | |
| [total_limit - reserved_space, 100].max | |
| # Return available space for body text, ensuring we never exceed the total limit | |
| return 0 if reserved_space >= total_limit | |
| total_limit - reserved_space |
Copilot AI, Feb 25, 2026
If the combined length of the title, video URL, and proposal URL exceeds 924 characters (1024 - 100), the method still returns 100 for the body text regardless of the space actually available, so the final message can exceed 1024 characters. For example, if title + video_url + proposal_url = 950 chars, the method returns 100 (not 74), and the final message would be 950 + 100 = 1050 chars. Consider using `[total_limit - reserved_space, 0].max` instead, or adding validation that the total message length respects the 1024 limit.
| # Return available space for body text, with minimum of 100 chars | |
| [total_limit - reserved_space, 100].max | |
| # Return available space for body text, ensuring total length does not exceed the limit | |
| [total_limit - reserved_space, 0].max |
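The arithmetic in this comment can be checked with a small standalone sketch. It is not the PR's code: the two method names are hypothetical, and `reserved_space` stands in for the combined title, video, and URL overhead computed in `calculate_max_body_length`.

```ruby
# Standalone sketch comparing the two floors discussed in the review comment.
TOTAL_LIMIT = 1024 # body text limit quoted in the PR

# Original behaviour: never return less than 100, even when less space remains.
def max_body_length_with_floor(reserved_space)
  [TOTAL_LIMIT - reserved_space, 100].max
end

# Suggested behaviour: clamp at zero so overhead plus body never exceeds the limit.
def max_body_length_clamped(reserved_space)
  [TOTAL_LIMIT - reserved_space, 0].max
end

reserved = 950
puts reserved + max_body_length_with_floor(reserved) # 1050, over the limit
puts reserved + max_body_length_clamped(reserved)    # 1024, exactly at the limit
```

With the zero clamp the body can shrink to nothing when the overhead alone fills the message, which may be why the alternative suggestion also mentions validating the total length.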
Emoji 🎥 may count as 2 characters in WhatsApp's UTF-16 encoding.
Ruby's String#length counts 🎥 as 1 character, but WhatsApp uses UTF-16 encoding where this emoji (U+1F3A5, outside the BMP) occupies 2 code units. The video_overhead calculation at line 99 would undercount by 1, potentially pushing the total body to 1025 characters in edge cases.
A safe fix:
🛡️ Proposed fix
- video_overhead = video.valid? ? video.url.length + 4 : 0 # "🎥 url\n\n"
+ video_overhead = video.valid? ? video.url.length + 5 : 0 # "🎥 url\n\n" (emoji is 2 UTF-16 code units)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/services/decidim/chatbot/workflows/proposals_workflow.rb` around lines 93 - 107, the current calculate_max_body_length uses String#length, which counts characters, but WhatsApp enforces a UTF-16 code unit limit (emoji like 🎥 are 2 units). Update calculate_max_body_length to compute lengths in UTF-16 code units instead of .length for the parts that contribute to the overhead (title_text, video.url when video.valid?, and proposal_url) so video_overhead correctly accounts for surrogate pairs. Implement this by replacing uses of .length with a UTF-16 code unit count (e.g., encode to 'UTF-16BE' and divide bytesize by 2) when computing title_overhead, video_overhead, and proposal_url_overhead inside the calculate_max_body_length method.
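The counting approach described in this comment might look like the sketch below. The helper name `utf16_length` is hypothetical, and whether WhatsApp really measures its limit in UTF-16 code units is the reviewer's claim, which this sketch does not verify. It only shows how the two counts differ for the video line.

```ruby
# Hypothetical helper: counts UTF-16 code units rather than Ruby characters.
# Each UTF-16BE code unit is 2 bytes, so bytesize / 2 gives the unit count.
def utf16_length(text)
  text.encode("UTF-16BE").bytesize / 2
end

# The video line as built in send_proposal_details, with a sample URL.
line = "\u{1F3A5} https://vimeo.com/12345\n\n"

puts line.length        # 27: the emoji counts as one character
puts utf16_length(line) # 28: the emoji is a surrogate pair, i.e. two code units
```

Applying the same helper to the title and proposal URL, as the comment proposes, would also cover emoji or other non-BMP characters that appear in user-written titles, not only the fixed 🎥 prefix.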