Handle nested divs in first_item_content #3348
until element.name == "h2"
first_item_text += element.text if element.name.in?(allowed_elements)
allowed_elements = %w[p ul ol div]
div is checked separately on line 53, so adding it here doesn't change anything and is confusing; it can be removed from this line. Line 55 can then also be changed to not exclude it.
until element.nil? || element.name == "h2"
if element.name == "div"
element.children.each do |child|
The first part of the if only checks a single level of divs, so it won't find paragraphs inside nested divs, e.g.:
<div>
<div>
<div>
<p>
text...
</p>
</div>
</div>
</div>
It might solve the immediate problem with the particular document, but the problem will reappear when someone adds a document with more deeply nested divs.
If we want to find all nested paragraphs, a recursive solution would work, but is it what we want to do in the first place?
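The difference between a single-level check and a recursive one can be sketched as follows. This is a minimal illustration, not the code from this PR: Node is a hypothetical stand-in for a parsed HTML element, and single_level_text and nested_text are names invented for the example.

```ruby
# Hypothetical stand-in for a parsed HTML element (assumption: the real
# code works on elements that expose name, text and children).
Node = Struct.new(:name, :text, :children)

ALLOWED = %w[p ul ol].freeze

# Single level: only looks at the direct children of the given div.
def single_level_text(div)
  div.children.select { |child| ALLOWED.include?(child.name) }.map(&:text).join
end

# Recursive: descends through any depth of nested divs.
def nested_text(element)
  return element.text if ALLOWED.include?(element.name)
  return "" unless element.name == "div"

  element.children.map { |child| nested_text(child) }.join
end

# The three-level structure from the comment above.
paragraph = Node.new("p", "text...", [])
nested = Node.new("div", "", [Node.new("div", "", [Node.new("div", "", [paragraph])])])

single_level_text(nested) # => ""
nested_text(nested)       # => "text..."
```

The recursive version also collects every allowed element anywhere under the div, which is the behaviour the question above asks us to confirm we actually want.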
Yeah that makes sense - I'll change this so I look for more deeply nested divs as like you said I only considered the first level 😅
Updated the first_item_content method so that it processes child elements within nested <div> tags. This ensures that valid content, such as <p>, <ul> and <ol> elements inside deeply nested <div> structures, is included when evaluating the content length. The method can now determine whether there is enough content to render a content list. This addresses an issue where it failed to display a list due to missing content within nested <div> elements.
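As a rough sketch of the behaviour described in this commit message, the walk over the first item's content might look like the following. It is written under stated assumptions and is not the actual contents_list.rb code: Element is a hypothetical stand-in for a parsed HTML element, the siblings are passed in as a plain array rather than followed through the document, and collect_text and first_item_text are illustrative names.

```ruby
# Hypothetical stand-in for a parsed HTML element.
Element = Struct.new(:name, :text, :children)

ALLOWED_ELEMENTS = %w[p ul ol].freeze

# Returns the text of an allowed element, or of allowed elements found
# at any depth inside a div. Anything else contributes nothing.
def collect_text(element)
  return element.text if ALLOWED_ELEMENTS.include?(element.name)
  return "" unless element.name == "div"

  element.children.map { |child| collect_text(child) }.join
end

# Accumulates text from the elements following the first heading and
# stops at the next h2, or when the elements run out (the nil guard).
def first_item_text(siblings)
  text = ""
  index = 0
  element = siblings[index]
  until element.nil? || element.name == "h2"
    text += collect_text(element)
    index += 1
    element = siblings[index]
  end
  text
end

siblings = [
  Element.new("p", "a", []),
  Element.new("div", "", [Element.new("div", "", [Element.new("ul", "b", [])])]),
  Element.new("table", "x", []),
  Element.new("h2", "Next", []),
  Element.new("p", "c", [])
]

first_item_text(siblings) # => "ab"
```

The length of the returned text is what a caller would compare against whatever threshold the contents list rules use; that threshold is not shown here because it is not stated in this conversation.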
end
end

def extract_nested_content(element)
This is some nice fancy code!! I think we should first check that this is behaviour we actually want, though. I'm wondering if it might result in longer than expected contents lists as we nibble through every element on the page. Do you happen to have an example of a page with very nested content so we can look at the before and after?
@CodeSonia I think this PR can be closed, right? Could you link to the replacement when you close it?
Yes, that's correct! Once I've got the test cases written for doc collections and the PR description, I'll close this!
This PR can be closed as the tests have been refactored and a new PR has been generated here: #3361
What
Trello card
This PR addresses an issue where the content list was not displayed on some publication pages due to the way nested content elements were handled in the show_contents_list? method.
Context:
The show_contents_list? method uses various conditions to determine whether to display a content list. These rules were first introduced in PR #719.
Recently, a Zendesk ticket reported that the content list was missing from the DVSA Earned Recognition page, resulting in an empty block of space on the left side of the publication.
The issue was traced to the method's inability to handle deeply nested content inside div elements.
Debugging:
During investigation, the following issues were identified on that page:
However, this is not accurate, because the content includes multiple headings and subsections:
h2 section for "Driver system providers"
h3 subsections like "A", "B", and "C", each containing p elements with contact details, all wrapped inside nested div tags.
The show_contents_list? method wasn't picking up this content due to the nested structure. As a result, the conditions returned false, causing the content list to be omitted.
Solution:
I updated the first_item_content method in contents_list.rb to handle nested divs and introduced other methods to extract valid elements (p, ul, and ol) inside a nested div element, ensuring that all content is considered.
Tests:
Several test cases have been added:
div elements.
divs.
Testing:
The page now correctly renders the content list:
Why
Without this fix, pages with only two content items and no recognised long content could fail to render the content list, leaving an empty block of whitespace that negatively impacts user experience.
Visual Changes