Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Field display improvements #1710

Merged
merged 11 commits into from
Dec 4, 2024
Merged

Field display improvements #1710

merged 11 commits into from
Dec 4, 2024

Conversation

lukavdplas
Copy link
Contributor

@lukavdplas lukavdplas commented Nov 20, 2024

This makes some minor updates to the components to display fields. Includes Jelte's suggestions in #1708

Also:

  • Improves formatting for keyword fields if the value is an array. (a, b instead of a,b)
  • Dates are now displayed as "Jan 1, 2024".
  • Increases the preview size for text content fields when highlighting is off.

Internal changes:

  • Extracts the logic for content fields to a ContentFieldComponent.
  • Moved some components and pipes that were part of the DocumentModule to /frontend/src/app/document/.
  • Removed some exports from the DocumentModule that were not actually used anywhere else.
  • Removed the unused RegexHighlightPipe.

@lukavdplas lukavdplas added code quality code & performance improvements that do not affect user functionality frontend changes to the angular frontend labels Nov 20, 2024
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need the regex-highlight pipe for the following situations:

  • highlighting in the manual
  • highlighting when the Elasticsearch settings don't allow highlighting via ES.

Copy link
Contributor

@BeritJanssen BeritJanssen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully agree with the reorganization of everything document-related into the document folder. Thanks! Also good to have a keyword pipe to create better formatting of the keyword content. Just a double-check: this will only affect the display of the keyword field, and not affect the internal representation, so that no spaces will be added to the route, right?

I do not agree with deleting the regex pipe. I see that indeed the document views don't use it anymore (though I think there were still cases where Elasticsearch highlight is not an option, something to discuss IRL), but for highlighting search terms in the manual it should still be relevant.

@lukavdplas
Copy link
Contributor Author

lukavdplas commented Nov 21, 2024

I think there were still cases where Elasticsearch highlight is not an option

I looked through the highlighting documentation to figure out what kind of cases you could be referring to, but I can't find any. To my surprise, even a text mapping is not a strict requirement.

As far as I know, the only potential issue is that that the fvh algorithm has some extra requirements, so you should choose the algorithm based on the field mapping. The implementation of this choice in /frontend/src/app/utils/es-query.ts looks a bit shaky to me, but that's an unrelated issue.

highlighting search terms in the manual it should still be relevant.

What do you mean by "should be relevant"? It's not used (see #1593). If you believe this should be re-introduced, can you make an issue for that?

@BeritJanssen
Copy link
Contributor

Ah, I didn't see #1593, so I assumed that the manual search was still a reality. In that case, it's indeed fine to remove the regex highlighter. The fvh search requires term vectors on the field level (not only in subfields of the multifield), but I believe the necessary reindex operations have been done. And indeed, Elasticsearch highlighting is an option at any rate, but complex queries won't get highlighted then.

@lukavdplas
Copy link
Contributor Author

but complex queries won't get highlighted then.

If you mean things like wildcards or fuzzy matching, if I recall correctly, that was the reason why we stopped using the regex highlighter (which does not support those).

The documentation does suggest that certain types of queries are not fully supported on all algorithms. (fvh doesn't support span queries, and unified is said to support "accurate phrase and multi-term (fuzzy, prefix, regex) highlighting" - suggesting some other algorithms do not.)

This isn't necessarily relevant for us because we only use simple_query_string queries. But any limitations described in the documentation point to unified as the most accurate - which is the algorithm we use if fvh is not supported.

I can't find any example of unified being less accurate for complex queries. Can you give an example of such a query? Or point to the documentation where I can read more about this?

@BeritJanssen
Copy link
Contributor

BeritJanssen commented Dec 4, 2024

I haven't implemented the Elasticsearch highlighting, but I remembered there were some limitations of what could be done without the fvh highlighter. All indices had to be reindexed for the ES highlighting feature is all I remember. Looking back at the documentation, perhaps being able to break the highlighting with the boundary_chars characters was the motivation here. Complex queries seem to be supported in any case, nice! @Meesch , do you remember?

Anyway, fine to merge this.

@lukavdplas lukavdplas merged commit 34261a5 into develop Dec 4, 2024
1 check passed
@lukavdplas lukavdplas deleted the feature/field-display branch December 4, 2024 11:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code quality code & performance improvements that do not affect user functionality frontend changes to the angular frontend
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants