Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash 3.4.0 - libxml2.2.dylib: xmlMallocZero +48 #853

Closed
BPerlakiH opened this issue Jul 10, 2024 · 4 comments · Fixed by #865
Closed

Crash 3.4.0 - libxml2.2.dylib: xmlMallocZero +48 #853

BPerlakiH opened this issue Jul 10, 2024 · 4 comments · Fixed by #865
Assignees
Milestone

Comments

@BPerlakiH
Copy link
Collaborator

Screenshot 2024-07-10 at 11 30 28
@BPerlakiH BPerlakiH self-assigned this Jul 10, 2024
@kelson42 kelson42 added this to the 3.4.1 milestone Jul 10, 2024
@BPerlakiH
Copy link
Collaborator Author

BPerlakiH commented Jul 13, 2024

@kelson42 I had a deeper look at this.
We have a dependency called Fuzi, which is a swift wrapper around libxml2.
It is linking the system provided version of libxml2 at build time.

There main issue is that it won't fail properly if initialised with invalid xml/html content data, there's already a long standing issue on that on Fuzi's side.

I implemented the proposed changes for that, and linked my version of Fuzi to the project, but it's still not 100%, as it is not failing properly under all conditions. Additional to that some issues have been already fixed in the underlying libxml2 as well. comparing my local version (2.09) and the latest libxml2 version 2.13.2.

We have a couple of options here, bearing in mind that only 2 devices crashed:

  • A) try to enforce the latest version of libxml2 by manually adding it and linking it, and use a fork of Fuzi for that
  • B) switch Fuzi and libxml2 to something else (quality of 3rd party packages might vary, so there's no guarantee it will be an actual improvement)
  • C) we use the html parsing in 2 places, the search and the bookmark. For the bookmarks there's already a proposition to remove the html parsing. We can do the same for search. The difference is:

Search results with default html parsing option (first sentence):

Screenshot 2024-07-13 at 13 01 35

Search results with html parsing disabled:

Screenshot 2024-07-13 at 13 01 20

We can be more subtle here, either:

  • C1) disable this functionality in the next release, so the html parsing won't be used, if we find a better replacement for it we can re-enable it.
  • C2) Remove html parsing completely, including the feature flag and settings of this.
  • D) Postpone this fix

@kelson42
Copy link
Contributor

@BPerlakiH Thank you for the analysis. Please make a PR to remove for bookmarks. For search results, the libkiwix has a function to provide the HTML snippet, so I don't understand why this is not used!

@BPerlakiH
Copy link
Collaborator Author

BPerlakiH commented Jul 13, 2024

Thank you @kelson42.
Here's the updated PR to remove parsing from bookmarks: #830.
As part of the issue then, I will look into how we could use the html snippet from libkiwix, instead of this parsing.

@BPerlakiH
Copy link
Collaborator Author

Ok, I figured this out. We have 4 options to display some extra field as part of the search results.
It can be:

  • Disabled
  • First paragraph
  • First sentence
  • Matches

First paragraph and First sentence is using the html parser to get some more info from the content.
Matches, is actually using what you were referring to @kelson42 , which is the matching snippet from libzim.
If we turn on "matches", the results may vary, as we have 2 types of search results, one where the title is matching, one is matching the indexed content:
Screenshot 2024-07-13 at 21 35 49

As you can see, where the title matches there's no additional info, whereas where the content is matching there's an extra field.

This is is to be compared with the default behaviour (First Paragraph):
Screenshot 2024-07-13 at 21 37 45

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants