Skip to content

Conversation

cglukas
Copy link

@cglukas cglukas commented Sep 15, 2025

Purpose

The built-in search is capable of excluding search terms. Thats a great feature which would make the search a lot better!
Unfortunately, there are two blocking components:

  • The splitQuery will discard hyphens which define the excluded terms
  • The performTermsSearch will abort the search if any excluded term is matched

References

Closes #13892

.filter((term) => term); // remove remaining empty strings
var splitQuery = (query) => {
const consecutiveLetters =
/[\p{Letter}\p{Number}_\p{Emoji_Presentation}]+/gu;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK if it's bad to define is regex first and reference it later in the new regex. It could also be plain text. But I thought this way, the origin of the regex can be easier understood.

})
)
break;
continue;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably the most important change!

@cglukas
Copy link
Author

cglukas commented Sep 15, 2025

Regarding the CI: I don't see any changes on my PR which would trigger the current CI fail. Is this a common issue? TBH it does not look like it's affecting the master branch too 🤔. I'm a little aimless what to do here.

@jayaddison
Copy link
Contributor

Regarding the CI: I don't see any changes on my PR which would trigger the current CI fail. Is this a common issue? TBH it does not look like it's affecting the master branch too 🤔. I'm a little aimless what to do here.

That's OK, yep - I believe that is due to bug #13886 (in progress, potentially to be fixed by #13883).

@jayaddison
Copy link
Contributor

A delayed thought here: adding the exclusion operator to hyphenated query terms could cause unexpected results.

For example, the query example -test-case currently parses to ["example", "-test", "case"], I think.

Comment on lines +182 to +195
var splitQuery = (query) => {
const consecutiveLetters =
/[\p{Letter}\p{Number}_\p{Emoji_Presentation}]+/gu;
const searchWords = new RegExp(
`(${consecutiveLetters.source})|\\s(-${consecutiveLetters.source})`,
"gu",
);
return Array.from(
query
.matchAll(searchWords)
.map((results) => results[1] ?? results[2]) // select one of the possible groups (e.g. "word" or "-word").
.filter((term) => term), // remove remaining empty strings.
);
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent some time trying to find an equivalent that is more minimal in terms of lines-of-code / characters-of-code changed.

The following isn't hugely readable -- it's a complex regex, but it essentially enables splits on the - character, provided that a lookbehind for whitespace fails.

In other words: adds - as a split boundary, but only if it is found within a word.

Suggested change
var splitQuery = (query) => {
const consecutiveLetters =
/[\p{Letter}\p{Number}_\p{Emoji_Presentation}]+/gu;
const searchWords = new RegExp(
`(${consecutiveLetters.source})|\\s(-${consecutiveLetters.source})`,
"gu",
);
return Array.from(
query
.matchAll(searchWords)
.map((results) => results[1] ?? results[2]) // select one of the possible groups (e.g. "word" or "-word").
.filter((term) => term), // remove remaining empty strings.
);
};
var splitQuery = (query) =>
query
.split(/(?<!\s)[-]|[^\p{Letter}\p{Number}\-_\p{Emoji_Presentation}]+/gu)
.filter((term) => term); // remove remaining empty strings

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idk how to apply this suggestion on the mobile app 😅. Will do it at home.

Comment on lines +643 to +652
[...excludedTerms].some((excludedTerm) => {
// Both mappings will contain either a single integer or a list of integers.
// Converting them to lists makes the comparison more readable.
let excludedTermFiles = [].concat(terms[excludedTerm]);
let excludedTitleFiles = [].concat(titleTerms[excludedTerm]);
return (
excludedTermFiles.includes(file)
|| excludedTitleFiles.includes(file)
);
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[...excludedTerms].some((excludedTerm) => {
// Both mappings will contain either a single integer or a list of integers.
// Converting them to lists makes the comparison more readable.
let excludedTermFiles = [].concat(terms[excludedTerm]);
let excludedTitleFiles = [].concat(titleTerms[excludedTerm]);
return (
excludedTermFiles.includes(file)
|| excludedTitleFiles.includes(file)
);
})
[...excludedTerms].some(
(term) =>
terms[term] === file
|| titleTerms[term] === file
|| (terms[term] || []).includes(file)
|| (titleTerms[term] || []).includes(file),
)

Do we need this set of excludedTerms filtering changes? I would suggest not modifying these lines unless strictly necessary. Tests continue to pass when I revert this.

I acknowledge that the break to continue fixup is important though.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll need to add another test then. The last conditions raise an error in some cases. I'll add it soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Word exclusion breaks search

3 participants