Skip to content

Remove source database filter limit #1720

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Dec 6, 2024

Conversation

JeltevanBoheemen
Copy link
Contributor

@BeritJanssen FYI: User reported not being able to filter all existing corpora. The fix was easy enough to not warrant waiting for the next release.

Current setting is smaller that the amount of sub-corpora.

This branch is currently deployed on the peace-portal server. After including this in the newest release, that server should be reset to master.

@JeltevanBoheemen
Copy link
Contributor Author

@lukavdplas Please inform me when making a release so I can make sure we don't get conflicting migrations on the peace portal erver

@BeritJanssen
Copy link
Contributor

This branch does more than remove the source database limit, it also adds a corpus definition, and integrates the changes of feature/fixed-filter-order. Was this the intention? Or did you mean to just propose a hotfix (intended for the Peace Portal servers) here with minimal changes?

@JeltevanBoheemen
Copy link
Contributor Author

JeltevanBoheemen commented Dec 4, 2024

The branch indeed does more then intended at first:

  • loadcorpora did not work for many of the peaceportal corpora, fixes that
  • Source Database was moved to the bottom of the filterlist, hence the merging of feature/fixed-filter-order

In essence; it is a hotfix to get Peace Portal into a working state with all the current corpora again. I will take care of making a nice relase after the main I-analyzer release is out (today or tomorrow). This branch will probably be discarded, and the relevant changes implemented separately.

@lukavdplas
Copy link
Contributor

This branch included the changes from #1704, which is now merged into develop. So there should be no issues with migrations 👍

@JeltevanBoheemen
Copy link
Contributor Author

@BeritJanssen Could you review these changes?

  • Increased the options counts for most fields to ensure new corpora don't cut off dropdown menus
  • Added preliminary version of the North Africa corpus definition. This can be used to index a working version, but three fields need refinement in the data, and then should be added to the definition. Client will send the updated data soon.

Copy link
Contributor

@BeritJanssen BeritJanssen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking in the documentation, the limit for the buckets which are returned standard from the Elasticsearch aggregation are 10, which is rather low for most purposes. At the same time, the documentation suggests that fetching up to a 1000 buckets shouldn't be a problem. Perhaps we can set 50 as a more reasonable default and document that if there are more than 50 distinct terms in a field with a MultipleChoiceFilter, the min_count should be increased, up to a limit of 1000? This can be left for later. Now the source databases limit is de facto 10, which is fine for now.

@JeltevanBoheemen JeltevanBoheemen merged commit 9a3810b into develop Dec 6, 2024
1 check passed
@JeltevanBoheemen JeltevanBoheemen deleted the feature/peaceportal-sourcedatabase branch December 6, 2024 16:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants