Deprecate the LegacyBM25Similarity class and default to BM25Similarity #17315

Open
prudhvigodithi opened this issue Feb 11, 2025 · 1 comment · Fixed by #17306
Labels
enhancement Enhancement or improvement to existing feature or request Search:Performance v3.0.0 Issues and PRs related to version 3.0.0

Comments

@prudhvigodithi
Member

Is your feature request related to a problem? Please describe

Following up on LegacyBM25Similarity and @msfroh's comment in #17241 (comment): the class already includes a note advising users to use Lucene's BM25Similarity. Since work on the 3.0.0 release has begun, this is a good opportunity to make BM25Similarity the default for scoring.

The change moves from the legacy implementation to Lucene's current BM25Similarity, which provides a cleaner and more standardized way of scoring. Switching from LegacyBM25Similarity to BM25Similarity does not change the core ranking behavior in a way that would significantly affect search results.
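
To illustrate why the ranking is expected to hold, here is a minimal sketch of the two formulas. It assumes the only difference is the constant `(k1 + 1)` factor that LegacyBM25Similarity keeps in the numerator and BM25Similarity drops, and it ignores Lucene's lossy encoding of document length. The function names and the sample term statistics are made up for illustration.

```python
import math

def idf(doc_count, doc_freq):
    # BM25 inverse document frequency
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))

def bm25(tf, doc_len, avg_len, doc_count, doc_freq, k1=1.2, b=0.75):
    # Current form: no (k1 + 1) factor in the numerator
    norm = k1 * (1 - b + b * doc_len / avg_len)
    return idf(doc_count, doc_freq) * tf / (tf + norm)

def legacy_bm25(tf, doc_len, avg_len, doc_count, doc_freq, k1=1.2, b=0.75):
    # Legacy form: same score scaled by the constant (k1 + 1)
    return (k1 + 1) * bm25(tf, doc_len, avg_len, doc_count, doc_freq, k1, b)

def ranking(scores):
    # Document indices ordered from highest to lowest score
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)

# Hypothetical (term frequency, document length) pairs for one query term
docs = [(3, 120), (1, 40), (5, 300)]
current = [bm25(tf, dl, 150.0, 1000, 50) for tf, dl in docs]
legacy = [legacy_bm25(tf, dl, 150.0, 1000, 50) for tf, dl in docs]

print(ranking(current) == ranking(legacy))
```

Under these assumptions the absolute scores shrink by a factor of `k1 + 1` (2.2 with the defaults) while the order of documents stays the same. Anything that depends on absolute score values, such as a `min_score` threshold, would still need a second look.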

Describe the solution you'd like

Default to Lucene's BM25Similarity, while allowing users to choose LegacyBM25Similarity if required.
Example (opting back into LegacyBM25Similarity for an index):

curl -X PUT "http://localhost:9200/test-index" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index": {
      "similarity": {
        "default": {
          "type": "LegacyBM25",
          "k1": 1.2,
          "b": 0.75
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "content": {
        "type": "text"
      }
    }
  }
}
'
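
For comparison, the request body for either similarity could be generated from one helper. This is only a sketch: it assumes the default keeps the existing `BM25` type name and that `LegacyBM25` is the opt-in name shown in the request above. `similarity_settings` is a hypothetical helper, not part of any client library.

```python
import json

def similarity_settings(sim_type="BM25", k1=1.2, b=0.75):
    """Build an index-creation body that sets the index's default similarity."""
    return {
        "settings": {
            "index": {
                "similarity": {
                    "default": {"type": sim_type, "k1": k1, "b": b}
                }
            }
        },
        "mappings": {"properties": {"content": {"type": "text"}}},
    }

# Opt back into the legacy scoring (type name as proposed in this issue)
legacy_body = similarity_settings("LegacyBM25")
# Explicitly select the proposed default
default_body = similarity_settings()

print(json.dumps(legacy_body, indent=2))
```

The printed JSON can be passed as the `-d` payload of the same `PUT` request.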

Related component

Search:Performance

Describe alternatives you've considered

No response

Additional context

No response

@prudhvigodithi prudhvigodithi added enhancement Enhancement or improvement to existing feature or request untriaged labels Feb 11, 2025
@prudhvigodithi prudhvigodithi self-assigned this Feb 11, 2025
@prudhvigodithi prudhvigodithi changed the title Deprecate the LegacyBM25Similarity class Deprecate the LegacyBM25Similarity class and default to BM25Similarity Feb 11, 2025
@prudhvigodithi
Member Author

[Triage]
I have created a draft PR, #17306, with the required change. Please take a look.
Thanks.
@getsaurabh02 @msfroh

@prudhvigodithi prudhvigodithi added v3.0.0 Issues and PRs related to version 3.0.0 and removed untriaged labels Feb 11, 2025
@prudhvigodithi prudhvigodithi moved this from Todo to In Progress in Performance Roadmap Feb 12, 2025
@prudhvigodithi prudhvigodithi linked a pull request Feb 12, 2025 that will close this issue
3 tasks