[RFC] Migrating existing OpenSearch plugins from the main OpenSearch repository into their own dedicated repositories #17246

Open
pgtgrly opened this issue Feb 4, 2025 · 4 comments
Labels
Meta Meta issue, not directly linked to a PR untriaged

Comments

@pgtgrly
Contributor

pgtgrly commented Feb 4, 2025

Please describe the end goal of this project

Glossary

  • OpenSearch Server Version: The release version of the OpenSearch server without any plugins.
  • Plugin Version: The release version of a plugin; this is independent of the OpenSearch server version.
  • Managed Plugins: Plugins developed and released by the OpenSearch Project.
  • Core Repository Plugins: Managed Plugins that live in the OpenSearch core repository (e.g., analysis-nori).
  • Default Plugins: Managed Plugins that are bundled with the standard OpenSearch distribution (e.g., the security plugin).

Motivation

This document proposes migrating the existing plugins in the core OpenSearch repository into their own dedicated repositories.

Background

In addition to the background in https://github.com/opensearch-project/OpenSearch/issues/17127, there are a few more things we need to dive into before we discuss the problem statement.

Default and Core repository plugins

Currently the core OpenSearch repository contains 28 plugins:

  1. analysis-icu
  2. analysis-kuromoji
  3. analysis-nori
  4. analysis-phonenumber
  5. analysis-phonetic
  6. analysis-smartcn
  7. analysis-stempel
  8. analysis-ukrainian
  9. cache-ehcache
  10. crypto-kms
  11. discovery-azure-classic
  12. discovery-ec2
  13. discovery-gce
  14. identity-shiro
  15. ingest-attachment
  16. mapper-annotated-text
  17. mapper-murmur3
  18. mapper-size
  19. repository-azure
  20. repository-gcs
  21. repository-hdfs
  22. repository-s3
  23. store-smb
  24. telemetry-otel
  25. transport-grpc
  26. transport-nio
  27. transport-reactor-netty4
  28. workload-management

The standard OpenSearch distribution, by contrast, contains the following 23 plugins:

  1. opensearch-alerting
  2. opensearch-anomaly-detection
  3. opensearch-asynchronous-search
  4. opensearch-cross-cluster-replication
  5. opensearch-custom-codecs
  6. opensearch-flow-framework
  7. opensearch-geospatial
  8. opensearch-index-management
  9. opensearch-job-scheduler
  10. opensearch-knn
  11. opensearch-ml
  12. opensearch-neural-search
  13. opensearch-notifications
  14. opensearch-notifications-core
  15. opensearch-observability
  16. opensearch-performance-analyzer
  17. opensearch-reports-scheduler
  18. opensearch-security
  19. opensearch-security-analytics
  20. opensearch-skills
  21. opensearch-sql
  22. opensearch-system-templates
  23. query-insights

There is no overlap between the plugins in the standard distribution and the plugins in the core repository.

Most of the plugins in the core repository see no active changes after their initial development; they mostly receive dependency version bumps, which the core maintainers have to carry out.
These core plugins are not published separately to Maven, so consumers have to build them themselves to install them in their OpenSearch cluster. The core plugins also lack documentation.

Plugin artifact publication

Managed plugins are published in multiple repositories:

  • OpenSearch managed repositories: Repositories like artifacts.opensearch.org and ci.opensearch.org. However, these repositories cannot be browsed to find a plugin zip; one has to know the exact URL to download the plugin.
  • Maven: Only the default managed plugin zips [example] are published to Maven. Core repository plugin zips are not hosted there [example].

Previous discussion

reta published an RFC in 2021 (#1754) to migrate the plugins out of the core repository. He also built a PoC that moved the plugins out of the core repository into a single repository. Issue 7962 called out that moving the plugins out of the core repository would remove multiple third-party dependencies from the core repo.

Problem Statement

The current inclusion of 28 plugins within the OpenSearch core repository creates several operational challenges. These plugins lack active maintenance, proper documentation, and Maven publication, requiring users to build them manually. Updates to these plugins, primarily dependency version bumps, unnecessarily burden core maintainers due to their tight coupling with the core repository, which also prevents independent versioning and release management. Additionally, the presence of plugin-specific third-party dependencies bloats the core repository. This RFC proposes migrating these plugins to dedicated repositories to improve maintainability, documentation, and release processes while reducing complexity in the core repository.

Proposed Approach

We propose migrating all the plugins from the core repository into their own dedicated repositories. The migration process will begin by identifying owners for each plugin who will be responsible for their respective repositories. New dedicated repositories will be created following the naming convention 'opensearch-project/opensearch-' and will include standard components such as an Apache 2.0 LICENSE, Code of Conduct, CONTRIBUTING.md, GitHub Actions CI for tests and scanning, and comprehensive README documentation. The code migration will preserve Git history using git subtree split, replace the existing plugin folders in the core repository with redirecting READMEs, and update the official documentation accordingly. Each repository will have its own CI/CD infrastructure for running integration tests, enabling independent releases that align with core major versions, and will maintain its own plugin-specific documentation. This migration can proceed independently of the OpenSearch 3.0 release timeline, starting with a single plugin to establish a standard operating procedure for subsequent migrations.
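
To make the history-preserving step concrete, here is a minimal sketch of the git commands a migration script might run for one plugin. It only builds and prints the commands, it does not execute them. The plugin path plugins/analysis-nori, the branch name, and the destination URL are illustrative assumptions, not names decided in this RFC.

```python
import shlex


def subtree_split_plan(plugin_dir, new_repo_url, branch="split-plugin"):
    """Return the git commands (as argv lists) for extracting one plugin with its history.

    All arguments are placeholders chosen by the caller; nothing here is
    prescribed by the RFC beyond the use of `git subtree split`.
    """
    return [
        # Create a branch containing only the commits that touched plugin_dir,
        # rewritten so that plugin_dir becomes the root of the tree.
        ["git", "subtree", "split", f"--prefix={plugin_dir}", "-b", branch],
        # Push that branch to the new repository as its main branch.
        ["git", "push", new_repo_url, f"{branch}:main"],
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    plan = subtree_split_plan(
        "plugins/analysis-nori",
        "git@github.com:example-org/example-plugin-repo.git",
        branch="split-analysis-nori",
    )
    for argv in plan:
        print(shlex.join(argv))
```

Printing the plan first makes it easy to review the exact commands before running them from a clone of the core repository.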

Pros compared to the status quo:

Cons:

  • Will require additional head count to ensure maintenance and release of each plugin repository

Alternatively, we could migrate the plugins to a single separate repository, as reta proposed. This would still remove the bloat from the core repository.

FAQ

Q. What will the plugins folder in core repository be used for once these plugins are migrated out?
A. It can be used as a step in migrating features from plugins into core OpenSearch (like security). It can also contain examples and templates for custom plugins.

Q. How will this affect the release process for the core OpenSearch distribution?
A. This will not impact the standard distribution, as none of the core repo plugins are shipped with the standard distribution.

@pgtgrly pgtgrly added Meta Meta issue, not directly linked to a PR untriaged labels Feb 4, 2025
@andrross
Member

andrross commented Feb 4, 2025

Just one note...the existing issue #1754 was focused only on the repository-xxx plugins, of which there are 4 in this repo. This is the first issue I've seen suggesting that we move all plugins out of this repository.

Now don't get me wrong, I'm a huge fan of doing anything to reduce the amount of code being maintained here (see #16887), but there are a lot of plugins here that are essentially a single class. Examples include the analysis-xxx and mapper-xxx plugins. I suspect the small plugins with zero/minimal additional dependencies are likely not worth the overhead of a dedicated repository. I wonder if there is a middle ground where the heavyweight plugins could be moved out, thereby providing most of the benefits of reducing bloat and maintenance burdens in this repository, while also reducing the additional overhead. It might also be worthwhile to consider grouping plugins in separate repositories, such as putting repository-s3, discovery-ec2, and crypto-kms plugins together in a repo. All depend on the AWS SDK and maintainers are very likely to be the same individuals.
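
As a rough starting point for the grouping idea, the sketch below buckets the 28 core-repository plugins listed in the RFC by name prefix. This is only one possible heuristic: the AWS grouping suggested above (repository-s3, discovery-ec2, crypto-kms) cuts across prefixes, so grouping by shared dependency would need extra information that is not in the plugin names.

```python
from collections import defaultdict

# The 28 plugin names listed in the RFC for the core repository.
CORE_PLUGINS = [
    "analysis-icu", "analysis-kuromoji", "analysis-nori", "analysis-phonenumber",
    "analysis-phonetic", "analysis-smartcn", "analysis-stempel", "analysis-ukrainian",
    "cache-ehcache", "crypto-kms", "discovery-azure-classic", "discovery-ec2",
    "discovery-gce", "identity-shiro", "ingest-attachment", "mapper-annotated-text",
    "mapper-murmur3", "mapper-size", "repository-azure", "repository-gcs",
    "repository-hdfs", "repository-s3", "store-smb", "telemetry-otel",
    "transport-grpc", "transport-nio", "transport-reactor-netty4", "workload-management",
]


def group_by_prefix(names):
    """Group plugin names by the text before the first hyphen."""
    groups = defaultdict(list)
    for name in names:
        prefix = name.split("-", 1)[0]
        groups[prefix].append(name)
    return dict(groups)


if __name__ == "__main__":
    for prefix, members in sorted(group_by_prefix(CORE_PLUGINS).items()):
        print(f"{prefix}: {len(members)} plugin(s)")
```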

Q. How will this affect the release process for the core OpenSearch distribution?
A. This will not impact the standard distribution as none of the core repo plugins are shipped with standard distribution

I don't think this is accurate. The bundled plugins today are special in that they can be installed by name. If I extract a distribution tarball, I can then run ./bin/opensearch-plugin install analysis-icu to install the analysis-icu plugin. I think this will not work for a non-bundled plugin.
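
The distinction can be illustrated with a small sketch. It is not the actual opensearch-plugin logic, which is not shown in this thread. It assumes a hypothetical allow-list of names that resolve automatically, with everything else requiring an explicit URL.

```python
# Hypothetical allow-list: a few names from the core-repository list in the RFC.
# The real installer's list and lookup logic are assumptions here.
INSTALLABLE_BY_NAME = frozenset({"analysis-icu", "repository-s3", "discovery-ec2"})


def resolve_install_source(plugin_id, known_names=INSTALLABLE_BY_NAME):
    """Classify a plugin identifier as a bare name or an explicit URL."""
    if plugin_id in known_names:
        # A recognised name: the installer can work out the download location itself.
        return ("name", plugin_id)
    if "://" in plugin_id:
        # Anything that looks like a URL is downloaded from that location as given.
        return ("url", plugin_id)
    raise ValueError(
        f"unknown plugin {plugin_id!r}: not installable by name, and not a URL"
    )


if __name__ == "__main__":
    print(resolve_install_source("analysis-icu"))
    print(resolve_install_source("https://example.invalid/plugins/my-plugin.zip"))
```

Under this model, a plugin that leaves the allow-list after migration would fall into the error branch unless the user supplies a URL, which is the behaviour change being flagged.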

@peterzhuamazon
Member

Infra side this is doable but we do need to understand that:

  1. All new plugins would also need separate integTest onboarding to Jenkins, which already takes a lot of resources to run for the existing 23 plugins.
  2. We would not be able to install plugins directly with opensearch-plugin install repository-s3 if they are no longer core plugins, because non-core plugins currently have no mechanism for this. Users would need to manually download the zip and install it, or we would need to bundle them like the other 23 plugins.
  3. Supporting point 2 would require changing the default install URL logic in core and changing the folder structure of the existing bundled plugins, which is a pretty big breaking change I would say.
  4. Owners of these new plugins would need to participate in the release process from now on.
  5. There are plans to validate core plugin installations at one point from infra side: Validate native plugin installation opensearch-build#2859
  6. I agree with @andrross that we might be able to combine similar functioned plugins into a single repo or plugin scope to reduce overhead.

cc @getsaurabh02

Thanks.

@lukas-vlcek
Contributor

@peterzhuamazon Just a quick comment on point 2)
OpenSearch plugins can be published into maven https://opensearch.org/blog/opensearch-plugin-zips-now-in-maven-repo/ and since this update the groupId can be fully customized which means that opensearch-plugin utility could be extended to search over (pre-)configured MVN repositories as well (which might not be seen as a breaking change IMO).
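
A minimal sketch of what such a lookup might compute, assuming plugin zips follow the standard Maven repository layout (groupId dots become path segments, followed by artifactId, version, and the file name). The function names, the repository base URLs, and the coordinates are hypothetical. The opensearch-plugin utility is not claimed to work this way today.

```python
def maven_artifact_url(repo_base, group_id, artifact_id, version, extension="zip"):
    """Build the download URL for an artifact in a standard-layout Maven repository."""
    group_path = group_id.replace(".", "/")
    file_name = f"{artifact_id}-{version}.{extension}"
    return f"{repo_base.rstrip('/')}/{group_path}/{artifact_id}/{version}/{file_name}"


def candidate_urls(repo_bases, coordinates):
    """Expand 'groupId:artifactId:version' into one candidate URL per configured repository."""
    group_id, artifact_id, version = coordinates.split(":")
    return [
        maven_artifact_url(base, group_id, artifact_id, version)
        for base in repo_bases
    ]


if __name__ == "__main__":
    # Hypothetical pre-configured repositories and coordinates, for illustration only.
    repos = ["https://repo.example.org/maven2/", "https://mirror.example.net/releases"]
    for url in candidate_urls(repos, "org.example.plugin:my-plugin:1.0.0"):
        print(url)
```

An installer extended this way would try each candidate URL in order, so the order of the configured repositories matters.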

@peterzhuamazon
Member

@peterzhuamazon Just a quick comment on point 2) OpenSearch plugins can be published into maven https://opensearch.org/blog/opensearch-plugin-zips-now-in-maven-repo/ and since this update the groupId can be fully customized which means that opensearch-plugin utility could be extended to search over (pre-)configured MVN repositories as well (which might not be seen as a breaking change IMO).

Right, though we are currently still pulling from S3 instead of Maven for released core plugin installation.
We can definitely try to use Maven, but that requires each plugin repo to carefully maintain a corresponding build task.
Also, Maven is not as easy to manage as S3, so I do have some concerns on that front. cc @prudhvigodithi.
