Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changed CatShardsAction to internal to allow non-admin users #17203

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

aasom143
Copy link
Contributor

@aasom143 aasom143 commented Jan 30, 2025

Description

With introduction of CatShardsAction, the _cat/shard call requires extra cluster permission cluster:monitor/shards for a non-admin user to call the API which was not required for versions prior to 2.17. We need to fix this to retain the old behaviour

(failed -{"root_cause":[{"type":"security_exception","reason":"no permissions for [cluster:monitor/shards] and User [name=regular, backend_roles=[], requestedTenant=null]"}],"type":"security_exception","reason":"no permissions for [cluster:monitor/shards] and User [name=regular, backend_roles=[], requestedTenant=null]"},"status":403}'})

Related Issues

Resolves #[#17199 ]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Contributor

❌ Gradle check result for 7441e4f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Member

@cwperks cwperks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should all users be able to perform this action? Is this exposed by a REST API?

There is a static cluster_monitor action group to provision access to actions like this: https://github.com/opensearch-project/security/blob/main/src/main/resources/static_config/static_action_groups.yml#L153

@aasom143
Copy link
Contributor Author

Why should all users be able to perform this action? Is this exposed by a REST API?

Customers can create non-admin users, but they are currently unable to perform cat shards actions due to this issue, which violates the behaviour prior to version 2.17. This change will restore the expected behaviour.

@cwperks
Copy link
Member

cwperks commented Jan 30, 2025

Why should all users be able to perform this action? Is this exposed by a REST API?

Customers can create non-admin users, but they are currently unable to perform cat shards actions due to this issue, which violates the behaviour prior to version 2.17. This change will restore the expected behaviour.

The user gets access to perform actions through roles mappings. You would need to ensure that one of the mapped roles covers this action.

This action should remain under the cluster:monitor/* category of action.

@aasom143
Copy link
Contributor Author

Why should all users be able to perform this action? Is this exposed by a REST API?

Customers can create non-admin users, but they are currently unable to perform cat shards actions due to this issue, which violates the behaviour prior to version 2.17. This change will restore the expected behaviour.

The user gets access to perform actions through roles mappings. You would need to ensure that one of the mapped roles covers this action.

This action should remain under the cluster:monitor/* category of action.

This was earlier exposed by REST endpoint and we don't want users to be breaking if they are already using it w/o permissions hence moving it to internal.

@shwetathareja
Copy link
Member

@aasom143 RestShardsAction calls TransportClusterStateAction which is "cluster:monitor/state" action. How are users able to call _cat/shards prior to OPenSearch 2.17 without permission to cluster:monitor/*?

@aasom143
Copy link
Contributor Author

@aasom143 RestShardsAction calls TransportClusterStateAction which is "cluster:monitor/state" action. How are users able to call _cat/shards prior to OPenSearch 2.17 without permission to cluster:monitor/*?

We had given cluster:monitor/state permissions to non-admin user due to which they were able to make the call. When i removed the permission cluster state action was failing. In 2.17 we have created internal transport action due to which we need make it internal so that non-admin user don't need to provide the permission explicitly.

Copy link
Member

@shwetathareja shwetathareja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving the change as this API (_cat/shards) didn't have explicit permissions earlier as well. The permission for individual sub API calls (_cluster/state and {index_name}/stats) that it makes are sufficient.

@shwetathareja shwetathareja added backport 2.x Backport to 2.x branch Cluster Manager labels Feb 3, 2025
Copy link
Contributor

github-actions bot commented Feb 3, 2025

❌ Gradle check result for 5ea6507: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented Feb 3, 2025

❌ Gradle check result for c81b1c6: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented Feb 3, 2025

❌ Gradle check result for a99c69f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@cwperks
Copy link
Member

cwperks commented Feb 3, 2025

@aasom143 @shwetathareja I am not following what has changed for this action and why it needs to be marked internal. Can you elaborate on the reason for this change?

Signed-off-by: Somesh Gupta <guptasom@amazon.com>
@aasom143
Copy link
Contributor Author

aasom143 commented Feb 3, 2025

@aasom143 @shwetathareja I am not following what has changed for this action and why it needs to be marked internal. Can you elaborate on the reason for this change?

In version 2.17, we introduced the Transport action for cat shards, which admin users are permitted to call, while non-admin users do not have this permission. To address this, we are making the cat shards action internal, allowing non-admin users to also make the call. It's important to note that the CatShardsAction was not available in version 2.15, it was introduced starting from version 2.17.

@cwperks cwperks dismissed their stale review February 3, 2025 18:52

Dismissing as the ClusterState check will be performed by the new transport action

Copy link
Contributor

github-actions bot commented Feb 3, 2025

✅ Gradle check result for 7b39dfe: SUCCESS

Copy link

codecov bot commented Feb 3, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.29%. Comparing base (bcf646d) to head (7b39dfe).

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #17203      +/-   ##
============================================
- Coverage     72.46%   72.29%   -0.17%     
+ Complexity    65782    65686      -96     
============================================
  Files          5318     5318              
  Lines        305674   305674              
  Branches      44349    44349              
============================================
- Hits         221514   221002     -512     
- Misses        66024    66551     +527     
+ Partials      18136    18121      -15     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@cwperks
Copy link
Member

cwperks commented Feb 3, 2025

I still think the cluster name should remain cluster:monitor/shards given that its a cluster action exposed by a REST API. internal should only be reserved through truly internal actions.

I would consider making a change in the security plugin to pass-through for this action...similar to how the security plugin treats internal requests.

Edit: Pass-through for internal requests is found here

@shwetathareja
Copy link
Member

I would consider making a change in the security plugin to pass-through for this action...similar to how the security plugin treats internal requests.

@cwperks : you are suggesting to bypass permission via security plugin for cluster:monitor/shards. I don't have major concerns with that, the only thing would be any community user which is not using opensearch security plugin may see some impact. We would need to call this out explicitly in documentation.

@cwperks
Copy link
Member

cwperks commented Feb 4, 2025

One thing I would consider is making this an intentional breaking change in 3.0.0 and then surround the internals of the transport action similar to this. When a _cat/shards call would be authorized, the user would need to have cluster:monitor/shards permission on one of the mapped roles.

If you go along the lines of the change in this PR, you can consider an exit early strategy before here. i.e.

if (request instanceof CatShardsRequest) {
    // This is a shortcut for CatShardsAction, privileges evaluation will be performed at a lower level
    return PrivilegesEvaluatorResponse.ok();
}

@cwperks : you are suggesting to bypass permission via security plugin for cluster:monitor/shards. I don't have major concerns with that, the only thing would be any community user which is not using opensearch security plugin may see some impact. We would need to call this out explicitly in documentation.

Without the security plugin there is no authorization, how would users w/o security be impacted?

@shikharj05
Copy link

@aasom143 If a new action is being added, it's recommended to have a permission associated with it as well. If this is changed to internal, this could potentially lead to a by-pass in future which is not ideal. I think it's better to document the change and add the required permission for clients using the API.

@shwetathareja
Copy link
Member

@cwperks : To add more context here. From user perspective, nothing has changed. It is internal code refactoring. Earlier all the code for _cat/shards API was directly in RestShardsAction class. Now, with the refactoring (to add cancellation support) the transport class was added for it. And, new transport class requires permission to be added. I am fine making it a breaking change in 3.0 but for 2.x user, they shouldn't need to define any extra permission and should be status quo for them.

Without the security plugin there is no authorization, how would users w/o security be impacted?

Users could have their own security plugin implementation but they would not have handled explicit permission for cluster:monitor/shards action as it never existed.

@cwperks
Copy link
Member

cwperks commented Feb 6, 2025

@cwperks : To add more context here. From user perspective, nothing has changed. It is internal code refactoring. Earlier all the code for _cat/shards API was directly in RestShardsAction class. Now, with the refactoring (to add cancellation support) the transport class was added for it. And, new transport class requires permission to be added. I am fine making it a breaking change in 3.0 but for 2.x user, they shouldn't need to define any extra permission and should be status quo for them.

Understood, I see the regression. I'm coming at this from the perspective of how I think it should ideally be because on the surface this looks like a hack.

In general, any RestAction is typically tied 1-to-1 with a TransportAction. The transport action has a name and that is what's authorized. Its basically the same as saying, this user has access to perform the _cat/_shards API. In practice, user's are typically assigned access to a bundle of APIs through an action group (for example cluster_monitor).

internal: should be reserved for actions that are not exposed by a REST API and only called while handling a separate transport action.

In this case, there was a new transport action created which introduced the regression and broke prior behavior. Adding internal resolves the issue, but it breaks with the convention of naming the transport actions with cluster: or indices: if they are directly exposed by a REST API.

If you want to get the previous behavior we can certainly do this for the 2.x line, but I wonder if we should take the opportunity in 3.0.0 to say that you now must also need the cluster:monitor/shards permission.

For the internals of the transport action, you can wrap them in try (ThreadContext.StoredContext ctx = threadContext.stashContext()) { ... } to bypass any authz for any children transport actions (i.e. the cluster state actions) that are executed as part of the doExecute of TransportCatShardsAction. Sample reference PR: #17250

Users could have their own security plugin implementation but they would not have handled explicit permission for cluster:monitor/shards action as it never existed.

Its part of the implementation of the security plugin where it opts to skip authz for transport actions that are prefixed with internal:

Leave it up to any security plugin impl to decide how actions are authorized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch Cluster Manager
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

6 participants