[hive] Fix inconsistent HiveServer2 user impersonation for all query types #4282
Problem Statement
When executing Hive queries from Hue, different query patterns showed inconsistent user attribution in Ranger audit logs: some queries were attributed to the hive service account, while others were attributed to the logged-in user (knoxui).
This inconsistency caused Ranger authorization failures when policies were configured for the logged-in user but not the service account.
Root Cause Analysis
The hive.server2.proxy.user configuration parameter was only set during session initialization in the open_session() method (line 711). However, HiveServer2 requires this parameter to be passed at the statement execution level via the confOverlay parameter to ensure consistent impersonation.
The _get_query_configuration() method only returned user-defined settings from the query object, without including the critical proxy user configuration needed for proper impersonation.
Solution
This PR ensures hive.server2.proxy.user is explicitly included in every statement execution by modifying three methods:
- _get_query_configuration(): Automatically adds the proxy user to the query configuration
- execute_statement(): Ensures the proxy user is set for synchronous executions
- execute_async_statement(): Ensures the proxy user is set for async executions
The fix checks that the proxy user isn't already present (to avoid overriding user-provided values) and only applies to Hive/Beeswax query servers.
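
The guard described above can be sketched in isolation. This is a minimal illustration, not the code from this PR: the helper name with_proxy_user, the IMPERSONATING_SERVERS set, and the server names in it are assumptions made for the example. Only the hive.server2.proxy.user key and the two rules (do not override an existing value, apply only to Hive/Beeswax servers) come from the description.

```python
PROXY_USER_KEY = "hive.server2.proxy.user"

# Assumption: identifiers for the query servers that should impersonate.
# The PR text only says "Hive/Beeswax"; the exact names are hypothetical.
IMPERSONATING_SERVERS = {"beeswax", "hive"}


def with_proxy_user(configuration, username, server_name):
    """Return a copy of `configuration` that carries the proxy user when appropriate.

    Hypothetical helper: it mirrors the behaviour described for the
    statement-level configuration overlay, nothing more.
    """
    conf = dict(configuration)  # copy so the caller's settings are not mutated

    # Only Hive/Beeswax servers get the proxy user; other engines are untouched.
    if server_name not in IMPERSONATING_SERVERS:
        return conf

    # Without a known user there is nothing to impersonate.
    if not username:
        return conf

    # setdefault keeps a value that the user supplied explicitly.
    conf.setdefault(PROXY_USER_KEY, username)
    return conf


if __name__ == "__main__":
    # User settings are preserved and the proxy user is added alongside them.
    print(with_proxy_user({"hive.execution.engine": "tez"}, "alice", "beeswax"))
    # An explicit proxy user is left as provided.
    print(with_proxy_user({PROXY_USER_KEY: "bob"}, "alice", "hive"))
```

Under these assumptions, both execute_statement() and execute_async_statement() would pass the returned dictionary as the statement's confOverlay, so synchronous and asynchronous paths share one rule instead of each re-implementing the check.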
Changes Made
File Modified: apps/beeswax/src/beeswax/server/hive_server2_lib.py
- execute_statement()
- execute_async_statement()
- _get_query_configuration()
Testing
Impact