Support ranger plugin #296

Closed
wants to merge 2 commits into from

zzzzming95
Contributor

@zzzzming95 zzzzming95 commented Sep 28, 2023

📝 Description

Overview

Apache Ranger is a framework to enable, monitor, and manage comprehensive data security across the Hadoop platform.

Metadata-level authorization solutions are currently lacking in the industry. Apache Ranger provides an authorization plugin for HiveServer, but it offers limited support for hive-cli, spark-submit, and similar scenarios. Our solution reuses the Ranger HiveServer plugin inside waggle-dance to enforce permission control on the metadata side.

This solution has the following limitations:

  1. The waggle-dance Ranger authorization scheme works at table-level granularity only; column-level granularity is not supported.
  2. The scheme does not support Ranger's advanced features, such as row-level filtering and data masking. It only implements the permission check itself.
  3. A user who has alter, update, or drop permission on a table must also have select permission on it.
  4. Currently, the user and group are only obtained in a Kerberos environment (other methods would need to be added in RangerWrappingHMSHandler).

Implementation

We implement RangerWrappingHMSHandler in the same way as TokenWrappingHMSHandler. It intercepts each metastore API request, extracts the db and table fields, and then performs the Ranger permission check.
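The interception step can be illustrated with a minimal sketch. It is not the code from this PR: it assumes the wrapping is done with a JDK dynamic proxy, and `MetastoreApi`, `TableAuthorizer`, and `AuthorizingHandler` are hypothetical stand-ins for the metastore handler interface, the Ranger plugin call, and RangerWrappingHMSHandler. Mapping `get_table` to a "select" check is also an assumption.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical stand-in for the metastore handler interface (not the real Hive API).
interface MetastoreApi {
  String get_table(String dbName, String tableName);
}

// Hypothetical authorization hook; in the PR this role is played by the Ranger plugin.
interface TableAuthorizer {
  boolean isAllowed(String user, String db, String table, String operation);
}

// Wraps a delegate and checks permissions before forwarding selected calls.
final class AuthorizingHandler implements InvocationHandler {
  private final Object delegate;
  private final TableAuthorizer authorizer;
  private final String user;

  private AuthorizingHandler(Object delegate, TableAuthorizer authorizer, String user) {
    this.delegate = delegate;
    this.authorizer = authorizer;
    this.user = user;
  }

  // Builds a proxy that implements iface and routes every call through invoke().
  static <T> T wrap(Class<T> iface, T delegate, TableAuthorizer authorizer, String user) {
    Object proxy = Proxy.newProxyInstance(
        iface.getClassLoader(),
        new Class<?>[] {iface},
        new AuthorizingHandler(delegate, authorizer, user));
    return iface.cast(proxy);
  }

  @Override
  public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
    // Assumption: get_table carries (db, table) as its first two arguments
    // and is treated as a "select" operation.
    if ("get_table".equals(method.getName()) && args != null && args.length >= 2) {
      String db = (String) args[0];
      String table = (String) args[1];
      if (!authorizer.isAllowed(user, db, table, "select")) {
        throw new SecurityException(
            "User " + user + " has no select permission on " + db + "." + table);
      }
    }
    try {
      return method.invoke(delegate, args);
    } catch (InvocationTargetException e) {
      // Rethrow the delegate's own exception instead of the reflection wrapper.
      throw e.getCause();
    }
  }
}
```

A caller would obtain the guarded handler with `AuthorizingHandler.wrap(MetastoreApi.class, realHandler, authorizer, user)` and use it exactly like the original one; denied requests fail before they reach the delegate.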

A single Hive statement may issue multiple API requests (for example, a select statement issues several get_table requests), so we added a Guava cache that keeps a copy of the permission information in memory.
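The caching idea can be sketched as follows. This is a plain-Java stand-in with a fixed time-to-live, not the Guava cache used in the PR; `PermissionKey`, `PermissionCache`, the TTL value, and the injected clock are all hypothetical names and choices made for illustration.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

// Hypothetical cache key: one permission decision per user, table, and operation.
final class PermissionKey {
  final String user;
  final String db;
  final String table;
  final String operation;

  PermissionKey(String user, String db, String table, String operation) {
    this.user = user;
    this.db = db;
    this.table = table;
    this.operation = operation;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof PermissionKey)) {
      return false;
    }
    PermissionKey other = (PermissionKey) o;
    return user.equals(other.user) && db.equals(other.db)
        && table.equals(other.table) && operation.equals(other.operation);
  }

  @Override
  public int hashCode() {
    return Objects.hash(user, db, table, operation);
  }
}

// Caches permission decisions for ttlMillis so repeated requests skip the check.
final class PermissionCache {
  private static final class Entry {
    final boolean allowed;
    final long expiresAtMillis;

    Entry(boolean allowed, long expiresAtMillis) {
      this.allowed = allowed;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final ConcurrentHashMap<PermissionKey, Entry> entries = new ConcurrentHashMap<>();
  private final long ttlMillis;
  private final LongSupplier clock;              // injected so expiry can be tested
  private final Predicate<PermissionKey> loader; // the real (expensive) permission check

  PermissionCache(long ttlMillis, LongSupplier clock, Predicate<PermissionKey> loader) {
    this.ttlMillis = ttlMillis;
    this.clock = clock;
    this.loader = loader;
  }

  boolean isAllowed(PermissionKey key) {
    long now = clock.getAsLong();
    // Reuse a live entry; otherwise run the check once and store the result.
    Entry entry = entries.compute(key, (k, old) ->
        (old != null && old.expiresAtMillis > now)
            ? old
            : new Entry(loader.test(k), now + ttlMillis));
    return entry.allowed;
  }
}
```

With a cache like this, the several get_table requests behind one select statement trigger a single permission lookup. The trade-off is that a policy change only takes effect once the cached entry expires, so the TTL bounds how stale a decision can be.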

🔗 Related Issues

@zzzzming95 zzzzming95 requested a review from a team as a code owner September 28, 2023 08:15
@zzzzming95
Contributor Author

@patduin

Could you take a look at this PR? I will add documentation later. Thanks!

@massdosage
Contributor

massdosage commented Sep 28, 2023

Other than the small change to MetaStoreProxyServer it looks like all of this code can be completely separate. I wonder whether it would be worth moving this to a separate repository which would be a "Ranger plugin" for WaggleDance? This way WD isn't coupled to Ranger and only those who want to use Ranger can include the plugin jar file on the WD classpath. We could then refer to the plugin from the WD documentation so that those interested can find it.

@jmnunezizu

jmnunezizu commented Sep 29, 2023

> Other than the small change to MetaStoreProxyServer it looks like all of this code can be completely separate. I wonder whether it would be worth moving this to a separate repository which would be a "Ranger plugin" for WaggleDance? This way WD isn't coupled to Ranger and only those who want to use Ranger can include the plugin jar file on the WD classpath. We could then refer to the plugin from the WD documentation so that those interested can find it.

I agree. This is the approach we took with Circus Train and the Big Query plugin (see https://github.com/ExpediaGroup/circus-train-bigquery).

@jmnunezizu jmnunezizu closed this Sep 29, 2023
@zzzzming95
Contributor Author

Okay, I'll look into that later.

Thank you for the review, @massdosage @jmnunezizu.

3 participants