Add C++ API for Columnar Encryption #1183
Comments
cc @wgtmac , @williamhyun , @guiyanakuang , @stiga-huang |
I agree with you, @guiyanakuang . It will take a lot of effort. |
+1 @guiyanakuang @dongjoon-hyun. This is a large feature. We can follow the Java implementation and break it down into smaller work items first. |
Thank you, @wgtmac . |
I set milestone 1.9.0 as the goal. |
Hi @dongjoon-hyun, do we have a JIRA to track the work? |
@deshanxiao Not yet. @coderex2522 is working on it already. Will create JIRAs later. |
@coderex2522 Has the ORC file encryption and decryption functionality been implemented? |
@dongjoon-hyun @coderex2522 I want to try implementing the ORC encryption and decryption feature in C++. Are any other developers working on this feature? |
@coderex2522 Thank you for your support. I would like to know more about the design of ORC encryption. Do you have any materials you can share with me? Below is a document I put together by reading the Java source code. I'm not sure it is correct. |
@zxf216 If you are interested in this work, you are welcome to contribute an implementation of the C++ encryption module to the Apache ORC community! Afterward, @dongjoon-hyun @wgtmac @deshanxiao can help with a thorough review. |
Thanks @zxf216 for taking this up! AFAIK, there aren't many design materials apart from the spec and the Java code. Let me spend some time recalling the details. Feel free to discuss them in depth if you want. |
@wgtmac @coderex2522 My email is 524627843@qq.com. Can you share your email or WeChat with me? This way we can communicate more efficiently. |
You can find me via gangwu@apache.org |
This is removed from |
@dongjoon-hyun @wgtmac @coderex2522
Implement DecryptionInputStream class:
Used in StripeStream.cc:
Main function to read a file
Reading encrypted columns fails with this error: Caught exception in : Entry index out of range in StringDictionaryColumn
|
@zxf216 This error happens when reading dictionary-encoded string values. Please check the following:
|
@wgtmac I found that the Java version of ORC updates the IV during seeking. I will study this first. |
Solution @wgtmac
The changeIv method is defined as follows.
|
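For readers following the thread, here is a minimal sketch of what updating the IV on seek can look like for AES/CTR. It is not the changeIv code referred to above and not part of the ORC C++ API. It assumes the IV layout described in the ORC specification, where the trailing 8 bytes of the 16-byte IV hold a big-endian cipher block counter. The names resetIvForOffset, Iv, kIvLength, and kCounterLength are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout: 16-byte IV whose last 8 bytes are the block counter.
constexpr std::size_t kIvLength = 16;
constexpr std::size_t kCounterLength = 8;

using Iv = std::array<std::uint8_t, kIvLength>;

// Hypothetical helper: point the IV at the cipher block containing `offset`
// (a byte offset into the encrypted stream). Returns how many decrypted
// bytes at the start of that block must be discarded to land on `offset`.
std::uint64_t resetIvForOffset(Iv& iv, std::uint64_t offset,
                               std::uint64_t blockSize = 16) {
  assert(blockSize > 0);
  std::uint64_t blocks = offset / blockSize;
  const std::size_t counterStart = kIvLength - kCounterLength;

  // Clear the counter; the leading bytes of the IV stay untouched.
  for (std::size_t i = counterStart; i < kIvLength; ++i) {
    iv[i] = 0;
  }
  // Write the block index big-endian, starting from the last byte.
  for (std::size_t i = kIvLength; i > counterStart && blocks > 0; --i) {
    iv[i - 1] = static_cast<std::uint8_t>(blocks & 0xff);
    blocks >>= 8;
  }
  return offset % blockSize;
}
```

After a seek, a reader would reinitialize the cipher with the updated IV, decrypt from the start of that block, and drop the returned number of leading bytes. If the counter is left at its old value after a seek, the decrypted bytes are garbage, which is one plausible way for a dictionary index to end up out of range.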
I have already submitted the ORC decryption read path to StarRocks; if no issues come up, we will consider submitting it to the ORC side later. Users who need this feature can use PR StarRocks/starrocks#46809 as a reference. We have tested it internally, and it is now going through a gray-scale (staged) rollout in production. |
Great job! Looking forward to the contribution to the Apache ORC community. @zxf216 |
This is created based on the following dev@orc mail from June 11th.
https://lists.apache.org/thread/pkp6ffh9pqok7v618zxtox708mv26sz0