Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bugfix] - Changes in logic to delete share db #1706

Merged
merged 11 commits into from
Feb 4, 2025

Conversation

TejasRGitHub
Copy link
Contributor

@TejasRGitHub TejasRGitHub commented Nov 19, 2024

Feature or Bugfix

  • Bugfix

Detail

  • Additional locking when revoking a dataset with the same database shared across various datasets
  • Modifications in checks to see if the principal / database combo is still shared else where

Relates

Security

Please answer the questions below briefly where applicable, or write N/A. Based on
OWASP 10.

  • Does this PR introduce or modify any input fields or queries - this includes
    fetching data from storage outside the application (e.g. a database, an S3 bucket)? N/A
    • Is the input sanitized?
    • What precautions are you taking before deserializing the data you consume? N/A
    • Is injection prevented by parametrizing queries?
    • Have you ensured no eval or similar functions are used?
  • Does this PR introduce any functionality or component that requires authorization? N/A
    • How have you ensured it respects the existing AuthN/AuthZ mechanisms?
    • Are you logging failed auth attempts?
  • Are you using or adding any cryptographic features? N/A
    • Do you use a standard proven implementations?
    • Are the used keys controlled by the customer? Where are they stored?
  • Are you introducing any new policies/roles/users? N/A
    • Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@TejasRGitHub
Copy link
Contributor Author

TejasRGitHub commented Nov 19, 2024

Test cases:

Onboarded two dataset ( D1, D2 ) with the same common_db. Onboarded a consumption role ( C1 ). Environment has two roles ( Scientist, Engineers )

Scenerio 1:

  1. Created a share S1 with table from C1 to D1 ✅ ( Checked if the tables are accessible via Athena )
  2. Created a share S2 with table from C1 to D2 ✅ ( Checked if the tables are accessible via Athena )
  3. Revoked share S2 and checked if the common_db_shared still exists in the account ✅
  4. Queried tables from S1 to validate access and also checked if LF permissions for the principal exists ✅

Scenario 2:

  1. Repeated steps 1 and 2.
  2. Created another share S3 with D1 from environment role : Scientist ( i.e. creating share within the same environment )
  3. Checked if access to the table exists for this role ✅
  4. Now revoked S1 and check if the access in S3 and S2 are intact ✅ ( Checked that the permission to the principals exists and Athena query works)
  5. Revoked S3 and checked if the access to S2 is present ✅ ( Checked that the permission to the principals exists and Athena query works)
  6. Revoked S2 and checked now that the shared database is deleted ✅

TODO
Scenario 3:

  1. Repeat steps 1 and 2,
  2. Onboard another dataset (D3) in another enviironment where the database name is the same i.e. common_db
  3. Now make a share S1 from C1 to this dataset ( D3 ) and also make a share S2 with D2 or D1
  4. Revoke share items in share S2 and check if the common_shared db is still present ✅

Copy link
Contributor

@dlpzx dlpzx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes requested in https://github.com/data-dot-all/dataall/pull/1706/files#r1933588925 need to be fixed. I have not completed the full review because those changes might impact the whole implementation

@TejasRGitHub
Copy link
Contributor Author

The changes requested in https://github.com/data-dot-all/dataall/pull/1706/files#r1933588925 need to be fixed. I have not completed the full review because those changes might impact the whole implementation

@dlpzx , I have made changes to fix the structure of the code.

There are lot of file which have changed when I did make lint with the newer version of ruff due to which you are seeing a lot of file which are updated. I have marked the files which are a part of this PR to make it easy to review PR

# Conflicts:
#	tests/modules/s3_datasets/test_dataset_glossary.py
Copy link
Contributor

@dlpzx dlpzx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only some imports need to be removed. The rest looks good IMO.
I am testing the ruff changes in a separate PR. There are format changes that require testing

# Conflicts:
#	backend/dataall/modules/s3_datasets/aws/athena_table_client.py
#	backend/dataall/modules/s3_datasets_shares/aws/glue_client.py
#	tests/modules/s3_datasets_shares/tasks/test_s3_access_point_share_manager.py
@TejasRGitHub
Copy link
Contributor Author

TejasRGitHub commented Jan 30, 2025

Only some imports need to be removed. The rest looks good IMO. I am testing the ruff changes in a separate PR. There are format changes that require testing

Thanks @dlpzx , removed unused imports and also synced PR branch with. the main. now the ruff formatting changes should not appear.

Also I tested out the code changes in the PR with the same tests cases as mentioned over here - #1706 (comment)

@TejasRGitHub TejasRGitHub requested a review from dlpzx January 30, 2025 17:32
@TejasRGitHub
Copy link
Contributor Author

@dlpzx , I ran make lint and also unit test and everythings passed on my end. Acc to me, I think now this PR is also ready to be merged

Copy link
Contributor

@dlpzx dlpzx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left one minor comment, the rest looks good~

@TejasRGitHub
Copy link
Contributor Author

Left one minor comment, the rest looks good~

@dlpzx , Updated PR with correction

@dlpzx dlpzx requested a review from SofiaSazonova February 4, 2025 07:18
@dlpzx dlpzx merged commit 00fcc93 into data-dot-all:main Feb 4, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants