Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

disable s3 tagging JVM option #10029

Merged
merged 10 commits into from
Apr 17, 2024

Conversation

ErykKul
Copy link
Collaborator

@ErykKul ErykKul commented Oct 20, 2023

What this PR does / why we need it:
Some S3 implementations do not support tagging. This might prevent the upload-redirect from working, with an error on the x-amz-tagging header. This PR adds an JVM option that allows disabling the tagging.

Which issue(s) this PR closes:

Closes #10022

** Testing - do a direct upload and check in the browser to see that the generated URLs do not have the tag when the flag is true. Direct uploads should work on a store that supports tags regardless of the state of the flag. Unless we have a known S3 store that doesn't support tags available for QA, I think we can skip testing that case (presumably @ErykKul already has in developing this.)

Is there a release notes update needed for this change?:
Yes.

Additional documentation:
The new option is listed in the JVM settings for s3 store section.

@pdurbin pdurbin added the Size: 3 A percentage of a sprint. 2.1 hours. label Feb 28, 2024
Copy link
Member

@qqmyers qqmyers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I made minor edits to the notes/guides text and added a //STORAGE DRIVER header in the JvmSettings (since this is the first of presumably many such settings - they just haven't been converted yet.).
It's likely that this will be an s3 specific setting which may mean we'll want a dataverse.files.{id}.s3.disable-tagging setting (extra s3 scope) at some point, but I think it is overkill to change it at this point since we haven't done the other storage driver settings yet.

…S#10022

Conflicts:
src/main/java/edu/harvard/iq/dataverse/settings/JvmSettings.java
@pdurbin pdurbin self-assigned this Apr 16, 2024
@coveralls
Copy link

coveralls commented Apr 16, 2024

Coverage Status

coverage: 20.718% (+0.001%) from 20.717%
when pulling 22e5b1a on ErykKul:10022_upload_redirect_without_tagging
into 131e76c on IQSS:develop.

@pdurbin
Copy link
Member

pdurbin commented Apr 16, 2024

@ErykKul I haven't tested this yet but I worked on the docs a bit. I hope you approve. Among other things, I edited some existing docs and the paragraph a section header: https://dataverse-guide--10029.org.readthedocs.build/en/10029/developers/big-data-support.html#s3-tags

@ErykKul
Copy link
Collaborator Author

ErykKul commented Apr 17, 2024

@pdurbin It looks great! Thanks!

@pdurbin
Copy link
Member

pdurbin commented Apr 17, 2024

@ErykKul @qqmyers in e473d53 I pushed a doc change because I saw this error when I disabled tagging:

<Error>
  <Code>AccessDenied</Code>
  <Message>There were headers present in the request which were not signed</Message>
  <RequestId>25ff2bb0-13c7-420e-8ae6-3d92677e4bd9</RequestId>
  <HostId>9Gjjt1m+cjU4OPvX9O9/8RuvnG41MRb/18Oux2o5H5MY7ISNTlXN+Dz9IG62/ILVxhAGI0qyPfg=</HostId>
  <HeadersNotSigned>x-amz-tagging</HeadersNotSigned>
</Error>

I assume this was the right thing to do. Otherwise, please feel free to revert or further edit.

That is to say, of course, that I can at least tell that something is happening when I disable tagging.

I must say that I don't know how to do this: "do a direct upload and check in the browser to see that the generated URLs do not have the tag when the flag is true". If someone can help, I'm happy to try. Otherwise, I'm ready to merge as I don't think this pull request does any harm.

Something I am able to check is the URL generated from the mvn test -Dtest=S3AccessIT#testDirectUpload. The "before" URL with tagging enabled (JVM option unset) has "x-amz-tagging" in it:

url: http://localstack:4566/mybucket/10.5072/FK2/8HBFB9/18eed42827c-73a9c09a1ac1?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20240417T181223Z&X-Amz-SignedHeaders=host%3Bx-amz-tagging&X-Amz-Expires=3599&X-Amz-Credential=default%2F20240417%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Signature=8e31e272ba6a6f889858b693cc41319bc618e2df3d264ee743b752f974f318ca

The "after" URL (tagging disabled), does not:

url: http://localstack:4566/mybucket/10.5072/FK2/9H00VE/18eed4f74e4-6f2d4788a4c6?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20240417T182631Z&X-Amz-SignedHeaders=host&X-Amz-Expires=3599&X-Amz-Credential=default%2F20240417%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Signature=9dbd1a640b70c3d6970a49ea92eded04e6f57a4bf02947aae81ec1350996c8be

So, I've tested what I understand how to test. I don't actually see the "temp" tag anywhere. If you can explain how, I'm happy to keep digging. Otherwise, like I said, I'm ready to merge. Please let me know.

p.s. This is more than a 3. My bad.

@qqmyers
Copy link
Member

qqmyers commented Apr 17, 2024

What you saw is basically it - Dataverse's job is either to allow that tag to go through or not (so it's not the tag itself in the URL but having that header allowed as part of the call). The temp tag itself is on the object in S3 so you could use the aws cli to go look for it, but if you can't send the tag, it can't be there. Your doc update to not include the header when the flag is off is a good catch.

@pdurbin
Copy link
Member

pdurbin commented Apr 17, 2024

@qqmyers thanks, that's good enough for me. Merging!

@pdurbin pdurbin merged commit 4678330 into IQSS:develop Apr 17, 2024
11 of 12 checks passed
@pdurbin pdurbin added this to the 6.3 milestone Apr 17, 2024
@pdurbin pdurbin removed their assignment Apr 17, 2024
@ErykKul
Copy link
Collaborator Author

ErykKul commented Apr 18, 2024

@pdurbin, thanks for catching that and merging!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Size: 3 A percentage of a sprint. 2.1 hours.
Projects
Status: High priority
Development

Successfully merging this pull request may close these issues.

Support for s3 without tagging (e.g., s3 api on top of swift)
5 participants