Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to specify signature_version=UNSIGNED using a string #2442

Closed
Ark-kun opened this issue Jul 18, 2021 · 8 comments
Closed

Add ability to specify signature_version=UNSIGNED using a string #2442

Ark-kun opened this issue Jul 18, 2021 · 8 comments
Labels
configuration feature-request This issue requests a feature. p3 This is a minor priority issue

Comments

@Ark-kun
Copy link

Ark-kun commented Jul 18, 2021

Describe the bug
It looks like boto3 does not support public buckets by default (why?), so everyone has to create a specially configured client:
boto3.client('s3', config=Config(signature_version=UNSIGNED))

The popular MLFlow library has the following client initialization code (simplified):

def create_client():
   return boto3.client('s3', config=Config(signature_version=os.environ.get("S3_SIGNATURE_VERSION", "s3v4")))

Boto Config and RequestSigner attribute signature_version has str type, but there does not seem to be any string that would map to the UNSIGNED python singleton object.

This makes accessing public MLFlow artifact buckets impossible.

Steps to reproduce

os.environ["S3_SIGNATURE_VERSION"] = "unsigned"
...
client = boto3.client('s3', config=Config(signature_version=os.environ.get("S3_SIGNATURE_VERSION", "s3v4")))

Get error:

UnknownSignatureVersionError: Unknown Signature Version: unsigned.

Expected behavior

I expect client = boto3.client('s3', config=Config(signature_version="unsigned")) to succeed and work as expected.

Logs
Full stack trace


/opt/conda/lib/python3.7/site-packages/mlflow/store/artifact/s3_artifact_repo.py in list_artifacts(self, path)
    113         paginator = s3_client.get_paginator("list_objects_v2")
    114         results = paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/")
--> 115         for result in results:
    116             # Subdirectories will be listed as "common prefixes" due to the way we made the request
    117             for obj in result.get("CommonPrefixes", []):

/opt/conda/lib/python3.7/site-packages/botocore/paginate.py in __iter__(self)
    253         self._inject_starting_params(current_kwargs)
    254         while True:
--> 255             response = self._make_request(current_kwargs)
    256             parsed = self._extract_parsed_response(response)
    257             if first_request:

/opt/conda/lib/python3.7/site-packages/botocore/paginate.py in _make_request(self, current_kwargs)
    330 
    331     def _make_request(self, current_kwargs):
--> 332         return self._method(**current_kwargs)
    333 
    334     def _extract_parsed_response(self, response):

/opt/conda/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    384                     "%s() only accepts keyword arguments." % py_operation_name)
    385             # The "self" in this scope is referring to the BaseClient.
--> 386             return self._make_api_call(operation_name, kwargs)
    387 
    388         _api_call.__name__ = str(py_operation_name)

/opt/conda/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    690         else:
    691             http, parsed_response = self._make_request(
--> 692                 operation_model, request_dict, request_context)
    693 
    694         self.meta.events.emit(

/opt/conda/lib/python3.7/site-packages/botocore/client.py in _make_request(self, operation_model, request_dict, request_context)
    709     def _make_request(self, operation_model, request_dict, request_context):
    710         try:
--> 711             return self._endpoint.make_request(operation_model, request_dict)
    712         except Exception as e:
    713             self.meta.events.emit(

/opt/conda/lib/python3.7/site-packages/botocore/endpoint.py in make_request(self, operation_model, request_dict)
    100         logger.debug("Making request for %s with params: %s",
    101                      operation_model, request_dict)
--> 102         return self._send_request(request_dict, operation_model)
    103 
    104     def create_request(self, params, operation_model=None):

/opt/conda/lib/python3.7/site-packages/botocore/endpoint.py in _send_request(self, request_dict, operation_model)
    130     def _send_request(self, request_dict, operation_model):
    131         attempts = 1
--> 132         request = self.create_request(request_dict, operation_model)
    133         context = request_dict['context']
    134         success_response, exception = self._get_response(

/opt/conda/lib/python3.7/site-packages/botocore/endpoint.py in create_request(self, params, operation_model)
    114                 op_name=operation_model.name)
    115             self._event_emitter.emit(event_name, request=request,
--> 116                                      operation_name=operation_model.name)
    117         prepared_request = self.prepare_request(request)
    118         return prepared_request

/opt/conda/lib/python3.7/site-packages/botocore/hooks.py in emit(self, event_name, **kwargs)
    354     def emit(self, event_name, **kwargs):
    355         aliased_event_name = self._alias_event_name(event_name)
--> 356         return self._emitter.emit(aliased_event_name, **kwargs)
    357 
    358     def emit_until_response(self, event_name, **kwargs):

/opt/conda/lib/python3.7/site-packages/botocore/hooks.py in emit(self, event_name, **kwargs)
    226                  handlers.
    227         """
--> 228         return self._emit(event_name, kwargs)
    229 
    230     def emit_until_response(self, event_name, **kwargs):

/opt/conda/lib/python3.7/site-packages/botocore/hooks.py in _emit(self, event_name, kwargs, stop_on_response)
    209         for handler in handlers_to_call:
    210             logger.debug('Event %s: calling handler %s', event_name, handler)
--> 211             response = handler(**kwargs)
    212             responses.append((handler, response))
    213             if stop_on_response and response is not None:

/opt/conda/lib/python3.7/site-packages/botocore/signers.py in handler(self, operation_name, request, **kwargs)
     88         # this method is invoked to sign the request.
     89         # Don't call this method directly.
---> 90         return self.sign(operation_name, request)
     91 
     92     def sign(self, operation_name, request, region_name=None,

/opt/conda/lib/python3.7/site-packages/botocore/signers.py in sign(self, operation_name, request, region_name, signing_type, expires_in, signing_name)
    158                         signature_version=signature_version)
    159                 else:
--> 160                     raise e
    161 
    162             auth.add_auth(request)

/opt/conda/lib/python3.7/site-packages/botocore/signers.py in sign(self, operation_name, request, region_name, signing_type, expires_in, signing_name)
    152                 kwargs['signing_name'] = signing_context['signing_name']
    153             try:
--> 154                 auth = self.get_auth_instance(**kwargs)
    155             except UnknownSignatureVersionError as e:
    156                 if signing_type != 'standard':

/opt/conda/lib/python3.7/site-packages/botocore/signers.py in get_auth_instance(self, signing_name, region_name, signature_version, **kwargs)
    225         if cls is None:
    226             raise UnknownSignatureVersionError(
--> 227                 signature_version=signature_version)
    228         # If there's no credentials provided (i.e credentials is None),
    229         # then we'll pass a value of "None" over to the auth classes,

UnknownSignatureVersionError: Unknown Signature Version: unsigned.
@Ark-kun Ark-kun added the needs-triage This issue or PR still needs to be triaged. label Jul 18, 2021
Ark-kun added a commit to Ark-kun/mlflow that referenced this issue Jul 18, 2021
Currently, listing MLFlow artifacts in public S3 buckets seems impossible (without getting AWS account and credentials).

There is a `botocore` issue which prevents fixing it by specifying `MLFLOW_EXPERIMENTAL_S3_SIGNATURE_VERSION="unsigned"`. See boto/botocore#2442

This PR fixes that issue.
Ark-kun added a commit to Ark-kun/mlflow that referenced this issue Jul 18, 2021
Currently, listing MLFlow artifacts in public S3 buckets seems impossible (without getting AWS account and credentials).

There is a `botocore` issue which prevents fixing it by specifying `MLFLOW_EXPERIMENTAL_S3_SIGNATURE_VERSION="unsigned"`. See boto/botocore#2442

This PR fixes that issue.

Signed-off-by: Alexey Volkov <alexey.volkov@ark-kun.com>
@stobrien89
Copy link

Hi @Ark-kun,

Thanks for reaching out! Boto3 does support public buckets with signed calls (as long as you specify the correct region), but like you pointed out, unsigned calls can be tricky when working with other utilities. For MLflow, it looks like this is just a matter of importing the UNSIGNED object, which you've addressed in your pull request— I'm not seeing any changes we're able to make here.

@stobrien89 stobrien89 added guidance Question that needs advice or information. and removed needs-triage This issue or PR still needs to be triaged. labels Jul 20, 2021
@stobrien89 stobrien89 self-assigned this Jul 20, 2021
@Ark-kun
Copy link
Author

Ark-kun commented Jul 28, 2021

Hello, @stobrien89

I've outlined the problem in the boto3 library and proposed a fix for it.

The problem is that, despite the library types that tell the user that only strings are supported, the "unsigned" string does not work.
The solution is pretty simple as well - make the Config class recognize the "unsigned" string.

I'll create a PR for this fix.

Ark-kun added a commit to Ark-kun/botocore that referenced this issue Jul 28, 2021
This fixes some incorrect type signatures and makes botocore easier to use in CLIs and other libraries that deal with non-Python input.

Fixes boto#2442
Ark-kun added a commit to Ark-kun/botocore that referenced this issue Jul 30, 2021
This fixes some incorrect type signatures and makes botocore easier to use in CLIs and other libraries that deal with non-Python input.

Fixes boto#2442
dbczumar added a commit to mlflow/mlflow that referenced this issue Aug 3, 2021
…4573)

* Make it possible to access artifacts in public S3 artifact buckets

Currently, listing MLFlow artifacts in public S3 buckets seems impossible (without getting AWS account and credentials).

There is a `botocore` issue which prevents fixing it by specifying `MLFLOW_EXPERIMENTAL_S3_SIGNATURE_VERSION="unsigned"`. See boto/botocore#2442

This PR fixes that issue.

Signed-off-by: Alexey Volkov <alexey.volkov@ark-kun.com>

* Format code

Signed-off-by: dbczumar <corey.zumar@databricks.com>

Co-authored-by: dbczumar <corey.zumar@databricks.com>
@stobrien89 stobrien89 added feature-request This issue requests a feature. and removed closed-for-staleness guidance Question that needs advice or information. labels Aug 4, 2021
@stobrien89
Copy link

Hi @Ark-kun,

Thanks for the clarification. I'll review with the team to get their thoughts.

@stobrien89 stobrien89 reopened this Aug 4, 2021
@Ark-kun
Copy link
Author

Ark-kun commented Aug 13, 2021

@stobrien89 Thank you. I've created a simple fix PR: #2448

@stobrien89 stobrien89 added needs-review This issue or pull request needs review from a core team member. and removed needs-review This issue or pull request needs review from a core team member. labels Aug 24, 2021
@stobrien89
Copy link

Hi @Ark-kun,

Thanks for your patience. I ran this by the team and they're not opposed to accepting a change like this at some point in the future. For now, we'd like to measure the customer impact of this by seeing how much traction this gets.

I don't have permission to change the issue title, but I'd suggest shortening it to "Ability to specify signature_version=UNSIGNED using a string" or something to that effect for better visibility.

@stobrien89 stobrien89 removed their assignment Jan 7, 2022
@RyanFitzSimmonsAK RyanFitzSimmonsAK added the p3 This is a minor priority issue label Nov 9, 2022
@tim-finnigan tim-finnigan changed the title Cannoot access public buckets in MLFlow due to inability to specify signature_version=UNSIGNED using a string Add ability to specify signature_version=UNSIGNED using a string Nov 22, 2022
@melugoyal
Copy link

melugoyal commented May 31, 2023

@stobrien89 i support this change, as instantiating an AWS airflow connection as UNSIGNED doesn't appear possible at the moment: https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/connections/aws.html#configuring-the-connection allows you to configure config_kwargs to pass arguments to botore.config.Config, and doing so with:

extra={
  "config_kwargs": {
    "signature_version": botocore.UNSIGNED,
  },
}

results in the error TypeError: Object of type UNSIGNED is not JSON serializable, since airflow later tries to serialize the entire extra struct to json

@tim-finnigan
Copy link
Contributor

Thanks for your patience here. I brought up your issue and PR once again with the team, and the consensus was that this they do not plan to do this, the reason being that we cannot support arbitrary strings here and instead rely on strictly defined enums. Also we cannot guarantee compatibility with third-party libraries like MLFlow.

@tim-finnigan tim-finnigan closed this as not planned Won't fix, can't repro, duplicate, stale Jul 2, 2024
Copy link

github-actions bot commented Jul 2, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
configuration feature-request This issue requests a feature. p3 This is a minor priority issue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants