Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added features targeted for developers writing custom clients #332

Closed
wants to merge 1 commit into from

Conversation

NickolausDS
Copy link
Contributor

These changes make it easier for developers to create their own
clients by adding simple built-in token storage, and a customized
auth flow with a built-in local server. A built-in client_id was
also added to reduce the barrier of entry required for a native auth
flow. Developers writing simple scripts can simply do:

from globus_sdk import native_auth
my_tokens = native_auth()

No functionality here is meant to replace existing SDK functionality.
Instead, the changes are intended to address commonly copied and
re-implemented code.

Adding @rpwagner and @ranantha by request.

To keep things as simple, only native_auth was added to __init__.py.

This does address #282, but these changes don't fully address address
the requirements there. Mainly, local_server functionality isn't exposed
for general use and templates can't be customized for branding.

I'll make notes about changes here I think warrant extra discussion.

``native_auth()``

Or to save tokens: ``native_auth(save_tokens=True)``
"""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The native_auth function was written from the perspective of making the native auth flow as simple as humanly possible. At its most basic, import a thing and then call it. Having this be a simple function call makes it very easy to include in a simple script or serve as a basic client, and this will be especially useful fo folks creating their own custom resource servers and need a small native app to test. It’s intended to serve a very narrow but very common use case, and wasn’t intended to support all features of the NativeAppAuthClient.


``server_port`` (*string*)
Port for the local server to use. No effect if ``no_local_server``
is set. MUST be specified in ``redirect_uri``. Defaults to 8890.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In retrospect, I didn't realize auth now supported port numbers not explicitly accounted for in the redirect_urls for the registered app. With that, server_hostname, and server_port options here should probably be removed.

@@ -44,11 +44,13 @@ class GlobusConfigParser(object):
"""

_GENERAL_CONF_SECTION = "general"
DEFAULT_WRITE_PATH = os.path.expanduser("~/.globus-native-apps.cfg")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a debate about either using the built-in config or specifying a separate one for all native apps. Having a separate config here insulates the SDK's config from other general native apps using this functionality, but it's a little ugly in that we now have two config files.

"""
return _get_parser()


Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_parser here was made public in order to support save_tokens() from a different module saving multiple values. An alternative is having separate token specific functionality here for setting those specific values.



def safe_print(message, *args, **kwargs):
get_safe_io().write(message, *args, **kwargs)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a user specifies the option for no_local_server=True, or no_browser=True, we need to print to a console so the user can copy/paste the auth code. In most or probably all cases, we could just print to stderr and be fine. But libraries like the globus-cli use @click.echo and this was intended as a way to respect printing output for libraries like those.

logger = logging.getLogger(__name__)


def save_tokens(tokens, config_section=None):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

save_tokens is intended to be used for arbitrary token storage, specifically by people registering their own scopes and wanting to easily save them for future use. This is most useful for people writing very simple clients or scripts. When somebody wants to write a custom client and save their own specific (non-token) related config-values, they should use their own config and not rely on the globus_sdk. A compromise there is to not expose this directly, and only allow it to be used indirectly by other SDK functionality.

These changes make it easier for developers to create their own
clients by adding simple built-in token storage, and a customized
auth flow with a built-in local server. A built-in client_id was
also added to reduce the barrier of entry required for creating
a new client. Developers writing simple scripts can simply do:

    from globus_sdk import native_auth
    my_tokens = native_auth(save_tokens=True)

No functionality here is meant to replace existing SDK functionality.
Instead, the changes are intended to address commonly copied and
re-implemented code.
@jimPruyne
Copy link
Contributor

I have not reviewed code or read all of Nick's updates, but I did want to register a +1 on the concept. Helpers for creating new Native App clients supporting token storage, etc. would be something I'd have used a couple times by now.

@sirosen
Copy link
Member

sirosen commented Nov 6, 2018

I find it difficult to put all of my thoughts in order on this. This is the short version:

This PR significantly expands the surface of the SDK and would constitute potentially significant maintenance burden. I'm really concerned about merging it in. My gut tells me that the right thing to do is to build it as a distinct python package entirely -- not part of the SDK.

That was the view I expressed in #282 as well.

I'm closing because I think the appropriate next step is for this to get its legs as a distinct, standalone lib (on pypi or just in a GitHub repo). Then we can talk about what use-cases it does or does not satisfy, change the interface in significant and breaking ways version-to-version -- a luxury we are not afforded here -- and potentially work our way up to something we'd be comfortable moving into the SDK.

This is the longer version:

Optics aside, I think that's the best path for supporting a collection of utilities which touch a number of things which we have very carefully avoided to date.

That also sets the stage for discussions about porting individual, well-isolated pieces into the SDK, and converting this second-order library to be a consumer of these additions. Those discussions can take place about individual components, and would be much more productive than trying to take this in all at once.

There are important considerations missing from this work so far. Most prominently, this only appears to implement a small part of what you need to actually manage user logins.
You need login and logout at a minimum. Session boost if you want to use HA endpoints. And you need to think about how the interface will look when consumed by multiple distinct applications. As written, two naive callers of this new function can see badly inconsistent behavior, as they "thrash" the set of scopes in use, and only function when they were the last application used for login.

This work needs to be expanded to handle more sophisticated -- but realistic -- cases before we can seriously consider it. And I'd like to get all of the maintainers involved in discussing what exactly it is we're trying to build, and why, before a bunch of code is written and we might have the unpleasant, but self-preservatory, duty of rejecting it.

Given that I haven't had time in the past calendar year (and this is not to my credit) to do major and important work on documentation for features we already have, I find it difficult to hop onboard with significant expansions to our feature-set beyond without some strong justification.

We need to be really careful what we sign up for here in part because the SDK has a limited dependency load and high portability as one of its core goals. People use the SDK to implement resource servers, not just to implement clients.

Trying to be portable, lightweight, only-what-you-need, is in tension with trying to build something "batteries included". Batteries-included things are great, but they need to be clearly delineated and separated from the core, underlying components out of which they are built.
If this does move into the SDK, I'd want us to discuss putting it in a distinct subpackage as a "porcelain" layer.

Also, trying to treat good documentation as an "after the fact" effort is more or less doomed, in my experience. We need to be working on quality documentation as we write it, or it will never get written.

@sirosen sirosen closed this Nov 6, 2018
@NickolausDS
Copy link
Contributor Author

Hmm, the things that strike me as most evident is not having a larger discussion beforehand about what problems we need to solve and what would be the best way to go about solving them.

Firstly, in response to this:

This PR significantly expands the surface of the SDK and would constitute potentially significant maintenance burden.

This is definitely a fair point and probably my biggest overall worry when submitting it. These changes do address some current limitations in the SDK, especially with regard to creating new standalone clients, like the Globus CLI. However, Internally the changes do expand the SDK past what it was originally intended to do (I'm thinking specifically of the config changes).

Moving back to a discussion about the use-cases, there are two major ones:

The first is developers that want to build their own resource servers. Building the resource server is straight forward, but what if the developer needs a command line client to talk to it? The SDK contains features for doing a native app flow with a registered Globus App, but it lacks token storage and using a local server with the native auth flow. Those aren't critical features but severely dampen the user experience if they're not present, so as a result in practice the features are simply copied from the Globus CLI.

The second use-case is developers that don't really need a full-blown SDK but need to write a quick python script to use globus services. This covers both folks that only want to use core Globus services such as transfer, but is also a necessity for anyone creating their own resource server (You do need to test it at some point, after all). The same two features from the first use case are needed here, but the cost of having them for a simple script are even higher.

Given those two use cases, as you suggest, a good solution would be implementing them in a separate package that users could import if they needed it. The only downside I can see is the need to support a separate package. Apart from that, I would argue these changes don't add any external dependencies and wouldn't impact developers that don't use them in any significant way. But again, as you suggest, putting them in a separate package then transferring them over if they're successful seems a great way to test their necessity before supporting them as part of the SDK.

Some other points:

There are important considerations missing from this work so far. Most prominently, this only appears to implement a small part of what you need to actually manage user logins.
You need login and logout at a minimum. Session boost if you want to use HA endpoints. And you need to think about how the interface will look when consumed by multiple distinct applications. As written, two naive callers of this new function can see badly inconsistent behavior, as they "thrash" the set of scopes in use, and only function when they were the last application used for login.

This does handle login, logout, specifying custom auth params, revoking tokens, logging in with multiple applications, and handling the edge-case where a user calls this multiple times in the same application with different requested scopes, which it handles by revoking the old ones and clears them out before new ones are saved. A huge criticism of the code is it doesn't expose this functionality for developers, specifically loading or revoking tokens. However, I think there's a lot of discussion there on what we want to expose to developers to allow them to do, and at what point we would want to prevent developers from attempting to save things through the SDK vs forcing them to create their own config files.

This work needs to be expanded to handle more sophisticated -- but realistic -- cases before we can seriously consider it. And I'd like to get all of the maintainers involved in discussing what exactly it is we're trying to build, and why, before a bunch of code is written and we might have the unpleasant, but self-preservatory, duty of rejecting it.

This code was written in response to the most common use cases people have for doing a native auth flow, and specifically code that is commonly copied out of the Globus cli. But you're right, the scope of these changes was too large not to have had a discussion about the specific use cases. And in addition, finding best solution in solving them while keeping in mind the cost it will take to maintain them in the future.

@sirosen
Copy link
Member

sirosen commented Nov 7, 2018

We can discuss more via other fora, but...

The SDK contains features for doing a native app flow with a registered Globus App, but it lacks token storage and using a local server with the native auth flow.

This is intentional.
I'm well aware that people want these, and many other features from the SDK.

In fact, #282 was opened in part so that I could clearly lay out the rationale for not including these features in the SDK and making sure that anyone who wanted to add them would at least start by creating their separate, standalone project.
It even includes some notes on a good potential design for such a feature.

The second use-case is developers that don't really need a full-blown SDK but need to write a quick python script to use globus services.

I think what you actually mean is "plastering over the cracks left by Globus Auth not supporting personal access tokens or developer API keys".
I am quite unhappy myself with the amount of nonsense we have to put new developers through just to get started, but I'm not yet convinced that this is the right place to solve it.

I'm definitely not convinced that enough thought has gone into the design of an interface that we can expose to users in our next SDK release without shooting ourselves in the foot.

The only downside I can see is the need to support a separate package.

When you write something, you have to support it. Otherwise, it's just more abandonware added to the web.

We've been very choosy about what features we include here, being sure to only include those that we're willing to help users troubleshoot on any supported platform, including pain points like old versions of Windows running old versions of python2.7.

This does handle login, logout, ...

Only one of these things is exported at the top level. And there's no narrative documentation to suggest or guide users through proper usage. On my first read, I saw native_auth (which, by the way, is very ambiguous in its meaning) as the only top-level export. So the only public-facing interface appears to be login.

Are users meant to import from globus_sdk.utils.local_server? Why or why not?

Consider: what does minimal, production-ready usage of your code look like? Does the documentation guide users to success?

A single function with a lot of params may be a very simple interface, but it's not a always very good one.

Some of what I'm missing with this is:

  • What is the interface that you're trying to design for users?
  • Will it work when used correctly and fail when used incorrectly?
  • Will users be able to understand what it's doing and know what side-effects it will have?
  • Which parts of the problem are you solving, and which parts of the problem are left for users to solve?

At a minimum, this change seems to be adding

  • The features documented as best served in an external lib in Feature Request: add AuthCode local webbrowser handler to SDK #282
  • A built-in client ID; I'm not sure this is a good idea
  • Read-write to a shared ini file for all client code
  • A wrapper around print?
  • A set of top-level functions, meant to be imported from globus_sdk and from globus_sdk.utils.token_storage

What people want more than anything:

def gen_on_refresh(resource_server, filename):
    def on_refresh(token_response):
        tok = token_response.by_resource_server[resource_server]
        with open(filename, 'w') as f:
            f.write(json.dumps(tok))
    return on_refresh

🎉 But that's not really good enough for everyone. There are a million and one variations on this theme depending on what users need out of it.

We very much on purpose made where and how you store tokens an external problem to be solved by the user, and how you get tokens to be an internal problem solved by the SDK.
That was something we learned by looking at and using oauth2client, which does not distinguish the responsibilities for these two activities. That decision weighs in at around 2000 LOC (for the in-memory implementation alone!) + a clunky and uncomfortable interface.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants