-
Notifications
You must be signed in to change notification settings - Fork 5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Enhancement]: Handle GH token refresh inside runtime #6632
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this, @malhotra5 !
I'll defer to Xingyao on the actual implementation of the refresh, I just would like to stop a moment here to look at the architecture of this: this creates a dependency from the runtime
to the server
.
Can we do this differently?
To elaborate, we have contributions to different runtimes, an e2b runtime, a modal runtime. Those implementations depend on the abstract runtime interface, and they don't depend at all on the existence of a Web UI, and surely not on our particular Web UI. That's for a good reason, I believe, the runtime doesn't ever need to know it will be used from a web UI. To put it differently, |
Unrelated to the above concern:
I mean, if the token is invalid, then the agent will receive an error, right? It did for me when I deleted my tokens on github, and that's okay, the LLM will eventually ask for a valid token. |
Ah this is a really good point! I agree it should be done differently. On a separate note, can we consider the It is, however, mainly used in the server at the moment so maybe it is a server component after all - regardless I agree this PR should probably be implemented differently, thanks a bunch for the thoughtful feedback! |
Thanks to Rohit and Xingyao for the clarifications on this question: this solves a saas-only problem, where the GITHUB_TOKEN sent to the runtime from the hosted site is not the token set in the user account, but a temporary user access token as defined here. Stopping the agent would be a pretty bad experience, and it may be worse to have the LLM ask for a token so, so very often. |
PR has been updated so that
|
|
||
def __init__(self, user_id: str | None): | ||
def __init__(self, user_id: str | None = None, token: str | None = None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A little curious thing: we have made the token
stored in Settings a SecretStr, but in this diff I see it treated as a string.
I think we should fix the type hints, but I also expected to see some code picking the actual value from SecretStr
when needed (get_secret_value
), and I don't see any in the diff. Does the PR work as expected?🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah tested the PR and it works; the secret string is extracted and set as a string in the middleware
Because GithubService
is meant to be a standalone component, I feel it makes sense to keep the str typing as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idk, the point of SecretStr is to pass it around as much as possible as a SecretStr, until we really need the value. Otherwise, it can be accidentally dumped in logs or prints etc. 🤔 I might be missing something, sorry but could you please clarify why do the service and the middleware need to store a clear text?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea behind initializing GithubService
with a str
token is to enforce separation of concerns. The caller (in this case the server
component) should be responsible for decoding the SecretStr
into a str
when initializing a GithubService
. This way, GithubService
can strictly be a general purpose/single use object (very similar to how PyGithub
library behaves).
However, it seems we have a different problem as well - the server
component uses BOTH SecretStr
and str
representations for the Github token. For instance ConversationInitData
extends the Settings
class, allowing for a str
based token. Then, the str
representation is plumbed all way through to the runtime, starting from _create_new_conversation
until it gets set in the runtime env
It seems the way to rectify this would be to enforce SecretStr
is used in runtime
as well - this is likely very necessary but beyond the scope of this PR
On a separate note, the reason the middleware sets the token as str
in the request state is because all the /github
endpoints need to utilize the token. So ALL the endpoints need to fetch the SecretStr
from Settings
and decode it anyways, and the middleware helps deduplicate that logic
From what it seems, SecretStr
has only really been standardized completely for LLM_API_KEY
, but not GITHUB_TOKEN
-> EDIT: if this were to be standardized across our entire ecosystem (runtime
included) I think it would make sense for GithubService
to use a SecretStr
instead of str
on initialization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gh_client = GithubServiceImpl(user_id=self.user_id) | ||
token = await gh_client.get_latest_token() | ||
if token: | ||
export_cmd = CmdRunAction(f"export GITHUB_TOKEN='{token}'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example here, does this get and export a str or a SecretStr?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will get a str value
End-user friendly description of the problem this fixes or functionality that this introduces
Give a summary of what the PR does, explaining any non-trivial design decisions
The GH token can expire inside the runtime (either during the session or when returning to one after a long time). This PR does the following for remote runtimes only
CmdRunActions
that are attempting to use$GITHUB_TOKEN
export GITHUB_TOKEN={token}
where the token is fetched fromGitHubService
(this diff explains this exact behavior)This ensures that the latest Github token has been exported to the env BEFORE any Github API request is executed by the agent
Link of any specific issues this addresses
To run this PR locally, use the following command: