fix: Optimize Stale Agent with GraphQL and Search API to resolve 429 Quota errors #3700
Conversation
Summary of Changes

Hello @ryanaiagent, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses the critical 429 quota errors hit by the stale agent workflow.
Response from ADK Triaging Agent: Hello @ryanaiagent, thank you for creating this PR! This PR is a bug fix. Could you please provide logs or a screenshot after the fix is applied? This information will help reviewers review your PR more efficiently. Thanks!

Response from ADK Triaging Agent: Hello @ryanaiagent, thank you for creating this PR! This PR is a bug fix. Could you please associate a GitHub issue with this PR? If there is no existing issue, could you please create one? In addition, could you please provide logs or a screenshot after the fix is applied to help reviewers better understand the fix? This information will help reviewers review your PR more efficiently. Thanks!
Code Review
This pull request is an excellent and comprehensive refactoring of the stale agent. The switch from inefficient, paginated REST API calls to a single, targeted GraphQL query per issue is a major performance win. Similarly, using the GitHub Search API for server-side filtering of old issues drastically reduces unnecessary processing. The introduction of asyncio for concurrent processing and urllib3 retries for API resilience makes the agent much more robust and efficient. The new logic to detect 'ghost edits' is a clever addition that improves the accuracy of the staleness detection. My review includes a few suggestions for improving maintainability and making some of the new logic more configurable, but overall this is a very strong set of changes.
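The retry behavior praised here can be sketched with `urllib3`'s `Retry` mounted on a `requests` session via `HTTPAdapter`. This is a minimal sketch; the exact retry counts and status list in the repo's `utils.py` are assumptions:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def make_github_session() -> requests.Session:
    """Build a requests session that retries HTTP 429/5xx with exponential backoff."""
    retry = Retry(
        total=5,                            # up to 5 retries per request
        backoff_factor=2,                   # sleeps roughly 2s, 4s, 8s, ... between retries
        status_forcelist=[429, 500, 502, 503],
        allowed_methods=["GET", "POST"],    # POST must be allowed for GraphQL calls
    )
    session = requests.Session()
    session.mount("https://", HTTPAdapter(max_retries=retry))
    return session
```

Every request issued through such a session transparently backs off on a 429 instead of crashing the workflow run.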
```python
logger.debug(f"#{issue_number}: Initializing runner and session.")

try:
  runner = InMemoryRunner(agent=root_agent, app_name=APP_NAME)
```
The InMemoryRunner is being instantiated inside process_single_issue, which means a new runner object is created for every single issue being processed. While InMemoryRunner is lightweight, it's more efficient to create it once outside the processing loop and reuse it for all issues.
For example:

```python
# In main()
runner = InMemoryRunner(agent=root_agent, app_name=APP_NAME)
tasks = [process_single_issue(runner, issue_num) for issue_num in chunk]

# In process_single_issue()
async def process_single_issue(runner: InMemoryRunner, issue_number: int) -> ...:
    # ...
    # runner = InMemoryRunner(...)  # REMOVE THIS
    session = await runner.session_service.create_session(...)
    # ...
```
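Independent of the runner-reuse suggestion, the chunked concurrent processing the PR describes can be sketched as a semaphore-bounded `asyncio.gather`. The worker body and the `CONCURRENCY_LIMIT` value mirror the PR description but are assumptions, not the repo's exact code:

```python
import asyncio

CONCURRENCY_LIMIT = 3  # mirrors the setting named in the PR description

async def process_single_issue(issue_number: int) -> str:
    # Stand-in for the real GraphQL fetch + LLM call.
    await asyncio.sleep(0)
    return f"#{issue_number} processed"

async def run_all(issue_numbers: list[int]) -> list[str]:
    sem = asyncio.Semaphore(CONCURRENCY_LIMIT)

    async def bounded(n: int) -> str:
        async with sem:  # at most CONCURRENCY_LIMIT issues in flight
            return await process_single_issue(n)

    # gather() preserves input order, so results line up with issue_numbers.
    return await asyncio.gather(*(bounded(n) for n in issue_numbers))

results = asyncio.run(run_all([101, 102, 103, 104]))
```

The semaphore keeps the number of in-flight LLM/API calls bounded, which is what prevents a burst of requests from tripping the 429 quota in the first place.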
Hi @xuanyang15, can you please review this? TL;DR: Fixed the 429 quota errors.
…Quota errors Merge #3700

### Description
This PR refactors the `adk_stale_agent` to address `429 RESOURCE_EXHAUSTED` errors encountered during workflow execution. The previous implementation was inefficient in fetching issue history (using pagination over the REST API) and lacked server-side filtering, causing excessive API calls and huge token consumption that breached Gemini API quotas.

The new implementation switches to a **GraphQL-first approach**, implements server-side filtering via the Search API, adds robust concurrency controls, and significantly improves code maintainability through modular refactoring.

### Root Cause of Failure
The previous workflow failed with the following error due to passing too much context to the LLM and processing too many irrelevant issues:

```text
google.genai.errors.ClientError: 429 RESOURCE_EXHAUSTED. Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_paid_tier_input_token_count
```

### Key Changes

#### 1. Optimization: REST → GraphQL (`agent.py`)
* **Old:** Fetched issue comments and timeline events using multiple paginated REST API calls (`/timeline`).
* **New:** Implemented `get_issue_state` using a single **GraphQL** query. This fetches comments, `userContentEdits`, and specific timeline events (Labels, Renames) in one network request.
* **Refactoring:** The complex analysis logic has been decomposed into focused helper functions (`_fetch_graphql_data`, `_build_history_timeline`, `_replay_history_to_find_state`) for better readability and testing.
* **Configurable:** Added `GRAPHQL_COMMENT_LIMIT` and `GRAPHQL_TIMELINE_LIMIT` settings to tune context depth.
* **Impact:** Drastically reduces the data payload size and eliminates multiple API round-trips, significantly lowering the token count sent to the LLM.

#### 2. Optimization: Server-Side Filtering (`utils.py`)
* **Old:** Fetched *all* open issues via REST and filtered them in Python memory.
* **New:** Uses the GitHub Search API (`get_old_open_issue_numbers`) with `created:<DATE` syntax.
* **Impact:** Only fetches issue numbers that actually meet the age threshold, preventing the agent from wasting cycles and tokens on brand-new issues.

#### 3. Concurrency & Rate Limiting (`main.py` & `settings.py`)
* **Old:** Sequential execution loop.
* **New:** Implemented `asyncio.gather` with a configurable `CONCURRENCY_LIMIT` (set to 3).
* **New:** Added `urllib3` retry strategies (exponential backoff) in `utils.py` to handle GitHub API rate limits (HTTP 429) gracefully.

#### 4. Logic Improvements ("Ghost Edits")
* **New Feature:** The agent now detects "Ghost Edits" (where an author updates the issue description without posting a new comment).
* **Action:** If a silent edit is detected on a stale candidate, the agent now alerts maintainers instead of marking it stale, preventing false positives.

### File Comparison Summary

| File | Change |
| :--- | :--- |
| `main.py` | Switched from `InMemoryRunner` loop to `asyncio` chunked processing. Added execution timing and API usage logging. |
| `agent.py` | Replaced REST logic with GraphQL query. Added logic to handle silent body edits. Decomposed giant `get_issue_state` into helper functions with docstrings. Added `_format_days` helper. |
| `utils.py` | Added `HTTPAdapter` with retries. Added `get_old_open_issue_numbers` using Search API. |
| `settings.py` | Removed `ISSUES_PER_RUN`; added configuration for `CONCURRENCY_LIMIT`, `SLEEP_BETWEEN_CHUNKS`, and GraphQL limits. |
| `PROMPT_INSTRUCTIONS.txt` | Simplified decision tree; removed date calculation responsibility from LLM. |

### Verification
The new logic minimizes token usage by offloading date calculations to Python and strictly limiting the context passed to the LLM to semantic intent analysis (e.g., "Is this a question?").
* **Metric Check:** The workflow now tracks API calls per issue to ensure we stay within limits.
* **Safety:** Silent edits by users now correctly reset the "Stale" timer.
* **Maintainability:** All complex logic is now isolated in typed helper functions with comprehensive docstrings.

Co-authored-by: Xuan Yang <xygoogle@google.com>
COPYBARA_INTEGRATE_REVIEW=#3700 from ryanaiagent:feat/improve-stale-agent 888064e
PiperOrigin-RevId: 838885530
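The "one GraphQL request per issue" idea can be sketched as follows. The connection names (`comments`, `userContentEdits`, `timelineItems`) are real fields of GitHub's GraphQL `Issue` object, but the exact query and limits in `agent.py` are not shown in this PR page, so treat this as an assumption-labeled sketch:

```python
# Sketch of a single GraphQL query covering comments, body edits, and
# label/rename timeline events in one round-trip. The limit constants
# mirror the settings named in the PR; values are assumptions.
GRAPHQL_COMMENT_LIMIT = 50
GRAPHQL_TIMELINE_LIMIT = 50

ISSUE_STATE_QUERY = f"""
query($owner: String!, $name: String!, $number: Int!) {{
  repository(owner: $owner, name: $name) {{
    issue(number: $number) {{
      userContentEdits(last: 5) {{ nodes {{ editedAt }} }}
      comments(last: {GRAPHQL_COMMENT_LIMIT}) {{
        nodes {{ author {{ login }} createdAt body }}
      }}
      timelineItems(
        last: {GRAPHQL_TIMELINE_LIMIT},
        itemTypes: [LABELED_EVENT, UNLABELED_EVENT, RENAMED_TITLE_EVENT]
      ) {{ nodes {{ __typename }} }}
    }}
  }}
}}
"""

def build_issue_state_payload(owner: str, name: str, number: int) -> dict:
    """Payload for a single POST to https://api.github.com/graphql."""
    return {
        "query": ISSUE_STATE_QUERY,
        "variables": {"owner": owner, "name": name, "number": number},
    }
```

One POST of this payload replaces the paginated `/timeline` REST calls, which is where the payload-size and round-trip savings come from.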
Thank you @ryanaiagent for your contribution! 🎉 Your changes have been successfully imported and merged via Copybara in commit cb19d07. Closing this PR as the changes are now in the main branch.
Testing Plan
I have verified these changes on my personal fork by manually triggering the workflow to ensure it handles API rate limits correctly and processes issues without crashing.
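The server-side filtering described in the change summary (open issues older than a threshold, via `created:<DATE`) can be sketched against GitHub's issue search endpoint. The function name follows the PR description; the pagination and header details here are assumptions:

```python
from datetime import datetime, timedelta, timezone

import requests

def build_search_query(repo: str, cutoff_date: str) -> str:
    """Search qualifier string: open issues created before cutoff_date (YYYY-MM-DD)."""
    return f"repo:{repo} is:issue is:open created:<{cutoff_date}"

def get_old_open_issue_numbers(repo: str, min_age_days: int, token: str) -> list[int]:
    """First page of issue numbers older than min_age_days, via the Search API."""
    cutoff = (datetime.now(timezone.utc) - timedelta(days=min_age_days)).date()
    resp = requests.get(
        "https://api.github.com/search/issues",
        params={"q": build_search_query(repo, cutoff.isoformat()), "per_page": 100},
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return [item["number"] for item in resp.json()["items"]]
```

Because the date comparison runs on GitHub's side, brand-new issues never reach the agent at all, instead of being fetched and discarded in Python.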