[AI] Add hybrid inference support in GenerativeModelSession #16043

Draft
andrewheard wants to merge 6 commits into main from ah/ai-hybrid-session
@andrewheard
Contributor

WIP - not ready for review

Started adding support for hybrid (on-device and cloud) inference. Internally, this is implemented as an array of fallback models: one model session is tried, and if it fails, the next one in the array is tried. This will be publicly exposed as "prefer cloud" or "prefer on-device", which only affects the order of the models in the array. It could be expanded to other fallback strategies in the future if desired (e.g., Vertex AI --> Gemini Dev API, Gemini 3.1 --> Gemini 2.5) to handle cases where backends or models are resource constrained.

Note: Fallback is not yet implemented for streaming, so a streaming request fails if the first-preference model fails.
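
To make the fallback idea above concrete, here is a rough sketch of how an ordered array of sessions might be tried in turn. Every name in it (`InferencePreference`, `ModelSession`, `StubSession`, `FallbackSession`, `SessionUnavailable`) is hypothetical and not taken from this PR or the SDK, and the calls are synchronous only to keep the example short; real sessions would presumably be `async`.

```swift
// Hypothetical names throughout; none of this is the SDK's actual API.

/// Which backend to try first. Only the order of the array changes.
enum InferencePreference {
  case preferCloud
  case preferOnDevice
}

/// Minimal stand-in for a model session (synchronous for brevity).
protocol ModelSession {
  var name: String { get }
  func respond(to prompt: String) throws -> String
}

/// Error thrown by a session that cannot serve the request.
struct SessionUnavailable: Error {
  let sessionName: String
}

/// Test double that either echoes the prompt or always fails.
struct StubSession: ModelSession {
  let name: String
  let isAvailable: Bool

  func respond(to prompt: String) throws -> String {
    guard isAvailable else { throw SessionUnavailable(sessionName: name) }
    return "\(name): \(prompt)"
  }
}

/// Tries each session in order and returns the first successful response.
struct FallbackSession {
  let sessions: [ModelSession]

  init(cloud: ModelSession, onDevice: ModelSession, preference: InferencePreference) {
    switch preference {
    case .preferCloud: sessions = [cloud, onDevice]
    case .preferOnDevice: sessions = [onDevice, cloud]
    }
  }

  func respond(to prompt: String) throws -> String {
    var lastError: Error = SessionUnavailable(sessionName: "none")
    for session in sessions {
      do {
        return try session.respond(to: prompt)
      } catch {
        // Remember the failure and move on to the next model in the array.
        lastError = error
      }
    }
    // Every session failed; surface the most recent error.
    throw lastError
  }
}
```

With this shape, a `.preferCloud` session whose cloud stub is unavailable would return the on-device stub's response, and other strategies (such as falling back between backends or model versions) would only need a different array.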

TODOs:

  • Add a public API
  • Add more integration tests and add unit tests
  • Add documentation
  • Add changelog entry
  • Tons of cleanup

#no-changelog

@gemini-code-assist
Contributor

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.