npolshakova committed Feb 12, 2025
1 parent e1dc2c1 commit b1fe48b
Showing 3 changed files with 10 additions and 13 deletions.
23 changes: 10 additions & 13 deletions design/10494.md
@@ -55,7 +55,9 @@ additions to the existing API:
1. A new upstream subtype called AI
2. A new RoutePolicy sub-object called AI

These two types will allow us to get the new AI specific options without needing to create/add any new API objects
These two types will allow us to get the new AI specific options without needing to create/add any new API objects.

![ai_custom_resources](ai_custom_resources.png "AI APIs in relation to kgateway resources")

#### Upstream

@@ -71,7 +73,7 @@ Initially, three modes of authentication are supported:

- **Automatic Secret Integration:** kgateway reads the API key from a Kubernetes secret to handle authentication for requests on the specified path.
- **Inline API Key:** The API key is included directly in the Upstream definition.
- **Passthrough Mode:** The API key is passed through to the LLM provider as a header.
- **Passthrough Mode:** The API key in the downstream request header is passed through to the LLM provider.
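
The three authentication modes can be sketched as a single key-resolution step. This is a minimal illustration, not kgateway's implementation: `AuthConfig`, `resolve_api_key`, the mode names, and the use of an `authorization` header with a `Bearer` prefix are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class AuthConfig:
    """Hypothetical, simplified view of an Upstream's auth settings."""
    mode: str                          # "secret", "inline", or "passthrough"
    secret_name: Optional[str] = None  # used when mode == "secret"
    inline_key: Optional[str] = None   # used when mode == "inline"


def resolve_api_key(
    cfg: AuthConfig,
    secrets: Mapping[str, str],
    request_headers: Mapping[str, str],
) -> str:
    """Return the API key to send to the LLM provider for one request."""
    if cfg.mode == "secret":
        # Automatic secret integration: look the key up by secret name.
        return secrets[cfg.secret_name]
    if cfg.mode == "inline":
        # Inline API key: the key is stored in the Upstream itself.
        return cfg.inline_key
    if cfg.mode == "passthrough":
        # Passthrough: reuse the key the downstream client already sent.
        # The header name and "Bearer " prefix are assumptions.
        return request_headers.get("authorization", "").removeprefix("Bearer ")
    raise ValueError(f"unknown auth mode: {cfg.mode}")
```

In this sketch the secret store is a plain mapping; a real gateway would read the Kubernetes secret instead.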


For example, if a Kubernetes secret `openai-secret` is defined, the corresponding upstream would be:
@@ -232,7 +234,7 @@ to filter offensive content, prevent misuse, and ensure ethical and responsible
The AI APIs allow you to configure prompt guards to block unwanted requests to the LLM provider and mask sensitive data.
For example, you can use the AI API RoutePolicy to configure a prompt guard that parses requests sent to the LLM
provider to identify a regex pattern match. The AI gateway blocks any requests that contain the that pattern in the
provider to identify a regex pattern match. The AI gateway blocks any requests containing the pattern in the
request body. These requests are automatically denied with a custom response message. This matching is implemented
using [presidio](https://github.com/microsoft/presidio/tree/main).
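
The request-side behavior described above can be sketched as follows. The design states that matching is implemented with presidio; this sketch swaps in Python's standard `re` module purely to show the control flow. The function name `guard_request`, the phone-number pattern, and the rejection message are hypothetical.

```python
import re
from typing import Optional, Sequence

# Illustrative pattern only; it is not the definition of the PHONE_NUMBER builtin.
PHONE_PATTERN = r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"


def guard_request(
    body: str,
    patterns: Sequence[str],
    message: str = "Rejected due to inappropriate content",
) -> Optional[str]:
    """Return a custom rejection message if any pattern matches the body.

    Returns None when the request may be forwarded to the LLM provider.
    """
    for pattern in patterns:
        if re.search(pattern, body):
            return message
    return None
```

A caller would deny the request with the returned message when the result is not `None`, and forward the request unchanged otherwise.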
@@ -250,12 +252,6 @@ spec:
ai:
promptGuard:
request:
moderation:
openai:
authToken:
secretRef:
name: openai-secret
namespace: ai-test
regex:
builtins:
- PHONE_NUMBER
@@ -339,7 +335,7 @@ kgateway will support chat streaming, which allows the LLM to stream out tokens
request should receive a streamed response.
* Other providers, such as the Gemini and Vertex AI providers, change the path to determine
streaming, such as the `streamGenerateContent` segment of the path in the Gemini streaming endpoint
https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:streamGenerateContent?key=<key>.
`https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:streamGenerateContent?key=<key>`.
To prevent the path you defined in your HTTPRoute from being overwritten by this streaming path, you instead indicate
chat streaming for Gemini and Vertex AI by setting `ai.routeType=CHAT_STREAMING` in your RoutePolicy resource.
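
As a rough sketch of why the route type matters for these providers, the function below picks the path action from a route type rather than from the path defined in the HTTPRoute. The helper `gemini_path` and the non-streaming default value `"CHAT"` are assumptions for illustration; only `CHAT_STREAMING` and the `streamGenerateContent` segment come from the text above.

```python
def gemini_path(model: str, route_type: str = "CHAT") -> str:
    """Build a Gemini-style request path for the given model.

    The action segment after the colon selects streaming or
    non-streaming generation.
    """
    if route_type == "CHAT_STREAMING":
        action = "streamGenerateContent"
    else:
        action = "generateContent"
    return f"/v1beta/models/{model}:{action}"
```

For example, `gemini_path("gemini-1.5-flash-latest", "CHAT_STREAMING")` yields the streaming path segment shown in the endpoint above, without the host or the `key` query parameter.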

@@ -384,13 +380,14 @@ A user can define a set of functions as tools that the model has access to, and
based on the conversation history. The user then can execute those functions on the application side, and provide results
back to the model.

Here are some common use cases for Function Calling:
1. Fetching data:
Retrieve data from internal systems before sending final response, like checking the weather or number of vacation days in an HR system.
2. Taking action
2. Taking action:
Trigger actions based on the conversation, like scheduling meetings or initiating order returns.
3. Building multi-step workflows
3. Building multi-step workflows:
Execute multi-step workflows, like data extraction pipelines or content personalization.
4. Interacting with Application UIs
4. Interacting with Application UIs:
Use function calls to update the user interface based on user input, like rendering a pin on a map or navigating a website.
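
The application-side half of function calling (executing the function the model asked for and returning the result) might look like the sketch below. The message shapes, the `TOOLS` registry, and `get_weather` are hypothetical and simplified; real provider payloads carry more fields.

```python
import json
from typing import Callable, Dict


def get_weather(city: str) -> str:
    """Stand-in for a real lookup against an internal system."""
    return f"Sunny in {city}"


# Registry of functions the application exposes to the model as tools.
TOOLS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def run_tool_call(tool_call: dict) -> dict:
    """Execute one model-requested call and build the result message.

    `tool_call` is assumed to carry a function name and JSON-encoded arguments.
    """
    func = TOOLS[tool_call["name"]]
    args = json.loads(tool_call["arguments"])
    return {"role": "tool", "name": tool_call["name"], "content": func(**args)}
```

The application appends the returned message to the conversation history and sends it back to the model through the same route.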

Routing with user-invoked function calling should work without any additional configuration after setting up the initial
Binary file added design/resources/ai_custom_resources.png
Binary file modified design/resources/ai_request_flow.png