feat(ai-proxy): support Google Cloud Vertex #2119

Merged
merged 13 commits into from
Jun 9, 2025

Conversation

HecarimV (Contributor) commented on Apr 24, 2025

Ⅰ. Describe what this PR did

Adds support for the Google Cloud Vertex provider to ai-proxy.

Ⅱ. Does this pull request fix one issue?

fix: #1697

Ⅳ. Describe how to verify it

docker-compose.yaml

version: '3.7'
services:
  envoy:
    image: higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:v1.4.0-rc.1
    entrypoint: /usr/local/bin/envoy
    # Debug-level logging is enabled to make debugging easier
    command: -c /etc/envoy/envoy.yaml --component-log-level wasm:debug
    networks:
      - higress-net
    ports:
      - "10000:10000"
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml
      - ./plugin.wasm:/etc/envoy/plugin.wasm
networks:
  higress-net: {}

envoy.yaml

# File generated by hgctl. Modify as required.

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: vertex
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/plugin.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "provider": {
                                  "type": "vertex",                                
                                  "apiTokens": [
                                    "your-api-token"
                                  ],
                                  "geminiSafetySetting": {
                                    "HARM_CATEGORY_DANGEROUS_CONTENT": "OFF",
                                    "HARM_CATEGORY_HARASSMENT": "OFF",
                                    "HARM_CATEGORY_HATE_SPEECH": "OFF",
                                    "HARM_CATEGORY_SEXUALLY_EXPLICIT": "OFF"
                                  },       
                                  "vertexProjectId": "eastern-concord-457601-e9",
                                  "vertexRegion": "us-central1"
                                }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: vertex
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: vertex
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: us-central1-aiplatform.googleapis.com
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "us-central1-aiplatform.googleapis.com"

Test a non-streaming request:

curl -X POST 'http://localhost:10000/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gemini-2.0-flash-001",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "max_tokens": 100,
    "temperature": 0.3,
    "stream": false
}'

Test a streaming request:

curl -X POST 'http://localhost:10000/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gemini-2.0-flash-001",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": true
}'

Ⅴ. Special notes for reviews

Vertex API documentation: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference
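
For reviewers unfamiliar with Vertex, the sketch below shows how a region and project ID such as the `vertexRegion` and `vertexProjectId` values in the test configuration map onto the endpoint shape described in that reference. It is an illustration only: the function name `vertexEndpoint` and the project ID `my-project` are hypothetical, and the plugin's actual URL construction may differ.

```go
package main

import "fmt"

// vertexEndpoint builds the Vertex AI URL for a Google-published Gemini model.
// Hypothetical helper for illustration; not the plugin's own code.
func vertexEndpoint(region, projectID, model string, stream bool) string {
	action := "generateContent"
	if stream {
		// alt=sse asks Vertex to deliver the stream as server-sent events.
		action = "streamGenerateContent?alt=sse"
	}
	return fmt.Sprintf(
		"https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/publishers/google/models/%s:%s",
		region, projectID, region, model, action)
}

func main() {
	// "my-project" is a placeholder project ID.
	fmt.Println(vertexEndpoint("us-central1", "my-project", "gemini-2.0-flash-001", false))
	fmt.Println(vertexEndpoint("us-central1", "my-project", "gemini-2.0-flash-001", true))
}
```

Note that the region appears twice, once in the host name and once in the path, which is why the test cluster above points at `us-central1-aiplatform.googleapis.com`.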

codecov-commenter commented on Apr 24, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 46.06%. Comparing base (ef31e09) to head (d9226f4).
Report is 581 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #2119       +/-   ##
===========================================
+ Coverage   35.91%   46.06%   +10.15%     
===========================================
  Files          69       81       +12     
  Lines       11576    13010     +1434     
===========================================
+ Hits         4157     5993     +1836     
+ Misses       7104     6671      -433     
- Partials      315      346       +31     

see 78 files with indirect coverage changes


HecarimV requested a review from CH3CHO on May 26, 2025 07:34

CH3CHO (Collaborator) left a comment:

I fixed a few formatting issues; please pull the latest changes.

CH3CHO (Collaborator) left a comment:

There are two more things I would like to see improved:

  1. Update README.md to document the Google Vertex configuration.
  2. Waiting until the TTL has fully elapsed before refreshing the token means that, because of timing drift, some requests may be sent with an expired token. I suggest refreshing ahead of expiry by a lead time that users can configure, with a default of 1 minute.

| `vertexRegion` | string | Required | - | Google Cloud region (e.g. us-central1, europe-west4), used to build the Vertex API address |
| `vertexProjectId` | string | Required | - | Google Cloud project ID, identifying the target GCP project |
| `vertexAuthServiceName` | string | Required | - | Name of the service used for OAuth2 authentication; this service provides access to oauth2.googleapis.com |
| `vertexGeminiSafetySetting` | map of string | Optional | - | Content safety filter settings for Gemini models. |

Collaborator commented:

I don't see the newly added ahead option here...

CH3CHO (Collaborator) left a comment:

LGTM

CH3CHO merged commit d4e114b into alibaba:main on Jun 9, 2025
12 checks passed
daixijun pushed a commit to daixijun/higress that referenced this pull request Jun 10, 2025
Co-authored-by: Kent Dong <ch3cho@qq.com>
Colstuwjx (Contributor) commented on Jun 18, 2025

Hi @HecarimV, how should the vertexAuthServiceName field be configured? I set up a Vertex AI proxy with LiteLLM Proxy and did not see an equivalent field there. As far as I can tell, this field is not used for the model call itself but for DNS service discovery. Is there a similar example I could refer to?

Successfully merging this pull request may close these issues.

AI Proxy's provider support Google Cloud Vertex
4 participants