Image processing not working with Anthropic API through gateway. #652

Closed
hongyi-zhao opened this issue Oct 3, 2024 · 2 comments

@hongyi-zhao

Dear portkey-ai/gateway maintainers,

I'm encountering an issue using the Anthropic API's image-processing capabilities through your gateway. While direct API calls to Anthropic work correctly, requests routed through the gateway fail to process the image data.

Here are the details:

  1. Gateway version: The latest version installed via npx
  2. Endpoint used: http://localhost:8787/v1/chat/completions
  3. Model: claude-3-5-sonnet-20240620

The complete test script is as follows:

#!/bin/bash

#https://docs.anthropic.com/en/docs/build-with-claude/vision
IMAGE_URL="https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
IMAGE_MEDIA_TYPE="image/jpeg"
# Fetch the image and base64-encode it
IMAGE_BASE64=$(curl -s -x socks5h://127.0.0.1:16668 "$IMAGE_URL" | base64 -w 0)

# Start gateway as follows:
#$ proxychains-ng-allowed_countries npx @portkey-ai/gateway

curl "http://localhost:8787/v1/chat/completions"   \
-H "anthropic-version: 2023-06-01" \
-H "Content-Type: application/json"   \
-H 'x-portkey-config: {"provider":"anthropic","api_key":"sk-ant-xxx"}'   \
-d @- << EOF
{
  "model": "claude-3-5-sonnet-20240620",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "image",
          "source": {
            "type": "base64",
            "media_type": "$IMAGE_MEDIA_TYPE",
            "data": "$IMAGE_BASE64"
          }
        },
        {
          "type": "text",
          "text": "What is in the above image?"
        }
      ]
    }
  ]
}
EOF

The result of the above script is as follows:

{"id":"msg_01MtiR433bzLRRGz5hDYx9q3","object":"chat_completion","created":1727957242,"model":"claude-3-5-sonnet-20240620","provider":"anthropic","choices":[{"message":{"role":"assistant","content":"I apologize, but I don't see any image in our conversation. You haven't uploaded or shared an image with me yet. If you'd like me to analyze an image, please upload one and I'll be happy to describe what I see in it."},"index":0,"logprobs":null,"finish_reason":"end_turn"}],"usage":{"prompt_tokens":14,"completion_tokens":56,"total_tokens":70}}

As you can see, the response from the gateway indicates that no image was provided, despite the image data being included in the request body.

When making the same request directly to the Anthropic API (https://api.anthropic.com/v1/messages), it works correctly and the image is processed.

Could you please advise on whether image processing is supported through the gateway, and if so, what might be causing this issue? Any guidance on how to resolve this would be greatly appreciated.

Thank you for your time and assistance.

Best regards,
Zhao

github-actions bot added the triage label Oct 3, 2024
@VisargD
Collaborator

VisargD commented Oct 3, 2024

Hey @hongyi-zhao - The gateway provides a unified interface so that users can call all AI model providers with a single API signature. That unified signature is the OpenAI request schema, so the requests you make should be in an OpenAI-compatible format. Here is an example of using Anthropic vision models:

curl --location 'http://localhost:8787/v1/chat/completions' \
--header 'x-portkey-config: {"provider":"anthropic","api_key":"sk-ant-xxx"}'   \
--header 'Content-Type: application/json' \
--data '{
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant"
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url":"data:image/png;base64,iVBO..."
                    }
                }
            ]
        }
    ],
    "max_tokens": 20,
    "model": "claude-3-5-sonnet-20240620"
}'
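
To make the difference from the Anthropic-native request concrete, here is a minimal sketch of the mapping. It assumes only what the example above shows: the `media_type` and `data` fields of an Anthropic `image` block are combined into a single data URL inside an OpenAI-style `image_url` part. The function name `image_url_part` is hypothetical and is not part of the gateway.

```shell
#!/bin/bash
# Hypothetical helper (not a gateway API): wrap a media type and a
# base64 payload in an OpenAI-style "image_url" content part.
# Base64 text contains no characters that need JSON escaping.
image_url_part() {
  local media_type="$1" data="$2"
  printf '{"type":"image_url","image_url":{"url":"data:%s;base64,%s"}}\n' \
    "$media_type" "$data"
}

# "QUJD" is the base64 encoding of the three bytes "ABC", used as a stand-in payload.
image_url_part "image/jpeg" "QUJD"
# prints {"type":"image_url","image_url":{"url":"data:image/jpeg;base64,QUJD"}}
```

The output can be placed in the `content` array of a user message in place of the Anthropic-style `image` block.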

@hongyi-zhao
Author

Thank you for pointing this out. The following does the trick:

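# Assumes IMAGE_MEDIA_TYPE and IMAGE_BASE64 are set as in the first script,
# and that ANTHROPIC_API_KEY is exported in the environment.
curl "http://localhost:8787/v1/chat/completions" \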
-H "Content-Type: application/json" \
-H "x-portkey-config: {\"provider\":\"anthropic\",\"api_key\":\"$ANTHROPIC_API_KEY\"}" \
-d @- << EOF
{
  "model": "claude-3-5-sonnet-20240620",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant"
    },
    {
      "role": "user",
      "content": [
        {
          "type": "image_url",
          "image_url": {
            "url": "data:${IMAGE_MEDIA_TYPE};base64,${IMAGE_BASE64}"
          }
        },
        {
          "type": "text",
          "text": "What is in the above image?"
        }
      ]
    }
  ]
}
EOF
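
A portability note on the encoding step: `base64 -w 0` is a GNU coreutils option and may not be available on other systems. The following is a minimal sketch of a helper that builds the data URL from standard input and strips line breaks with `tr` instead. The name `data_url` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read raw bytes on stdin and print a data URL.
# Removing newlines with tr avoids relying on the GNU-only `-w 0` flag.
data_url() {
  local media_type="$1"
  printf 'data:%s;base64,%s\n' "$media_type" "$(base64 | tr -d '\n')"
}

# Small demonstration with a text payload.
printf 'hello' | data_url "text/plain"
# prints data:text/plain;base64,aGVsbG8=
```

With the variables from the first script, the image could be encoded as `curl -s "$IMAGE_URL" | data_url "$IMAGE_MEDIA_TYPE"` and the result used as the `url` value in the request body.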
