feat: add latency and token_usage info in ai-proxy access log #12042

Open, wants to merge 2 commits into base: master

Revolyssup
Contributor

Description

Fixes # (issue)
When traffic passes through the AI Gateway (also known as APISIX), users want the access log for AI requests and responses to record additional variables that AI proxy scenarios require.
Users explicitly asked for the following metadata:

  • Token: Token usage needs to be recorded separately for the request and response phases.
  • Latency: The time to first byte, that is, how long it takes for the first response to arrive after the AI Gateway proxies the request to the AI instance.
Example access log entry with the new fields:
127.0.0.1 - - [12/Mar/2025:16:24:57 +0530] 127.0.0.1:9080 "POST /anything HTTP/1.1" 429 349 1.088 "-" "curl/8.9.1" - - - "https://somerandom.com "ai_token_usage={\x22prompt_tokens\x22:0,\x22completion_tokens\x22:0,\x22total_tokens\x22:0}" "ai_time_to_first_byte(in seconds)=1.0880000591278""
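
For anyone consuming these logs downstream, here is a minimal Python sketch that pulls the two new fields out of a log line. It assumes the field layout of the sample entry above (`ai_token_usage=` followed by a flat JSON object with quotes escaped as `\x22`, and `ai_time_to_first_byte(in seconds)=` followed by a decimal number); that layout may still change during review. `parse_ai_fields` is a hypothetical helper name, not part of APISIX.

```python
import json
import re

# Sample line copied from the PR description. The \x22 sequences are
# escaped double quotes as they appear in the log output.
SAMPLE = r'''127.0.0.1 - - [12/Mar/2025:16:24:57 +0530] 127.0.0.1:9080 "POST /anything HTTP/1.1" 429 349 1.088 "-" "curl/8.9.1" - - - "https://somerandom.com "ai_token_usage={\x22prompt_tokens\x22:0,\x22completion_tokens\x22:0,\x22total_tokens\x22:0}" "ai_time_to_first_byte(in seconds)=1.0880000591278""'''

# Assumed layout: a flat (non-nested) JSON object after "ai_token_usage=".
TOKEN_RE = re.compile(r'ai_token_usage=(\{.*?\})')
TTFB_RE = re.compile(r'ai_time_to_first_byte\(in seconds\)=([0-9.]+)')


def parse_ai_fields(line):
    """Hypothetical helper: return (token_usage dict, ttfb seconds).

    Either element is None when the corresponding field is absent.
    """
    usage = None
    match = TOKEN_RE.search(line)
    if match:
        # Undo the \x22 escaping before decoding the JSON object.
        usage = json.loads(match.group(1).replace(r'\x22', '"'))

    ttfb = None
    match = TTFB_RE.search(line)
    if match:
        ttfb = float(match.group(1))

    return usage, ttfb


if __name__ == "__main__":
    usage, ttfb = parse_ai_fields(SAMPLE)
    print(usage, ttfb)
```

The non-greedy match stops at the first closing brace, so this sketch only works while the token usage object stays flat.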

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

dosubot (bot) added the labels size:M (This PR changes 30-99 lines, ignoring generated files.) and enhancement (New feature or request) on Mar 12, 2025