refactor(ai-plugins): streamline AI Proxy streaming system & add compatibilty for existing "client SDKs" #12903

Merged
merged 24 commits into from
Apr 26, 2024

Conversation

tysoekong
Contributor

Summary

This PR supersedes:

by completely rewriting the AI Proxy streaming path, moving from an "exit early" approach to something far more robust that uses the existing body_filter transformer system.

This came out of testing with Customer G, where a '\r\n' string literal appearing in any streaming response frame would break the plugin. Because it no longer exits early, the streaming path is also consistent with other AI service calls and should be much more compatible with future AI roadmap functions.
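
To illustrate the kind of frame handling at stake, here is a minimal Lua sketch; it is not the code from this PR. It assumes the upstream sends server-sent events whose payload lines start with `data:`. The function name `parse_sse_frames` and the sample payloads are hypothetical. The sketch normalizes real CRLF line endings and then reads line by line, so an escaped `\r\n` sequence inside a JSON payload is left untouched:

```lua
-- Hypothetical helper (not part of Kong): collect the payload of every
-- "data:" line found in a chunk of server-sent events.
local function parse_sse_frames(chunk)
  local payloads = {}

  -- Normalize real CRLF line endings to LF. An escaped "\r\n" inside a JSON
  -- string is a backslash sequence, not a line ending, so it is not affected.
  -- The parentheses drop gsub's second return value (the match count).
  local normalized = (chunk:gsub("\r\n", "\n"))

  -- Walk the chunk one line at a time; a trailing "\n" is appended so the
  -- last line is always terminated.
  for line in (normalized .. "\n"):gmatch("(.-)\n") do
    -- Keep only payload lines, stripping the "data:" prefix and one
    -- optional whitespace character after it.
    local data = line:match("^data:%s?(.*)$")
    if data then
      payloads[#payloads + 1] = data
    end
  end

  return payloads
end
```

Each `data:` line is treated as its own payload here; joining multi-line events is left out to keep the sketch short.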

It also addresses all the review comments on the "Enable SDK compatibility" PR, which is now closed and superseded by this one.

Checklist

  • The Pull Request has tests
  • A changelog file has been created under changelog/unreleased/kong or skip-changelog label added on PR if changelog is unnecessary. README.md
  • There is a user-facing docs PR against https://github.com/Kong/docs.konghq.com - docs are coming once this is approved...

Issue reference

KAG-4126

@tysoekong tysoekong changed the title Feat/streamline ai proxy Streamline AI Proxy streaming system; Add compatibilty for existing "client SDKs" Apr 22, 2024
@tysoekong tysoekong marked this pull request as ready for review April 22, 2024 18:10
kong/llm/drivers/azure.lua (outdated, resolved)
Contributor

@hanshuebner hanshuebner left a comment

I need to spend some more time with this, but here are a couple of superficial suggestions.

--
-- @param path string the path to ensure is valid
-- @return string the newly-formatted valid path, or the original path if nothing changed
function _M.ensure_valid_path(path)
Contributor

In the context of kong.tools.utils, ensure_valid_path is not an appropriate name, as "path validity" is not a well defined concept. I would suggest start_with_slash or something that is more narrow and precise.

Contributor

It was like this because I was thinking to add a bunch more checks later, and the stencil function would make sense for others to use.

I have removed it for now. I can add it back in the next round of features if it's still needed.

Contributor

@hanshuebner hanshuebner Apr 25, 2024

still here, still not working for more than two slashes. use gsub

@hanshuebner It won't let me reply for some reason. This was a mistake and is no longer used, so I've removed it!
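
For readers following the thread, the `gsub` suggestion could look like the Lua sketch below. It rests on an assumption about the intended behavior (exactly one leading slash, however many the input has), and `start_with_slash` is simply the name proposed in the review, not a function that exists in Kong:

```lua
-- Hypothetical helper, named after the reviewer's suggestion.
-- Returns the path with exactly one leading slash.
local function start_with_slash(path)
  -- Prepend a slash, then collapse the whole run of leading slashes into
  -- one. The outer parentheses drop gsub's second return value (the count).
  return (("/" .. path):gsub("^/+", "/"))
end

print(start_with_slash("v1/chat"))     -- one slash is added
print(start_with_slash("///v1/chat"))  -- extra leading slashes are collapsed
```

Anchoring the pattern with `^` means only leading slashes are touched; repeated slashes later in the path are left alone.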

kong/tools/http.lua (outdated, resolved)
@kikito kikito requested a review from jschmid1 April 23, 2024 09:03
Contributor

@jschmid1 jschmid1 left a comment

1/n

first pass; submitting to provide early feedback

kong/llm/drivers/anthropic.lua (outdated, resolved)
kong/llm/drivers/azure.lua (resolved)
kong/llm/drivers/azure.lua (outdated, resolved)
kong/llm/drivers/azure.lua (outdated, resolved)
kong/llm/drivers/azure.lua (resolved)
@ttyS0e
Contributor

ttyS0e commented Apr 23, 2024

@hanshuebner @jschmid1 I've fixed everything in comments here

Contributor

@jschmid1 jschmid1 left a comment

2/n

I've had some things queued up

kong/llm/drivers/cohere.lua (outdated, resolved)
kong/llm/drivers/cohere.lua (outdated, resolved)
kong/llm/drivers/openai.lua (outdated, resolved)
kong/llm/drivers/shared.lua (resolved)
kong/llm/drivers/shared.lua (outdated, resolved)
kong/llm/init.lua (resolved)
kong/llm/init.lua (resolved)
kong/llm/init.lua (resolved)
kong/llm/drivers/azure.lua (resolved)
@jschmid1 jschmid1 changed the title Streamline AI Proxy streaming system; Add compatibilty for existing "client SDKs" refactor(ai-plugins): streamline AI Proxy streaming system & add compatibilty for existing "client SDKs" Apr 23, 2024
@ttyS0e
Contributor

ttyS0e commented Apr 24, 2024

@jschmid1 I've addressed all comments. The only thing I'm unsure about is the 'compatibility check' on the plugin conf schema for the new field "upstream_path".

tysoekong added a commit that referenced this pull request Apr 24, 2024
Contributor

@jschmid1 jschmid1 left a comment

It looks good now, but I feel a bit uneasy without testing against actual endpoints. I agree that using mocked endpoints is beneficial for per-commit testing, but we should also implement a daily or weekly test against real endpoints.

Testing with these providers helps uncover issues related to timing, unexpected responses, changes in their API, and more.

kong/llm/init.lua (resolved)
kong/llm/init.lua (resolved)
@jschmid1
Contributor

Actually, let's get the version compat code into this PR, as tests would break otherwise.
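
As a rough illustration of what a version compatibility check for a new field such as "upstream_path" can involve, here is a self-contained Lua sketch. It is not Kong's actual compat code: the function `strip_new_fields`, the rules table, the flat config shape, and the numeric version values are all hypothetical. The idea is only that fields introduced in a newer version are dropped before the config reaches an older node:

```lua
-- Hypothetical sketch: return a copy of a plugin config without the fields
-- that a node running an older version would not recognize.
--   conf           flat plugin configuration table
--   node_version   numeric version of the receiving node (assumed format)
--   removed_fields map of "introduced in version" -> list of field names
local function strip_new_fields(conf, node_version, removed_fields)
  -- Shallow copy so the caller's table is not modified.
  local out = {}
  for key, value in pairs(conf) do
    out[key] = value
  end

  for introduced_in, fields in pairs(removed_fields) do
    -- Nodes older than the introducing version cannot validate the field.
    if node_version < introduced_in then
      for _, name in ipairs(fields) do
        out[name] = nil
      end
    end
  end

  return out
end

-- Assumed example values; the real version number is not stated in this thread.
local rules = { [3007] = { "upstream_path" } }
local conf = { route_type = "llm/v1/chat", upstream_path = "/v1/custom" }

local for_old_node = strip_new_fields(conf, 3006, rules)
local for_new_node = strip_new_fields(conf, 3007, rules)
```

Returning a copy keeps the original config intact, so the same table can be sent to nodes of different versions.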

@tysoekong tysoekong force-pushed the feat/streamline_ai_proxy branch 2 times, most recently from 39d6c0a to b65a540 Compare April 24, 2024 15:03
@tysoekong
Contributor Author

@jschmid1 Please hold the merge for one second. Antoine has fixes that we should bring in here so that we don't need another EE cherry-pick.

@AntoineJac
Contributor

@jschmid1, OK, I think we are good to go. Thanks

@hanshuebner
Contributor

Are further changes planned in this PR? @jschmid1 is your approval still OK?

@AntoineJac
Contributor

@hanshuebner, this PR is ready.
@jschmid1 can you please confirm it is OK and merge? Thanks

@jschmid1
Contributor

Let me give it one last pass

@jschmid1 jschmid1 merged commit 3980a63 into master Apr 26, 2024
25 checks passed
@jschmid1 jschmid1 deleted the feat/streamline_ai_proxy branch April 26, 2024 11:59
@github-actions github-actions bot added the incomplete-cherry-pick A cherry-pick was incomplete and needs manual intervention label Apr 26, 2024
@AntoineJac
Contributor

@jschmid1 @ttyS0e can someone please cherry pick this one to Kong-EE?
Thanks

@ttyS0e
Contributor

ttyS0e commented Apr 26, 2024

I'll do it yep, I will handle all the EE work

@kikito kikito removed the incomplete-cherry-pick A cherry-pick was incomplete and needs manual intervention label Apr 30, 2024
@Kong Kong deleted a comment from team-gateway-bot Apr 30, 2024
locao pushed a commit that referenced this pull request Jun 21, 2024
…eaming system & add comp… (#8955)

* refactor(ai-plugins): streamline AI Proxy streaming system & add compatibilty for existing "client SDKs" (#12903)
* feat(ai-proxy): complete refactor of streaming subsystem

---------

Signed-off-by: Joshua Schmid <jaiks@posteo.de>
Co-authored-by: Antoine Jacquemin <ajacquem@gmail.com>
Co-authored-by: Joshua Schmid <jaiks@posteo.de>

6 participants