Introduce AzureOpenAI transcription support #902
Hi @piotrooo, FYI, the other day I tried to migrate to 1.0.0-beta.9. I fixed some compilation and other issues but got stuck on strange test failures (structured output stopped working), so I left it for when I have time. If you're interested, here is the branch: https://github.com/tzolov/spring-ai/tree/update_azure_openai_client_version
Sure @tzolov 👍 I'll look at this on Monday. Could you give me a hint about the failing tests?
@piotrooo, it seems like the structured output converters (e.g. AzureOpenAiChatModelIT's listOutputConverter, mapOutputConverter, beanOutputConverter) are not working after the upgrade.
@tzolov, it seems to be a bug in the newly introduced JSON serializer in the Azure SDK. Here is a referenced PR with a fix. It looks like a problem with content serialization and deserialization in Azure. I made a test:
Response:
Everything works correctly. Unfortunately, we need to wait for the fix.
@tzolov, it seems that everything works as expected. I've checked it in our Azure subscription on our internal models. Could you confirm it is working as expected?
@piotrooo thanks for the update.
Yes, for sure.
Hi @piotrooo, the Antora docs source is under: https://github.com/spring-projects/spring-ai/tree/main/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/transcriptions
Finally, you can add the page to the catalog in front of the OpenAI transcription:
@tzolov I've just added docs.
@tzolov I think I did everything. The code compiles for me.
Thank you @piotrooo. I could not find integration tests for the Azure transcription service, so I've tried to add some in my merge branch: https://github.com/tzolov/spring-ai/tree/gh-902-pr
But both ITs fail. Could you have a look, please?
@tzolov I've added the missing test you mentioned. I've also tested it on our internal models.
Thank you @piotrooo! Great stuff
Small test adjustments, rebased, squashed and merged at 0e97f9c
Motivation
Right now, Azure (Microsoft) is the biggest shareholder of OpenAI, and it puts a lot of effort into providing OpenAI
services. I think it is reasonable to have as many supported models as possible.
Description
Caution
This PR introduces breaking changes.
Classes from the org.springframework.ai.openai.metadata.audio.transcription package have been moved to the org.springframework.ai.audio.transcription package. The repeated part was moved to the core package.
The AzureOpenAiAudioTranscriptionModel has been added to the auto-configuration with the following properties. The spring.ai.azure.openai.audio.transcription prefix was introduced for properties. It also introduces options properties which cover all of them (see: AzureOpenAiAudioTranscriptionOptions).
The Azure SDK has been bumped to version 1.0.0-beta.9. This upgrade introduced a change in a JSON field: the prompt_annotations field was renamed to prompt_filter_results (ref: Azure/azure-rest-api-specs#25880).
Another significant change in the SDK was the replacement of jackson-databind with azure-json (ref: Azure/azure-sdk-for-java#39825). As a result, in the AzureOpenAiEmbeddingModel class, we cannot use ModelOptionsUtils, as it is based on the @JsonProperty annotation. I replaced this feature with manual assignment.
TODO
There are a couple of things left to do. I don't want to do them right now, to keep this PR easy to accept, but they should certainly be done after the PR is merged.
AudioTranscriptionMetadata for both OpenAI and Azure OpenAI with information from the VERBOSE_JSON response format (@tzolov what do you think?)
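For readers wondering what "manual assignment" in place of the @JsonProperty-based ModelOptionsUtils merge might look like, here is a minimal, self-contained sketch. It is not the code from this PR: EmbeddingOptions, EmbeddingRequest, firstNonNull and toRequest are hypothetical stand-ins, and the field names are assumptions chosen only for illustration. The idea is that each option is copied field by field, with runtime values taking precedence over defaults, so no annotation-driven JSON mapping is needed.

```java
import java.util.List;

public class ManualOptionsMerge {

    // Hypothetical options holder; the real Spring AI options class differs.
    static final class EmbeddingOptions {
        final String deploymentName;
        final String user;
        final Integer dimensions;

        EmbeddingOptions(String deploymentName, String user, Integer dimensions) {
            this.deploymentName = deploymentName;
            this.user = user;
            this.dimensions = dimensions;
        }
    }

    // Hypothetical stand-in for an SDK request type without Jackson annotations.
    static final class EmbeddingRequest {
        final List<String> input;
        String model;
        String user;
        Integer dimensions;

        EmbeddingRequest(List<String> input) {
            this.input = input;
        }
    }

    // Returns the preferred value unless it is null, otherwise the fallback.
    static <T> T firstNonNull(T preferred, T fallback) {
        return preferred != null ? preferred : fallback;
    }

    // Copies every option explicitly; runtime options override the defaults.
    static EmbeddingRequest toRequest(List<String> input, EmbeddingOptions defaults, EmbeddingOptions runtime) {
        EmbeddingOptions override = runtime != null ? runtime : new EmbeddingOptions(null, null, null);
        EmbeddingRequest request = new EmbeddingRequest(input);
        request.model = firstNonNull(override.deploymentName, defaults.deploymentName);
        request.user = firstNonNull(override.user, defaults.user);
        request.dimensions = firstNonNull(override.dimensions, defaults.dimensions);
        return request;
    }

    public static void main(String[] args) {
        EmbeddingOptions defaults = new EmbeddingOptions("default-deployment", "default-user", null);
        EmbeddingOptions runtime = new EmbeddingOptions(null, "runtime-user", 256);
        EmbeddingRequest request = toRequest(List.of("hello"), defaults, runtime);
        System.out.println(request.model + " " + request.user + " " + request.dimensions);
    }
}
```

The trade-off of this approach is that every new option needs its own assignment line, but the merge no longer depends on how the SDK serializes its types.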