Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DeepSeek model client #702

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

mxsl-gr
Copy link
Contributor

@mxsl-gr mxsl-gr commented May 10, 2024

Hi, this PR is add DeepSeek model client and has passed unit testing.
I can provide my api_key if needed for testing

the PR content:

  • DeepSeek chat client
  • spring starter
  • unit test
  • chat client documents

For some reasons, products from OpenAI and others can't be directly used in Chinese Mainland.

DeepSeek is a strong, economical, and efficient Mixture of Experts (MoE) language model, with an API pricing of $0.14/$0.28 per 1 million tokens.

the link: DeepSeek

If they can supported, it will further aid spring-ai to promotion in Chinese Mainland.

If necessary, I can take care of subsequent maintenance since I'm currently using them.

The Moonshot PR is #596
The ZhiPuAI PR is #623
The MiniMax PR is #628

@markpollack markpollack self-assigned this May 21, 2024
@mxsl-gr mxsl-gr mentioned this pull request May 22, 2024
@GakkiNo1
Copy link

There are a lot of AI companies, and a lot of AI models, so the code can't be finished by creating clients all the time. Another thing is that deepseek has already said in the document that his API is fully compatible with openAI, that is to say, chatclient set the proxy address can call deepseek smoothly, i think the project maintainer needs to consider this problem: not to add a new model, but to add a new way to call,above are just my own thoughts。

@mxsl-gr
Copy link
Contributor Author

mxsl-gr commented May 23, 2024

There are a lot of AI companies, and a lot of AI models, so the code can't be finished by creating clients all the time. Another thing is that deepseek has already said in the document that his API is fully compatible with openAI, that is to say, chatclient set the proxy address can call deepseek smoothly, i think the project maintainer needs to consider this problem: not to add a new model, but to add a new way to call,above are just my own thoughts。

hi, I agree with your point. I have two thoughts.

if the companies provided an API standard compatible with OpenAI, in most cases, you can use it by just replacing the base url and API key, it's great.

while the OpenAI API design is not a standard, there are indeed many followers of its design, especially in mainland China. But this need the followers maintaining compatibility with the OpenAI API design. During the stage when their features are converging, it is predictable as they want to lower the barrier for potential users. But if they continue to develop, their paths may not necessarily converge.
A somewhat inappropriate example is Quarkus and Spring. Quarkus is relatively aggressive in native aspects, but they need to lower the barrier for developers using Spring. So, they provide Spring extensions to work in some cases to achieve low-barrier migration. However, if Spring API gets new designs, Quarkus may not necessarily follow them continuously.

Secondly, the two are not completely compatible. I encountered this with the n parameter. I have a scenario where I need to generate multiple options and then score them, but Deepseek currently does not support the n parameter. I think that if the API does not support it, it’s best not to include it in the configuration. This is something that cannot be achieved if using OpenAI implementation.

in conclusion, I think widely used model providers can still have independent implementations, but it doesn’t hurt to add a new way to call.

@markpollack markpollack added this to the 1.0.0-M2 milestone May 24, 2024
@mxsl-gr mxsl-gr force-pushed the feature/model-client-deepseek branch from 41505f6 to c34da10 Compare June 20, 2024 12:35
@mxsl-gr
Copy link
Contributor Author

mxsl-gr commented Jun 20, 2024

this PR has completed the refactoring, has been squashed and force pushed again.

@markpollack
Copy link
Member

In my experience at the model level, I agree that once you get into the details, "api compatability" is surface level. On the other hand, there is indeed a model "zoo" that can add to maintenance and carries risk of them disappearing.

I suspect that over time, once the market decides on winners and losers in the marketplace, there will be consolidation and some models in spring ai will go away. As a concrete example of that, Google's PAaM API is deprecated so we will be removing it (I'll create an issue).

@markpollack
Copy link
Member

markpollack commented Aug 20, 2024

I'm still not sure what to do here. Other opinions? @mxsl-gr have you tried to use Spring AI's OpenAI support to talk with this model?

@markpollack markpollack modified the milestones: 1.0.0-M2, 1.0.0-RC1 Aug 20, 2024
@mxsl-gr
Copy link
Contributor Author

mxsl-gr commented Aug 23, 2024

I'm still not sure what to do here. Other opinions? @mxsl-gr have you tried to use Spring AI's OpenAI support to talk with this model?

hi @markpollack , i have been busy with other work recently. we are using a customized version built on my local branch, which merges this branch with other strange features. for example, we have a multi-level model that can handle tasks at different levels to improve speed or reduce costs, as well as a client pool and polling mechanisms to enhance throughput and redundancy.
i just ran the OpenAI model unit tests using DeepSeek's base url and key, and they worked for most scenarios, i think with configuration the OpenAI model can be used in most cases. but there are some properties not supported by DeepSeek, such as the parameter n mentioned earlier. if we ignore those properties, so this PR can be closed.

however, considering that DeepSeek has a significant user base in Mainland China, now even surpassing ZhiPuAI and Minimax. i think having an independent implementation isn't necessarily a bad idea. if that's the case, i might need to adjust this branch, including the implementation of function calls and adding more unit tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants