
BUG: ollama multimodal models don't support parallel requests yet #8531

Open
AAEE86 opened this issue Sep 18, 2024 · 2 comments
Labels
🐞 bug Something isn't working

Comments

@AAEE86
Contributor

AAEE86 commented Sep 18, 2024

Self Checks

  • This is only for bug reports; if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit issues in English, or they will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.8.2

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

ollama log:

time=2024-09-18T09:52:53.395Z level=WARN source=sched.go:137 msg="multimodal models don't support parallel requests yet"

Dify error:

Query or prefix prompt is too long, you can reduce the prefix prompt, or shrink the max token, or switch to a llm with a larger token limit size.

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

@dosubot dosubot bot added the 🐞 bug Something isn't working label Sep 18, 2024

dosubot bot commented Sep 18, 2024

I found some similar discussions and issues that might be relevant to your report:

  1. Adding a local ollama model error - This discussion is still open and was last updated on September 12, 2024 [1].
  2. An error occurred during credentials validation - This discussion is still open and was last updated on September 4, 2024 [2].
  3. May I ask, is the issue I mentioned a bug? How can it be resolved? - This discussion is still open and was last updated on April 24, 2024 [3].

Additionally, there was a closed issue related to parallelism.

For handling parallel requests, you can adjust the configuration options in Dify:

  • SERVER_WORKER_AMOUNT: Configure the number of API server workers.

    SERVER_WORKER_AMOUNT: <number_of_workers>
  • CELERY_WORKER_AMOUNT: Configure the number of Celery workers.

    CELERY_WORKER_AMOUNT: <number_of_workers>

These settings can help manage parallel processing more efficiently [5].
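
For example, in a self-hosted Docker deployment these are typically set in the .env file read by docker-compose. A minimal sketch, assuming the standard Dify Docker layout (the values are illustrative, not recommendations):

    # .env for the Dify docker-compose deployment
    SERVER_WORKER_AMOUNT=4    # Gunicorn workers for the API server
    CELERY_WORKER_AMOUNT=2    # Celery workers for background tasks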

For the 'Query or prefix prompt is too long' error, consider the following optimizations:

  1. Reduce the prefix prompt length.
  2. Adjust the max_tokens parameter to a smaller value.
  3. Use a model with a larger token limit, such as GPT-4.
  4. Optimize the prompt structure by separating context, pre-prompt, history, and query [6][7][8][9].
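
To see why items 1-3 help: the prompt (prefix, history, and query) plus the tokens reserved for the completion must fit within the model's context window. A rough budget check as a sketch (the 4096-token window, 512-token reservation, and function name are assumptions, not Dify internals):

    # Rough token-budget check (illustrative numbers)
    CONTEXT_WINDOW = 4096   # model's total context size in tokens
    MAX_TOKENS = 512        # tokens reserved for the completion

    def prompt_fits(prompt_tokens: int) -> bool:
        # The prompt and the reserved completion budget must
        # both fit inside the context window.
        return prompt_tokens + MAX_TOKENS <= CONTEXT_WINDOW

    print(prompt_fits(3800))  # False: 3800 + 512 > 4096 -> shrink the prompt
    print(prompt_fits(3500))  # True: 3500 + 512 <= 4096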

Regarding the current state of multimodal model support, Dify supports ASR models and rich-text models up to GPT-4 specifications. The documentation lists various model providers and their supported capabilities, including speech-to-text and text-to-speech functionalities [10][11].

To continue talking to Dosu, mention @dosu.

@AAEE86
Contributor Author

AAEE86 commented Sep 18, 2024

@dosu
Uploading pictures using a URL produces another error:
[ollama] Error: API request failed with status code 400: {"error":"illegal base64 data at input byte 5"}
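
The "illegal base64 data" error is consistent with a raw URL being passed where Ollama expects base64-encoded image bytes: Ollama's generate API takes images as a list of base64 strings, not URLs. A minimal sketch of encoding an image before calling the API directly (the model name, file path, and default localhost endpoint are assumptions):

    import base64
    import json
    import urllib.request

    # Ollama expects base64-encoded image data, so read and encode the file
    with open("photo.png", "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")

    payload = {
        "model": "llava",                  # assumed multimodal model
        "prompt": "Describe this image.",
        "images": [image_b64],             # base64 strings, not URLs
        "stream": False,
    }
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",  # default Ollama endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])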
