captioning + dataset preparation + inference + improvements #34
base: main
Conversation
@a-r-r-o-w this is ready for reviews.
awesome, testing now!
Good to take note of by the way: https://github.com/PKU-YuanGroup/Open-Sora-Plan/blob/main/docs/Report-v1.3.0.md#captioning
I get the following error when launching. Any idea why?
stacktrace
It's a permission denied error, what can I do to sort your permissions? :P
I can't lift it either because I don't have
I'm looking through the vLLM docs to see if they have an environment variable I could configure to use a different cache dir, but if not then I will ask in infra. Thanks!
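For reference, a minimal sketch of redirecting the weight cache, assuming the script builds a vLLM `LLM` directly; the paths and model id below are placeholders, not what this PR actually uses. `HF_HOME` is the standard Hugging Face cache variable and `download_dir` is vLLM's engine argument for where weights are downloaded.

```python
import os

# Redirect the Hugging Face hub cache before vllm / huggingface_hub are imported.
# The path is a placeholder for any directory the job user can write to.
os.environ["HF_HOME"] = "/scratch/hf_cache"

from vllm import LLM

# `download_dir` overrides where vLLM stores downloaded weights; the model id
# is only an example of a captioning model, not necessarily the PR's default.
llm = LLM(
    model="Qwen/Qwen2-VL-7B-Instruct",
    download_dir="/scratch/vllm_weights",
)
```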
Seems like adding a
I will try a few models to see what works best as the default. I personally preferred the outputs of MiniCPM a lot, but will also give Qwen 7B a try. Currently, getting descriptions like:
Does not seem to be respecting the 120-token limit set when launching 🤔
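Not sure how the 120-token limit is wired up in the script, but if it is meant to map onto vLLM sampling it would normally be passed per request via `SamplingParams`; a minimal sketch with a placeholder prompt and model id:

```python
from vllm import LLM, SamplingParams

# max_tokens caps the number of generated tokens per caption; it has to be
# passed to every generate() call, otherwise vLLM falls back to its defaults.
sampling_params = SamplingParams(temperature=0.2, max_tokens=120)

llm = LLM(model="Qwen/Qwen2-VL-7B-Instruct")  # placeholder model id
outputs = llm.generate(["Describe the video in one detailed paragraph."], sampling_params)
print(outputs[0].outputs[0].text)
```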
Take note of the following from #34 (comment)
That is likely an issue for
Configuration-wise, we can experiment, but I expected code-related comments as the first set of comments.
Known gotchas:
- Adjust `recaption.py` as needed to suit your needs.
- `limit_mm_per_prompt` needs adjustment based on the model being selected; see the sketch below.
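A sketch of the `limit_mm_per_prompt` gotcha, assuming video frames are passed to the model as individual images; the model id and frame count are placeholders:

```python
from vllm import LLM

# limit_mm_per_prompt bounds the number of multimodal items per request.
# If 8 frames per video are passed as images, the budget must allow 8.
llm = LLM(
    model="openbmb/MiniCPM-V-2_6",   # example captioning model, needs trust_remote_code
    trust_remote_code=True,
    limit_mm_per_prompt={"image": 8},
)
```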
After I ran `launch.sh`, I got: