Added structs and functions necessary for GPU inferencing support #8
Conversation
Wonderful work! Just some days ago I was thinking of adding exactly this, so perfect timing! In the first version of the crate I didn't wrap GPU-related code because I don't have an NVIDIA GPU myself.
Yeah, as mentioned in this issue, you have to build from source with the `HAVE_CUDA` flag. Also, since the GPU code is only available if you compiled your libs with the `HAVE_CUDA` flag, it is misleading if the Rust code related to CUDA is always available. I suggest putting it behind a crate feature flag, probably called `cuda`. As for the specifics of the code, I'll review it more thoroughly as soon as possible :D
Yes, that is the one! I'll look into the feature flag and probably get that added today.
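(For reference, the gating being discussed might look something like the sketch below. The feature name `cuda` is an assumption based on the `#[cfg(feature = "cuda")]` attribute that appears later in this thread; the exact dependency wiring is hypothetical and up to the crate author.)

```toml
# Cargo.toml — hypothetical sketch of the suggested feature flag.
# An empty feature is enough to gate Rust-side code; consumers opt in
# with `vosk = { version = "...", features = ["cuda"] }`.
[features]
cuda = []
```

Items would then be compiled only when the feature is enabled by annotating them with `#[cfg(feature = "cuda")]`.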
Sorry for the delay! I've been really busy these last weeks. Anyway, I only added some minor comments on some things that probably slipped your mind, totally fair.
Another thing I want to comment on: I couldn't help but notice how many times `#[cfg(feature = "cuda")]` is repeated. I think it'd be best to group things in a `batch` module, and then import everything into its parent if the feature is enabled. Maybe the cleanest way to do that would be to split `recognition/mod.rs` into `recognition/batch.rs` and `recognition/<normal, default, singlethread, something like that...>`, then `mod.rs` would always import the normal recognizer but only import the batch recognizer if the feature is enabled. Then do a similar thing with the models, with a `models/` folder, a `models/mod.rs`, `models/batch.rs`, and `models/<normal, ... whatever you decided to name the recognizer file>`.
vosk/src/models.rs (outdated)

```rust
/// The same as [`Model`], but uses a CUDA enabled Nvidia GPU and dynamic batching to enable higher throughput.
#[cfg(feature = "cuda")]
pub struct  BatchModel(pub(crate) NonNull<VoskBatchModel>);
```
Double space here, between `struct` and `BatchModel`.
I added the batch modules in-file to allow for backwards compatibility. If we moved the non-batch code to a different file, I think users would have to change the way they import the non-batch functions. This way is a little messier, but not that much messier.
I think the batch module part could be organized better, but I have to think about it, so I'll handle it myself before publishing the release that will include this.
Let me know what you think about the comment, and when that's changed I can proceed with the merge :P
Btw, the clippy CI pipeline failed because of a CI issue I have to fix (GitHub is fetching the old vosk-sys crate, not the one you pushed), so don't worry about it.
Everything looks good to me, merging ;)
Hey @lightningpwr28, I'm organizing and cleaning up PRs to prepare the crate for a new release, and I was wondering: would you happen to have some examples of the JSON returned by
I basically just made some classes in the style of the stuff that was already there, with the exception of `gpu_init()` and `gpu_thread_init()`, which I put in the root of the `vosk` crate.

I know that the recommended dynamic library binaries don't support it, but I need the support for the back end of a project I'm working on.
I think it should be possible to get binaries that do, either by following the compilation steps or by copying the compiled result out of the GPU containers in vosk-server, at least for Ubuntu-based distros. I'm working on a Windows binary, but I'm not entirely sure I'll be able to do it.
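(Putting the pieces of this PR together, usage might look roughly like the sketch below. `gpu_init()`, `gpu_thread_init()`, and the feature-gated `BatchModel` are the names from this PR; the `BatchModel::new` signature and the model path are assumptions for illustration, not the confirmed API. This requires vosk libraries compiled with `HAVE_CUDA` and the `cuda` crate feature enabled.)

```rust
// Hypothetical usage sketch — not the definitive API.
#[cfg(feature = "cuda")]
fn run() {
    // Per this PR, the GPU init functions live at the crate root.
    vosk::gpu_init();        // initialize CUDA once, before loading models
    vosk::gpu_thread_init(); // initialize CUDA for the current thread

    // Assumed constructor shape; the actual signature may differ.
    let _model = vosk::BatchModel::new("path/to/model");
}
```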