Expose cloud and edge inference speed metrics #173
Conversation
As a general comment: in the past, @bibireata and others have asked us to remove these performance metrics from the cloud inference response. Perhaps we should not expose them for cloud inference.
landingai/predict.py
# performance_metrics keeps performance metrics for the last call to _do_inference()
performance_metrics: Dict[str, int] = {}
Why store it as a global variable? Why not associate it with a Predictor instance?
I guess my brain was running low on creativity 😞 ... I changed it to a private variable on the class... thanks!
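A minimal sketch of the resulting shape, assuming hypothetical names (the metric keys and the get_metrics() accessor are illustrative, not confirmed library API): the metrics dict moves from module scope to a private attribute on each Predictor instance.

from typing import Dict

class Predictor:
    def __init__(self) -> None:
        # Per-instance metrics for the last call to _do_inference(),
        # replacing the module-level global.
        self._performance_metrics: Dict[str, int] = {}

    def _do_inference(self) -> Dict[str, int]:
        # ... run inference, timing each stage (keys are illustrative) ...
        self._performance_metrics = {
            "decoding_ms": 0,
            "inference_ms": 0,
            "postprocessing_ms": 0,
        }
        return self._performance_metrics

    def get_metrics(self) -> Dict[str, int]:
        # Expose the metrics recorded by the most recent inference call.
        return self._performance_metrics

Keeping the dict on the instance also avoids cross-talk when multiple Predictor objects run concurrently.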
Regarding the comment about performance metrics: in the long term, I think we need a tutorial on how to optimize models for speed and how to use these metrics to profile and find problems. For now they are a bit obscure, but at least we now expose them as part of the Predictor class 🤷‍♂️
LGTM
This change gives users a breakdown of the time spent during inference, which is useful when experimenting with different inference pipelines.
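A hypothetical usage sketch of what that looks like from the caller's side. The endpoint ID and API key values are placeholders, and the get_metrics() name is an assumption based on this discussion, not confirmed API.

# Assumes a Predictor exposing the metrics from its last inference call.
from PIL import Image

from landingai.predict import Predictor

predictor = Predictor(endpoint_id="<your-endpoint-id>", api_key="<your-api-key>")
frame = Image.open("example.jpg")
predictions = predictor.predict(frame)

# Inspect where inference time was spent in the last call
# (e.g. decoding vs. model inference vs. postprocessing).
print(predictor.get_metrics())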