Open
Labels
enhancement: New feature or request
Description
A few users have reported over/underestimations from dlcalc. Upon inspection, a common cause is workloads with very low compute intensity (tiny models, improper performance tuning, etc.). These workloads can have tiny GEMMs with compute utilization as low as 30% and can be heavily CPU-bound (the GPU spends a lot of time idle waiting for kernels to be scheduled). dlcalc models neither effect.
dlcalc should have some heuristic for detecting these kinds of issues and issuing warnings.
Separately, we need to find ways to communicate to users that dlcalc is meant to calculate theoretical performance, not to predict the throughput of the user's (potentially inefficient) implementation.
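As a starting point for discussion, here is a rough sketch of what such a heuristic could look like. It is a plain roofline-style check, not anything dlcalc does today. The `Accelerator` class, the `gemm_warnings` function, and the idea of a single fixed per-kernel launch overhead are all hypothetical, and the hardware numbers are placeholders the caller would have to supply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    """Hypothetical hardware description; all values are caller-supplied."""
    peak_flops: float         # peak dense FLOP/s
    mem_bandwidth: float      # memory bandwidth in bytes/s
    launch_overhead_s: float  # assumed CPU-side cost to launch one kernel, in seconds


def gemm_warnings(m, n, k, hw, bytes_per_element=2):
    """Return warning strings for an (m x k) @ (k x n) GEMM.

    Assumes each operand and the output are moved to/from memory exactly once,
    which is an idealization.
    """
    flops = 2 * m * n * k
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)

    # FLOPs per byte for this GEMM vs. the point where the hardware
    # switches from memory-bound to compute-bound.
    intensity = flops / bytes_moved
    ridge = hw.peak_flops / hw.mem_bandwidth

    # Best-case kernel time under the roofline model.
    ideal_time = max(flops / hw.peak_flops, bytes_moved / hw.mem_bandwidth)

    warnings = []
    if intensity < ridge:
        warnings.append(
            f"GEMM {m}x{n}x{k}: arithmetic intensity {intensity:.1f} FLOP/B is below "
            f"the ridge point {ridge:.1f}; compute utilization will likely be low."
        )
    if ideal_time < hw.launch_overhead_s:
        warnings.append(
            f"GEMM {m}x{n}x{k}: ideal kernel time {ideal_time:.2e}s is shorter than "
            f"the assumed launch overhead {hw.launch_overhead_s:.2e}s; "
            f"the workload may be CPU-bound."
        )
    return warnings


# Placeholder hardware numbers, for illustration only.
hw = Accelerator(peak_flops=1e15, mem_bandwidth=3e12, launch_overhead_s=5e-6)
small = gemm_warnings(128, 128, 128, hw)
large = gemm_warnings(8192, 8192, 8192, hw)
```

With these placeholder numbers the 128x128x128 GEMM trips both checks and the 8192x8192x8192 GEMM trips neither. A real heuristic would probably need measured launch overheads and per-architecture tuning, since a fixed threshold will not capture tile quantization or kernel fusion.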