-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Quantizelinear nearbyint fix #3819
Conversation
This build is OK for merge ✅ |
🔴bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #3819 +/- ##
========================================
Coverage 92.35% 92.35%
========================================
Files 519 519
Lines 22307 22311 +4
========================================
+ Hits 20601 20606 +5
+ Misses 1706 1705 -1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides a typo, looks good to me. Thanks for doing this.
@@ -1,7 +1,7 @@ | |||
/* | |||
/*rby |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo
for(int i = 0; i < xv.size(); ++i) | ||
{ | ||
double quantized = xv.at(i) / sv.at(i); | ||
quantized = std::max(static_cast<double>(min_value), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps you should saturate quantized
+ zero_pts
. Else that sum will overflow in the next line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll make a followup PR with this
Fixes for issues that weren't addressed in #3819
Fixes for issues that weren't addressed in #3819
Fixes #2661.
For floating point types do not use the nearbyint operation.
Updates tests to catch this error and fixes gold values.
I tested and it fixes the issue seen on https://github.com/ROCm/rocMLIR-internal/issues/1718