HIVE-29307: Incorrect split calculation causing less container to launch #6170
base: master
CC @abstractdog, can you please provide your thoughts on this?
splits =
    inputFormat.getSplits(
        jobConf,
        numSplits.orElse((int) Math.min(availableSlots * waves, Integer.MAX_VALUE)));
Why not simply (int) (availableSlots * waves), as done in SplitGrouper?
Yes, it can be done, and I'm also in favour of that. It's just that multiplyExact protects against overflow, so I kept it. I will update it after the CI outcome.
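The overflow concern raised here can be checked in isolation. The sketch below is not taken from the patch; the class and method names are hypothetical. It relies on two standard Java behaviours: a float-to-int narrowing cast saturates at Integer.MAX_VALUE instead of wrapping, and Math.multiplyExact throws ArithmeticException when an int product overflows.

```java
// Hypothetical demo class; not part of the Hive code base.
final class SplitOverflowDemo {

    // The simpler form suggested in the review: multiply as float, then cast.
    static int castProduct(int availableSlots, float waves) {
        return (int) (availableSlots * waves);
    }

    // The form shown in the quoted snippet: clamp explicitly before casting.
    static int clampedProduct(int availableSlots, float waves) {
        return (int) Math.min(availableSlots * waves, Integer.MAX_VALUE);
    }

    // Math.multiplyExact works on int (or long) operands and throws on overflow.
    static boolean multiplyExactThrows(int a, int b) {
        try {
            Math.multiplyExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        float waves = 1.7f;
        // Both float-based forms saturate at Integer.MAX_VALUE for a huge slot count.
        System.out.println(castProduct(Integer.MAX_VALUE, waves) == Integer.MAX_VALUE);
        System.out.println(clampedProduct(Integer.MAX_VALUE, waves) == Integer.MAX_VALUE);
        // An int-only product of the same magnitude overflows and is rejected.
        System.out.println(multiplyExactThrows(Integer.MAX_VALUE, 2));
    }
}
```

Under these assumptions the plain cast and the explicit clamp agree for large inputs, so the choice between them is mostly about readability.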
This fix makes sense to me overall: split waves is indeed a float, so truncating it to an int is clearly a bug. It's worth noting that waves is buried deep inside the split generation logic, making it an expert-level setting, so it's not surprising that this issue hasn't been fixed until now (based on customer escalation history). The mm_all.q.out change also makes sense: it's the opposite of what was done in HIVE-19703. I wish we had a much cleaner unit test to reflect how these settings affect the number of splits, but that's not necessarily in the scope of this PR.
Thanks for the reply, @abstractdog. There is 1 UT failure.
d58e22e to ade78a3
What changes were proposed in this pull request?
Please check HIVE-29307.
Why are the changes needed?
(int) waves converts 1.7 (the default) to 1, causing fewer containers to launch.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
In a production scenario.
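To make the truncation described under "Why are the changes needed?" concrete, here is a minimal sketch. It assumes the old code multiplied the slot count by the truncated waves value; the exact original expression is not shown in this conversation. The class name, method names, and the slot count of 100 are hypothetical.

```java
// Hypothetical demo class; not part of the Hive code base.
final class WavesTruncationDemo {

    // Assumed old behaviour: waves is truncated to an int before multiplying.
    static int splitsWithTruncatedWaves(int availableSlots, float waves) {
        return availableSlots * (int) waves;
    }

    // Behaviour of the snippet quoted in the review: multiply first, then convert.
    static int splitsWithFractionalWaves(int availableSlots, float waves) {
        return (int) Math.min(availableSlots * waves, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        int availableSlots = 100; // hypothetical cluster capacity
        float waves = 1.7f;       // default value mentioned in the PR description

        System.out.println(splitsWithTruncatedWaves(availableSlots, waves));  // prints 100
        System.out.println(splitsWithFractionalWaves(availableSlots, waves)); // prints 170
    }
}
```

With these inputs the truncated form requests 100 splits while the fractional form requests 170, which matches the "fewer containers" symptom the PR describes.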